VSAN Node Removal Disk Cleansing

Hyperconvergence is great: it frees us from dependence on the centralized storage array (SAN) we’ve all been accustomed to. Adding nodes to
scale our VSAN environment is simple (depending on your switch infrastructure) and can be done on the fly.

Removing those nodes is a fairly straightforward process, but there are some cleanup steps that need to be done before you can repurpose them for other environments.
You can validate that your node is no longer part of Virtual SAN Clustering by issuing the following command:

[root@machinename:~] esxcli vsan cluster get
You should see “Virtual SAN Clustering is not enabled on this host” if the host has been properly removed from Virtual SAN.
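For reference, a properly evacuated host returns output along these lines (illustrative; the exact wording can vary slightly between VSAN builds):

[root@machinename:~] esxcli vsan cluster get
Virtual SAN Clustering is not enabled on this host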

Ok? Great, you can now proceed to the fun part.

Warning: Make sure that your VSAN node is out of the cluster and that you’ve removed the vmkernel interface from the host!

Once a system has been removed from the VSAN cluster, the storage is no longer
considered as a “part” of the array. But, in order for you to reuse those nodes
for something else, you need to clean up the remnants of what VSAN has left
behind on those local disks.

If you attempt to put that node into a new or different environment, you’ll
quickly notice that none of the disks are usable since they still have the old
VSAN partition information on them. We can find this out by issuing the following
command on the node via the console/ssh session:

[root@machinename:~] esxcli vsan storage list

This produces a list of all the disks that are claimed, each identified in the
“naa.xxxxxxxxx” format. You’ll also notice that each one is flagged as SSD or
HDD through the “Is SSD:” line, which has a value of true or false.
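An abbreviated example of a single entry is shown below. The identifier is a placeholder and the exact set of fields can vary between VSAN builds, but the “Is SSD:” line is the one we care about here:

naa.xxxxxxxxx
   Device: naa.xxxxxxxxx
   Display Name: naa.xxxxxxxxx
   Is SSD: false
   VSAN UUID: <uuid of the disk>
   VSAN Disk Group UUID: <uuid of the owning disk group>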

Cleaning these drives can be done by collecting the identifier of the disk and
issuing the following command for removal:

[root@machinename:~] esxcli vsan storage remove --disk=naa.xxxxxxx <where xxxxxxx is the identifier of your disk>

The --disk option is for spinning disks only. If you want to remove an SSD, you need to
use the --ssd switch instead. Again, you can derive this information from the
storage list mentioned above.
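If you want to double-check that a device really does carry leftover partitions before (or after) removing it, partedUtil can show the partition table. This is a general ESXi command rather than anything VSAN-specific, and the identifier below is just a placeholder:

[root@machinename:~] partedUtil getptbl /vmfs/devices/disks/naa.xxxxxxx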

The Shortcut!

In a hybrid VSAN environment, there is a requirement for at least one SSD for each
disk group. When the disk groups are formed, the spinning drives are “tied” to the
SSD drive and are dependent on it while the node is standalone.

This can be used to our advantage when doing disk cleaning since we don’t have to
remove each HDD individually and can simply call out the SSD in the vsan storage
remove command! Simply collect every device showing “Is SSD: true” in your storage list
output and issue the esxcli vsan storage remove --ssd=naa.xxxxxxx command for each one!
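If you have several disk groups, a small loop saves some typing. Treat this as a rough sketch rather than a supported procedure: it assumes the storage list output contains a “Device:” line and an “Is SSD:” line per claimed disk (as described above) and that awk is available in your ESXi shell, so echo the identifiers and sanity-check them before removing anything:

# Collect every device reported as "Is SSD: true" and remove it, which
# also tears down the disk group and releases its dependent HDDs.
for ssd in $(esxcli vsan storage list | awk '/Device:/ {dev=$2} /Is SSD: true/ {print dev}'); do
  echo "Removing SSD and its disk group: $ssd"
  esxcli vsan storage remove --ssd="$ssd"
done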

Bam! Now all disks are free from old partitioning and ready to use in your new
environment.

P.S. – You’ll probably notice that the datastore name sticks on the hosts that you
just removed (regardless of the fact that the disks have been removed). This seems
to be cosmetic, since creating a new VSAN cluster replaces it with the
default “vsanDatastore”.

~Rick

Hedvig Overview

I was part of the Storage Field Day 10 group last week and had a chance to visit Hedvig at their new offices in Santa Clara, CA. There is lots of space to grow into, and they have a nice, friendly atmosphere like most of the places we visited.

The founder and CEO, Avinash Lakshman, spent 6-8 years building large-scale distributed systems: he was one of the co-inventors of Amazon Dynamo and was part of the Apache Cassandra project at Facebook. He believes that traditional storage as we know it will disappear, and from what I saw in this presentation, they are building the next-generation storage platform for tomorrow’s workloads.

Founded in 2012, with a product launch in April of this year, they have had some time to adjust their product to what the market is demanding. The operational model is focused on a policy-based engine that is defined by the infrastructure.

Hedvig is software decoupled from the underlying hardware: it resides on commodity servers, and together they form their distributed storage platform.

One thing that was talked about early in the presentation was the fact that most of their customers don’t even use the user interface, since Hedvig’s platform is architected to be API-driven. That should give you a good idea of what type of company is looking at this deployment model.

If you look at the way they are scaling out their storage architecture (through the multi-site architecture), you can see that they have regional protection in mind from the start. This is accomplished through their “container” based storage model, and it’s not the containers that you’re thinking of (read part two).

The software can be deployed within a private datacenter, in a public cloud location, or across both, which would classify it as a hybrid architecture.

High Level Overview:

  1. I found it very interesting that they have prepared the platform for both x86 and ARM-based processors. They noted that some large customers have shown interest in low-power ARM-based deployments.
  2. They have support for any hypervisor that is out on the market today as well as native storage provisioning to containers.
  3. Block (iSCSI), file (NFSv3 and v4) and object (S3 & Swift) protocol support.
  4. Deduplication, compression, tiering, caching and snaps/clones.
  5. Policy driven storage that provides HA or DR on a per-application basis.

How it’s Deployed:

  • The storage service itself is deployed on bare-metal servers or cloud based infrastructure (as mentioned above).
  • It is then presented as file and block storage through a VM/Container/Bare-Metal mechanism called a storage proxy. They have a proprietary “network optimized” protocol that talks to the underlying storage service.
  • For object based storage, clients talk natively to the service through the RESTful APIs via S3 or Swift and do not go through the storage proxy converter (a generic example follows below).
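Purely as an illustration of that last point: because the object interface is S3-compatible, any standard S3 client should be able to talk to it directly. The endpoint URL and bucket name here are hypothetical placeholders, not values from Hedvig’s documentation:

aws s3 ls s3://demo-bucket --endpoint-url https://object-endpoint.example.local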

What happens at the Storage Service Layer:

  • When the writes reach the cluster, the data is distributed based on a policy that is pre-configured for that application. (This also contains a replication element)
  • In addition to this, there are background tasks that balance the data across all nodes in the cluster and cache the data for reads.
  • The data is then replicated to multiple datacenters or cloud nodes for DR purposes through synchronous and asynchronous replication.

Look for part two that goes a bit deeper on the intricacies of the Hedvig platform.

 

Post Disclaimer: I was invited to attend Storage Field Day 10 as an independent participant. My accommodations, travel and meals were covered by the Tech Field Day group but I was not compensated for my time spent attending this presentation. My post is not influenced in any way by Gestalt IT or Hedvig and I am under no obligation to write this article. The companies mentioned above did not review or edit this content and it is written from purely an independent perspective.

 

Reduxio Presentation at VMworld 2015

While I was attending VMworld 2015, I went to the Reduxio presentation at Tech Field Day. I did some preliminary research on the company prior to the presentation and, to be honest, I did not completely understand their use case, so I was very interested in hearing more technical content from their engineers.

For the presentation, the company had Eyal Traltel (Director of Technical Marketing) and Jacob Cherian (VP of Product Strategy and Product Management) on site, as well as Nir Peleg, the founder and CTO.

Funding

Reduxio has four primary backers: Seagate, Intel, JVP, and Carmel (which is part of the Viola Group).

First Impressions

At first I thought to myself, “OK, here’s another hybrid array,” but I quickly saw the value-add in the type of protection they’ve implemented within this midrange storage system.

On the first slide of the presentation they had a statement that read, “A new way to protect data” which immediately sparked my interest. Here is a storage company that put data protection on the top of their list!

Hardware Detail

The Reduxio HX550 consumes 2U of rack space and can hold 24 drives. They mentioned that a standard build generally has 16 HDDs and 8 SSDs.

The HDDs are 7,200 RPM drives and the SSDs are Western Digital eMLC SAS drives. The unit holds 40 TB of raw data, but Reduxio said that effective capacity would be higher once data reduction is factored in. It is also a dual-controller system with dual 10 Gb ports on each controller.

Product Operation

TimeOS™ Metadata

This is very different from what others are doing and is what makes this storage array so unique. TimeOS™ tracks every read/write operation (instead of simply allocating space), which enables the following technologies on the volumes:

Tier-X™

This is a reactive tiering operation that works at the block level, is automated, and runs continually. It is designed to place data on flash first and move infrequently accessed data to slower magnetic disks based on analysis of 8 KB blocks.

BackDating™

This is Reduxio’s method of crash-consistent protection for applications on all LUNs created and provisioned in the unit. It allows recovery to within one second of a failure anywhere in the volume, essentially giving you a view of the volume “the way it was” before the failure occurred. I asked some specific questions about Microsoft Always On clustering, and even though they haven’t tested it to date, it is on their radar.

Nodup™

This is their deduplication and compression. It is a global setting that is always running in the array, across all cache tiers, volumes, and clones, as well as current and historical data. It is also real-time and in-line!

The Interface

The entire management interface is in HTML5, and during a demonstration of its operation it was very smooth and fluid.

It’s pure HTML5, with no JavaScript or Flash, and is designed for a touch interface. During the demo phase of the presentation, the interface was very animated and visually pleasing. For example, when you search a volume and make time selections for restores, a pop-out is presented that allows you to focus in on that task.

Final Thoughts

  • All volumes are protected by default through policy based retention.
  • The vCenter plugin has a nice look and feel to its operation.
  • To date, the system does not scale out, but they have that in mind for the future since it is designed with scale-out addressing.