VSAN Node Removal: Disk Cleansing

Hyperconvergence is great: it frees us from dependence on the centralized storage array (SAN) we’ve all been accustomed to. Adding nodes to scale out a VSAN environment is simple (depending on your switch infrastructure) and can be done on the fly.

Removing those nodes is a fairly straightforward process, but there are some cleanup steps that need to be done before you can repurpose those nodes for other environments.
You can validate that your node is no longer part of Virtual SAN Clustering by issuing the following command:

[root@machinename:~] esxcli vsan cluster get
You should see “Virtual SAN Clustering is not enabled on this host” if the host has been properly removed from Virtual SAN.

Ok? Great, you can now proceed to the fun part.

Warning: Make sure that your VSAN node is out of the cluster and that you’ve removed the VSAN vmkernel interface from the host!
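If you still need to do those two steps from the console, something along these lines should work (a sketch only; the exact network sub-command varies by ESXi version, so check what esxcli vsan network offers on your build):

[root@machinename:~] esxcli vsan cluster leave
[root@machinename:~] esxcli vsan network ipv4 remove -i vmk1

Here vmk1 is just a placeholder for whichever vmkernel interface carries your VSAN traffic; on newer builds the equivalent command may be esxcli vsan network remove -i vmkX.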

Once a system has been removed from the VSAN cluster, its storage is no longer considered part of the VSAN datastore. But before you can reuse those nodes for something else, you need to clean up the remnants VSAN has left behind on their local disks.

If you attempt to put that node into a new or different environment, you’ll quickly notice that none of the disks are usable, since they still have the old VSAN partition information on them. We can confirm this by issuing the following command on the node via the console/SSH session:

[root@machinename:~] esxcli vsan storage list

This produces a list of all the disks VSAN has claimed, each identified by a “naa.xxxxxxxxx” style device name. You’ll also notice that each one is flagged as SSD or HDD through the “Is SSD:” line, with a value of true or false.
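Abridged, illustrative output for one device (the exact fields vary slightly by VSAN version, and the identifiers below are placeholders):

naa.500xxxxxxxxxxxx1
   Device: naa.500xxxxxxxxxxxx1
   Display Name: naa.500xxxxxxxxxxxx1
   Is SSD: true
   VSAN UUID: 52xxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
   VSAN Disk Group UUID: 52xxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx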

Cleaning these drives is done by collecting the identifier of each disk and issuing the following command to remove it:

[root@machinename:~] esxcli vsan storage remove --disk=naa.xxxxxxx (where naa.xxxxxxx is the identifier of your disk)

That form is for spinning disks only. If you want to remove an SSD, you need to use the --ssd switch instead. Again, you can get that information from the storage list mentioned above.
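For example, with placeholder identifiers (the long-form flags are shown; -d and -s are the equivalent short forms):

[root@machinename:~] esxcli vsan storage remove --disk=naa.500xxxxxxxxxxxx2
[root@machinename:~] esxcli vsan storage remove --ssd=naa.500xxxxxxxxxxxx1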

The Shortcut!

In a hybrid VSAN environment, each disk group requires at least one SSD. When a disk group is formed, the spinning drives are “tied” to that SSD and remain dependent on it, even once the node is standalone.

This can be used to our advantage when cleaning disks, since we don’t have to remove each HDD individually and can simply call out the SSD in the vsan storage remove command! Just collect all the “Is SSD: true” disks from your storage list output and issue esxcli vsan storage remove --ssd=naa.xxxxxxx for each one, for example with a small loop like the sketch below.
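Here is a minimal sketch of that loop, run from the host’s console/SSH session. It assumes the default storage list output format shown earlier, where each device stanza starts with the naa identifier and contains an “Is SSD:” line; print the candidate list first and review it before removing anything:

# Show the cache SSDs that VSAN currently claims
esxcli vsan storage list | awk '/^naa\./ {dev=$1; sub(/:$/, "", dev)} /Is SSD: true/ {print dev}'

# If that list looks right, remove each SSD, which drops its whole disk group
for ssd in $(esxcli vsan storage list | awk '/^naa\./ {dev=$1; sub(/:$/, "", dev)} /Is SSD: true/ {print dev}'); do
   esxcli vsan storage remove --ssd="$ssd"
done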

Bam! Now all disks are free from old partitioning and ready to use in your new
environment.

P.S. – You’ll probably notice that the datastore name sticks around on the hosts you just removed (even though the disks have been cleaned up). This seems to be cosmetic, since creating a new VSAN cluster replaces it with the default “vsanDatastore”.

~Rick

Hedvig Overview

I was part of the Storage Field Day 10 group last week and had a chance to visit Hedvig at their new offices in Santa Clara, CA. There’s lots of space to grow into, and they have a nice, friendly atmosphere like most of the places we visited.

The founder and CEO, Avinash Lakshman, spent 6-8 years building large-scale distributed systems. He was one of the co-inventors of Amazon Dynamo and was part of the Apache Cassandra project at Facebook. He believes traditional storage as we know it will disappear, and from what I saw in this presentation, they are building that next-generation storage platform for tomorrow’s workloads.

SFD10 Hedvig Welcome

Founded in 2012, with a product launch in April of this year, Hedvig has had some time to adjust their product to what the market is demanding. The operational model is focused on a policy-based engine defined by the infrastructure.

Hedvig is software decoupled from the hardware: deployed on commodity servers, it makes up their distributed storage platform.

One thing that was talked about early in the presentation was the fact that most of their customers don’t even use the user interface, since Hedvig’s platform is architected to be API driven. That should give you a good idea of what type of company is looking at this deployment model.

If you look at the way they are scaling out their storage (through the multi-site architecture), you can see that they have regional protection in mind from the start. This is accomplished through their “container” based storage model, and it’s not the containers you’re thinking of (read part two).

The software can be deployed within a private datacenter, in a public cloud location, or across both, which would classify it as a hybrid architecture.

High Level Overview:

  1. I found it very interesting that they have prepared the platform for both x86 and ARM-based processors. They noted that some large customers are looking at low-power ARM-based deployments.
  2. They support any hypervisor on the market today, as well as native storage provisioning to containers.
  3. Block (iSCSI), file (NFSv3 and v4) and object (S3 & Swift) protocol support.
  4. Deduplication, compression, tiering, caching and snaps/clones.
  5. Policy driven storage that provides HA or DR on a per-application basis.

How it’s Deployed:

  • The storage service itself is deployed on bare-metal servers or cloud-based infrastructure (as mentioned above).
  • It is then presented as file and block storage through a VM/container/bare-metal mechanism called a storage proxy, which uses a proprietary “network optimized” protocol to talk to the underlying storage service.
  • For object-based storage, clients talk natively to the service through RESTful APIs via S3 or Swift and do not go through the storage proxy (a rough example follows this list).
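As a rough illustration of that object path (not Hedvig’s documented tooling; the endpoint and bucket names below are hypothetical placeholders), any S3-compatible client can be pointed straight at the cluster’s S3 service:

# Hypothetical example using the standard AWS CLI against an S3-compatible endpoint
aws s3 ls --endpoint-url https://hedvig-s3.example.local
aws s3 cp backup.tar.gz s3://demo-bucket/ --endpoint-url https://hedvig-s3.example.local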

What happens at the Storage Service Layer:

  • When writes reach the cluster, the data is distributed based on a policy that is pre-configured for that application. (This also includes a replication element.)
  • In addition, background tasks balance the data across all nodes in the cluster and cache the data for reads.
  • The data is then replicated to multiple datacenters or cloud nodes for DR purposes through synchronous and asynchronous replication.

Look for part two that goes a bit deeper on the intricacies of the Hedvig platform.

 

Post Disclaimer: I was invited to attend Storage Field Day 10 as an independent participant. My accommodations, travel and meals were covered by the Tech Field Day group, but I was not compensated for my time spent attending this presentation. My post is not influenced in any way by Gestalt IT or Hedvig, and I am under no obligation to write this article. The companies mentioned above did not review or edit this content and it is written from a purely independent perspective.

 

Storage Field Day 10 Next Week

Yep, next week is Storage Field Day 10 (also known as #SFD10 on Twitter).

As always, this looks like an action-packed event filled with three days of non-stop meetings with companies in Silicon Valley. I’ve been to a number of Tech Field Day events in the past, and there really is no event quite like it. The presentations are extremely technical in nature, and we, the delegates, are encouraged to ask the tough questions about the product(s) so that the viewers get a real understanding of them.

Here is a quick lineup of the presenters, their products and what I am looking to derive from these meetings:

Kaminario kicks off the event Wednesday morning at 9:30 PST and I am very interested to hear more about their scale up and out architecture. Seems like a unique design centered around a pair of controllers. I missed the presentation at SFD7 but have been watching this company for a while now.

Primary Data is up at 12:30 PST and I’ve had the pleasure of talking to them in the past. I am interested in hearing more about this software solution (DataSphere) that allows customers to use their own storage hardware (such as NetApp, Nexenta or EMC Isilon). One other item that I want to cover with them is their non-disruptive data movement technology, so look for a blog post that focuses on that in the near future.

Cloudian wraps up the day at 3:00pm PST with what I hope will be an in-depth discussion about their HyperStore appliance. Doing a little research on the FL3000 uNode Server Chassis has me intrigued with a design that fits eight servers in 3U of space, and I want to see how that is leveraged in the solution.

Pure Storage kicks off Thursday at 9:30 PST at their headquarters. I’ve had the pleasure of meeting with this company many times in the past and the discussions are always extremely technical in nature. I’ve also deployed this product in the past and found the administration of the array to be remarkably simple. The solution that I hope we get to discuss in detail is their “Flashblade” product which I’ve done a fair amount of reading on and aim to dig into some details on the modular blade format of this product.

We then head over to Datera at 1:00 PST for their Elastic Data Fabric, which focuses on OpenStack, container, service provider and data store solutions with a self-optimizing product that is infrastructure aware and provisions storage on commodity hardware. According to their website, they have centered their product around DevOps. This should be a fun conversation to have.

Rounding out Thursday, we visit Tintri at their HQ building to talk about their all-flash and hybrid-flash arrays with an analytics and management software package that can manage 32 arrays. The software lineup includes ReplicateVM, SyncVM, SecureVM, Automation Toolkit, vCenter Web Client Plug-in, and Management Pack for vROps, as well as OpenStack support.

Nimble kicks off Friday at 8:00am PST and I am looking forward to catching up with their latest product lineup. I’ve been following them pretty closely over the past five years or so and am impressed by the rapid growth of this company. They have an interesting product lineup that includes all-flash arrays, adaptive flash arrays and their InfoSight Predictive Analytics, and I’m hoping to learn more about the SmartStack Integrated Infrastructure product while I’m there.

Hedvig is online at 10:30am PST to discuss their “One Platform for any application”. This is a company that has produced a “unified storage solution” supporting block, file and object-based data. Doing some initial research on the company, I found they also have a full set of developer-focused APIs. I’m really anxious to learn more about the Hedvig distributed storage platform and to really understand their storage service layer.

Exablox wraps up the day at 1:30pm PST and, to be honest, I haven’t looked at them in quite some time. Doing some initial fact finding, I found that the “OneBlox” architecture is quite unique with its ring-based scale-out design. I am most interested in how this NFS datastore performs as you grow it, and which limitations that design removes.

You can catch all the action live through the video stream located here:

Rick