For the past 9 years, I have had the privilege of working in a fast-growing, dynamic, and fun environment. I started as a summer intern and worked hard to assume full responsibility for the design and implementation of all IT infrastructure. I was provided with experiences I never would have thought possible… including significant travels in Asia. Management fostered an environment in which I could grow personally, as an IT professional, and in my blogging/social media life.
However, sometime earlier this year, I was approached with another opportunity by an outside group. The opportunity provided was significant and very appealing… tugging at my virtualization heart-strings. Suddenly, I was put into a position where I really needed to decide which direction I wanted to take my IT career: continue on the IT generalist route (broad and deep in a handful of areas) or specialize more in virtualization (narrow but significantly deeper in the virtualization core pillars).
After much debate, list making, and lost sleep, I decided to take a risk and accept the offer for the new position.
Leaving my post at a company I have grown with and that has grown with me was tremendously difficult. The company fostered a family-like environment and I genuinely like everyone I work with. The company is full of rock stars and growing like a weed (but, a good weed that will turn into something cool in the future). But, I would be remiss to not take advantage of this new opportunity.
The new opportunity puts me into a Senior role working with a truly enterprise-level environment… at a scale that is just mind boggling (almost 2 physical servers for every 1 employee at my former employer). The prospect of the position is exciting and I cannot wait to start there. Plus, I am going to be working in close proximity to some VMware/virtualization rock stars (a couple blocks apart) and one of my fellow PDX VMUG leaders.
I am really going to miss my former employer and opportunities there, though. This is just a blip on their radar. I am positive the IT department is going to step up and make the infrastructure their own, just like I feel I was able to while there. But, I look forward to meeting everyone at the new job and getting deeper and dirtier into VMware virtualization!
PS – I apologize for the vagueness of the post. However, I try to make a point of not calling out my employer on my blog… which the old company and new company appreciate. So, I hope you were able to follow along.
Let’s be honest, x86-based compute in virtualization environments is pretty darn boring. A server has become the necessary evil required for enabling the coolness that is virtualization.
But, don’t let the boringness of servers fool you. VMware has enabled a new breed of hybrid servers that are both server AND storage all-in-one! This new paradigm adds some new methods and models for virtualization design and functionality.
Conceptually, the server boots into an ESXi environment and fires up a guest OS. This guest OS is the virtual storage appliance (VSA) and provides the storage for the local server. The guest makes use of VMDirectPath functionality to take control of a locally installed storage controller connected to the local disks. The result is that the VM can access the local disks and ESXi cannot. The local disk is now directly connected to the VM. How cool is that?!
Once the guest OS has the disks, the guest creates various storage options (block or file, object or RAID, etc…). The ESXi host is then configured to connect to IP storage provided by the guest. The first, typical reaction may be to wonder why anyone would add this level of complexity. For a standalone host with local storage (think ROBO), this may be a little overkill. But, the advantage comes into play when you consider flexibility and new functionality.
By moving control of local storage into the VM, more advanced functions can be performed. Local storage use by ESXi is fairly limited. The VSA, though, can use the storage a little more liberally.
Take Pivot3, for example. Their VDI and surveillance solutions make use of this storage technique. The vSTAC OS (the Pivot3 VSA) creates a RAID across the local disks. Yawn, right?! Where the coolness is applied is when multiple nodes are "connected". vSTAC OS instances on other Pivot3 servers combine and RAID across multiple hosts. Suddenly, local storage is combined with local storage from other hosts and creates a big clustered pool of available storage! This cluster environment allows for added resiliency and performance, as the data is no longer restricted to the local host and is distributed to help protect against local storage issues.
Once the vSTAC OS nodes connect their storage together, data is spread across all of the other nodes to immediately protect the data and enhance performance. A new node can be added in the future. Once the new node is added, the data is automatically rebalanced across all hosts to ensure proper protection and efficient usage of the storage. Dynamic add of storage and compute is fantastic!
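To make the rebalancing idea a little more concrete, here is a toy sketch in Python. To be clear, this is not Pivot3's algorithm (I have no visibility into vSTAC OS internals). The round-robin placement, the place_blocks function, and the node names are all assumptions made up for illustration.

```python
from collections import Counter


def place_blocks(num_blocks, nodes):
    """Toy placement: assign block i to nodes[i % len(nodes)] (round-robin).

    Assumption: real clustered storage uses far more sophisticated placement
    and also stores redundancy data; this only shows the even spread.
    """
    return {block: nodes[block % len(nodes)] for block in range(num_blocks)}


# Three hypothetical nodes holding 12 blocks between them.
before = place_blocks(12, ["node-a", "node-b", "node-c"])

# A fourth node joins; recomputing the placement spreads the same blocks wider.
after = place_blocks(12, ["node-a", "node-b", "node-c", "node-d"])

blocks_per_node_before = Counter(before.values())
blocks_per_node_after = Counter(after.values())
moved = [b for b in before if before[b] != after[b]]

print(dict(blocks_per_node_before))
print(dict(blocks_per_node_after))
print(f"{len(moved)} of {len(before)} blocks moved during the rebalance")
```

Every node ends up with an equal share after the new node joins, which is the point. The naive modulo placement used here shuffles most of the blocks to get there, though, so a real product would presumably use a smarter scheme that moves far less data.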
The VSA VM can perform additional functions if desired (and developed as such) like: deduplication, replication, compression, etc…
I love this type of innovation. There are many use cases for solutions like this. The Pivot3 solution has a lot of potential for success in their target markets. I have some concern about the selection of RAID versus object storage, though… but that is their decision. Traditional RAID5 systems suffer heavily from a disk failure and rebuild… the performance tanks until the failed disk has been replaced. In the event of a failure in the Pivot3 solution, the entire solution may suffer until the offending disk has been replaced. But, with that said, I believe the benefits of the technique outweigh the potential performance hit.
This style of architecture really bucks the trend of needing a separate SAN/NAS in addition to compute. Adding sophistication to the VSA component and introducing more SSD/Flash-based storage could create an interesting and valid competitor to traditional SAN/NAS solutions and breathe new life into boring servers.
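Why does a RAID5 rebuild hurt so much? Parity is an XOR across the data disks, so recovering one lost disk means reading every surviving disk plus the parity. The sketch below shows that on a single stripe. It is a simplified illustration: real RAID5 rotates parity across the disks, the disk contents here are made up, and I am not claiming this is how Pivot3 implements its cross-node RAID.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte strings together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three data "disks" (made-up contents) plus one parity block.
data_disks = [b"ESXi", b"VMFS", b"iSCS"]
parity = xor_blocks(data_disks)

# Simulate losing the second disk, then rebuild it from the survivors + parity.
survivors = [data_disks[0], data_disks[2]]
rebuilt = xor_blocks(survivors + [parity])

print(rebuilt)  # b'VMFS'
```

Until the replacement disk is fully rebuilt, any read that lands on the missing disk has to do this same work on the fly, which is where the performance hit comes from.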
At the end of 2011, Tintri approached me to participate in their developing blogging movement at blog.tintri.com.
The appealing part to my participation is that the blogging effort is not directed towards focus or emphasis on Tintri as a company or the products they offer. Rather, the emphasis is on developing a meaningful community around the virtualization ecosystem. If I never mention Tintri in a single post, that is alright as long as I am providing meaningful content that the virtualization community finds useful and valuable.
Additionally, I am being paid for my efforts, which is greatly appreciated.
Disclosure is important for work like this because I am also affiliated with the Tech Field Day events and it is possible my TFD path will cross with Tintri… So, I want to make my relationship with Tintri clear ahead of any path-crossing that may happen in the future.
I would like to take a moment to thank Tintri for reaching out to me… I really appreciate it.
With all the wicked-cool new functions in vSphere 5, one of the most understated but highly functional is the ability to unmount an iSCSI datastore. Seemingly a simple function, this was not available prior to vSphere 5.
The problem I have faced in the past is that there is a need to remove iSCSI stores from an ESXi host. In those rare instances, I have needed to migrate some VMs off of a SAN while keeping other VMs on the same SAN (ex: moving a development SAN to another site). svMotion handles the hard work of moving the VMs to the new datastores (easy-peasy, right?). However, unlike an NFS share, a VMFS share could not be unmounted. I ran into 2 options to remove the share:
1) Right-click the datastore and select “Delete”!
Uh… the point of this is to not delete these VMs!
2) Remove the initiator IP address, remove access to the ESXi host initiators via the SAN interface, vMotion VMs to other hosts (if you’re lucky), and reboot the host.
– Host downtime, SAN maintenance (which, yes, I know initiators not being used should be cleaned up… but not as a requirement to save my VMs), host downtime, etc… I can add a datastore live, why not remove it live?!
To my surprise this morning, while removing some iSCSI stores after some over-the-weekend SAN migration, I was presented with a new option via vSphere 5!
Following this new function leads me to a functional check to ensure that the unmount requirements are green and good to go:
Now, the downside to this procedure is that in my environment, I have a couple non-DRS clustered hosts (thank you Oracle VMware licensing) that I am unable to take offline to upgrade to ESXi 5.0 right now. So, the same iSCSI volumes are available on both ESXi 4.1 and 5.0 hosts. Thus, the unmount process is only partially useful. Due to those darn ESXi 4.1 hosts, I still need to delete the datastore to get rid of the iSCSI volume!
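The mixed-version problem is easy to express as a quick check. Here is a minimal Python sketch, assuming you already have a list of the hosts that see the volume and their ESXi versions. The host names, the inventory dictionary, and the hosts_blocking_unmount function are hypothetical and not part of any VMware tooling.

```python
def hosts_blocking_unmount(host_versions, minimum=(5, 0)):
    """Return hosts whose ESXi version is older than `minimum`.

    `host_versions` maps a host name to a "major.minor" version string
    such as "4.1". Hosts older than 5.0 have no unmount option.
    """
    blocking = []
    for host, version in sorted(host_versions.items()):
        major_minor = tuple(int(part) for part in version.split(".")[:2])
        if major_minor < minimum:
            blocking.append(host)
    return blocking


# Hypothetical inventory: hosts that all see the same iSCSI volume.
inventory = {"esx01": "5.0", "esx02": "5.0", "oracle01": "4.1", "oracle02": "4.1"}

blockers = hosts_blocking_unmount(inventory)
print(blockers or "every host can unmount the datastore")
```

If the list comes back empty, the volume can be unmounted cleanly everywhere. If not, you are in the same boat I was with the 4.1 hosts.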
Thanks Oracle Licensing!
Lucky for me, I do not have any VMs to save on the datastore!
This was a great way to start a Monday morning! I look forward to being able to unmount VMFS volumes as necessary… once everything is up to vSphere 5.0!
The other day (Nov 16, 2011 to be exact), my fellow nerd and Tech Field Day delegate, Tom Hollingsworth crafted a great blog post on the new movement in IT, and business in general… Bring Your Own (Apple) Device to work. If you have not read the post yet… you gotta check it out.
This is Tom. Ask him about NAT!
After reading the post, I had some thoughts come to mind that I just had to throw into a reaction post.
As new generations of individuals grow up and mature, it is expected that cultural shifts will take place. What I do not understand is how a culture of technological availability has morphed into an expectation that an individual can bring anything into the corporate environment and expect to use it for their job.
Too many times, I am approached by users bringing their personal laptop into the office and wanting to know how to connect it to the internal network. Or, users that want to connect their iPhones to the network so they can use Spotify or YouTube without using their cellular data plans… as though the corporate infrastructure and services are there to do their bidding.
This developing culture assumes that everything in the outside world must work the same way in the corporate world. Their iPad can connect to GMail, so why not just connect it to the Exchange server?
What the user sees. What IT sees!
The IT ecosystem is a carefully designed and tightly guarded world.
None shall pass!
Systems are selected carefully to ensure a proper balance between functionality, supportability, and stability. The discovery of an unknown device is enough to throw an IT professional into a fit of rage. The environment has been compromised in some fashion and there is potential to throw off the carefully designed balancing act.
The presence of an unknown device opens up a veritable Pandora’s Box and raises a huge red flag. Suddenly, the corporate environment is vulnerable to a machine or device that may be infested with trojans and viruses, has access to corporate resources, and is not managed by IT.
IT has been assigned a critical role in modern businesses… provide tools that enable the business to function. Traditionally, this included the workstation, network, monitors, servers, etc… With more people feeling as though it is acceptable to provide their own devices, who is responsible for supporting them? What happens when the “S” key breaks off or the monitor is too blue for their liking? When IT owns and manages a device, IT is responsible. When the user owns the device, but is using it in a corporate environment, the answer is much foggier. An IT person says the user is responsible. However, the true answer lies somewhere in the depths of politics and policy.
Unknown devices also introduce the loss of data control. The moment a user is allowed to bring in a USB drive, iPod, access GMail, or Dropbox, the data is no longer under any control of the company.
Corporate IT Adaptation
First and foremost, IT has a responsibility to the company to ensure the protection and function of corporate technological resources and systems.
However, with that said, IT needs to acknowledge the changing ways of technology. Anyone who has been in IT longer than 1 month knows that times have a way of changing and the minute you buy your phone, it is obsolete. That is the way of the world and 42 is the ultimate answer to the ultimate question of life, the universe, and everything.
Is Google “Deep Thought”?
IT departments need to be cognizant of what exists in the marketplace, impacts (both positive and negative) to overall productivity/security, and the long term viability of those entities. A tablet, for example, may seem like a large phone (cough iPad cough). However, for an executive that spends more time meeting customers and reading email, it is a perfect tool to enable them to get their job done without needing a laptop… but how is it secured?
Security becomes one of the most important concerns for IT in a time where users expect to provide their own devices. NAC/NAP/Port Security ensures only authorized devices are allowed on the network. Remote technologies (Application Presentation (XenApp/RemoteApp) and VDI (View, XenDesktop)) allow users to interact with applications running on protected and trusted infrastructure from unknown endpoints. Proper backups, snapshotting, and antivirus on the server and storage side ensure the data is consistent and recoverable in the event of a break in security.
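At its simplest, the "only authorized devices" idea boils down to an allowlist lookup. The Python sketch below illustrates just that concept. It is not how any particular NAC/NAP product works, the MAC addresses are made up, and MAC addresses can be spoofed, which is why real deployments typically lean on 802.1X and certificates rather than a bare list.

```python
def normalize_mac(mac):
    """Lower-case a MAC address and strip ':' / '-' separators."""
    return mac.lower().replace(":", "").replace("-", "")


# Hypothetical list of devices that IT has enrolled and manages.
AUTHORIZED_MACS = {
    normalize_mac(m) for m in ("00:1A:2B:3C:4D:5E", "00-1a-2b-3c-4d-5f")
}


def is_authorized(mac):
    """Allow a device on the network only if IT has enrolled its MAC address."""
    return normalize_mac(mac) in AUTHORIZED_MACS


print(is_authorized("00:1a:2b:3c:4d:5e"))  # corporate laptop, enrolled
print(is_authorized("A4:5E:60:00:00:01"))  # personal phone, never enrolled
```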
Finally, IT needs to engage with the business to keep them abreast of concerns. Open dialogue with the business will help ensure technological expectations meet some sort of equilibrium between what IT feels is appropriate and what the business feels is necessary.
What do you really think, Bill?!
I wholeheartedly do not like the idea of users bringing in their own devices for business use. Maybe I am cruisin for a bruisin (politically speaking), but I see my environment as known and trusted. The introduction of a new device takes some planning and testing because I have a responsibility to the company to provide a stable and operational environment. The introduction of a Mac laptop into my environment is not smooth. Exchange and SharePoint support is so horrible that Mac users need to use a Fusion VM running Windows 7 to fully function.
However, while it is possible to be completely restrictive and be more like “The Man”, I feel that the best way to manage the user owned devices converging on my environment is more political.
– I encourage the business to adopt corporate policies addressing the need to not bring personal devices into the environment.
– I encourage the business to develop a stricter definition of who needs email outside of the office, partial compensation for use of personal devices OR providing a company owned and managed phone, and which devices are supported.
– Have an open and friendly dialogue with those users that approach IT for assistance with personal devices. Being honest and frank about not supporting devices, needing management approval, and being unsure as to the functionality/operation of the device goes a long way.
I love the idea of new devices and new technology in the workplace. But, I want the introduction to be more structured and tested.
Tom – Thanks for the awesome post. Definitely food for thought and got my wheels spinning!
I have to say, I am quite shocked that I am on the tail end of waiting 1.5 hours for an ESXi 5.0 upgrade to complete booting. Seriously… 1.5 hours.
I have been waiting for some time to get some ESXi 5.0 awesomeness going on in my environment. vCenter has been sitting on v5 for some time and I have been deploying ESXi 5 in a couple stand-alone situations without any issues. So, now that I have more compute capacity in the data center, it is time to start rolling the remaining hosts to ESXi 5… or so I thought!
I downloaded ESXi 5.0.0 Kernel 469512 a while back and have been using that on my deployments. So far, so good… until today. Update Manager configured with a baseline –> Attach –> Scan –> Remediate –> back to business. Surely, Update Manager processes should take more time than the actual upgrade. About 30 minutes after starting the process, vCenter was showing that the remediation progress was a mere 22% complete and the host was unavailable. I used my RSA (IBM’s version of HP ILO or Dell DRAC) to connect to the console. Sure enough, it was stuck at loading some kernel modules. About 20 minutes later IT WAS STILL THERE!
Restarting the host did not resolve the issue. During the ESXi 5 load screen, pressing Alt + F12 loads the kernel messages. It turns out that iSCSI was having issues loading the datastores in an acceptable amount of time. I was seeing messages similar to:
A little research turned me onto the following knowledgebase article in VMware’s KB: ESXi 5.x boot delays when configured for Software iSCSI (KB2007108)
This issue occurs because ESXi 5.0 attempts to connect to all configured or known targets from all configured software iSCSI portals. If a connection fails, ESXi 5.0 retries the connection 9 times. This can lead to a lengthy iSCSI discovery process, which increases the amount of time it takes to boot an ESXi 5.0 host.
I have 13 iSCSI stores on that specific host and multiple iSCSI VMkernel ports (5). So, calling the iSCSI discovery process lengthy is quite the understatement.
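Some back-of-the-napkin math shows how quickly this multiplies. The Python sketch below assumes each of my 13 stores is its own target and that every single connection attempt fails, which is a worst case and almost certainly not what actually happened on my host. The retry count comes from the KB text above; everything else is my own rough modeling.

```python
targets = 13   # iSCSI stores on the host (treated as one target each: an assumption)
portals = 5    # iSCSI VMkernel ports
retries = 9    # retries per failed connection, per the KB text quoted above

# Worst case: every target/portal pair fails its first attempt and every retry.
pairs = targets * portals
worst_case_attempts = pairs * (1 + retries)

boot_seconds = 1.5 * 60 * 60  # the 1.5 hour boot observed above
seconds_per_attempt = boot_seconds / worst_case_attempts

print(pairs, worst_case_attempts, round(seconds_per_attempt, 1))  # 65 650 8.3
```

In other words, 65 target/portal pairs can turn into as many as 650 connection attempts. Even at only about 8 seconds per failed attempt, that is enough to account for the entire 1.5 hour boot.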
The knowledgebase states that the resolution is applying ESXi 5.0 Express Patch 01. Fine. I can do that. And… there is a work around described in the article that states you can reduce the number of targets and network portals. I guess that is a workaround… after you have already dealt with the issue and the ridiculously long boot.
Finally, to help mitigate the issue going forward, VMware has released a new .ISO to download that includes the patch. However, this is currently available in parallel with the buggy .ISO ON THE SAME PAGE! Seriously. Get this… the only way to determine which one to download is:
As a virtualization admin, I know that I am using the Software iSCSI initiator in ESXi. But, why should that even matter at all?! There is a serious flaw in the boot process in version 469512 and that should be taken offline. Just because someone is not using Software iSCSI at the current time does not mean they are not going to in the future. So, if they download the faulty .ISO, they are hosed in the future. Sounds pretty crummy to me!
I am quite shocked that this made it out of the Q/A process at VMware in the first place. My environment is far from complex and I expect that my usage of the ESXi 5.0 hypervisor would be within any standard testing procedure. I try to keep my environment as vanilla as possible and as close to best practices as possible. 1.5 hours for a boot definitely should have been caught before release to the general public.
Additionally, providing the option to download the faulty ISO and the fixed ISO is a complete FAIL! As mentioned on the download page, this is a special circumstance due to the nature of the issue. I would expect that if this issue is as serious as the download page makes it out to be, the faulty ISO should no longer be available. There has to be a better way!
I have since patched the faulty ESXi 5.0 host to the latest/safest version, 504890, and boot times are back to acceptable. I will proceed with the remainder of the upgrades using the new .ISO and have deleted all references to the old version from my environment.
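For anyone tracking a handful of hosts through the same cleanup, a tiny build-number check is enough to spot stragglers. The Python sketch below uses the two build numbers from this post. The host names and the inventory are made up, and it assumes (my assumption, not VMware guidance) that any 5.0 build older than the patched one should be updated.

```python
FAULTY_BUILD = 469512   # ESXi 5.0 build with the slow software iSCSI boot
PATCHED_BUILD = 504890  # build the host was patched to in this post


def needs_patch(build):
    """Assumption: any 5.0 build older than the patched one should be updated."""
    return build < PATCHED_BUILD


# Hypothetical host inventory (names and layout are made up).
host_builds = {"esx01": PATCHED_BUILD, "esx02": FAULTY_BUILD, "esx03": FAULTY_BUILD}
to_patch = sorted(host for host, build in host_builds.items() if needs_patch(build))

print(to_patch)
```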
I have never run into an issue like this with a VMware product in my environment and I still have all the confidence in the world that VMware products are excellent. In the scheme of things, this is a bump in the road.
Sunday evening, many of the vExpert award recipients converged in the Casanova 503 room at the Venetian for the VMworld 2011 vExpert meeting. Mingling, meeting, and networking was fantastic.
However, there was one topic of significant discussion that really got my wheels spinning. While we were requested not to go into detail about what was said by VMware (proper), we all are familiar with the concept… the Virtual Datacenter.
It should be no surprise that VMware has been walking us down the path of virtualizing our datacenter components. Servers, storage, networking… the entire stack. All in an effort to create this nebulous “Virtual Datacenter”. But, what is the virtual datacenter and how do we get there? Well… if I had the answer, I would probably be working for VMware… right?!
Conceptually, the virtual datacenter is composed of increasingly commoditized resources. x86 compute resources are readily available at minimal cost. Auto-tiering storage is becoming more and more prevalent to help mitigate IO performance issues. 10Gb networking, and other high-bandwidth connections, are providing the ever-so-necessary connection to networking and network-based storage. By abstracting these resources, the virtual administrator is no longer tasked with management of these resources.
The fact of the matter, though, is that in many environments, management of these resources still exists. We need the network guys to maintain the network, the storage guys to handle the storage, and the server guys to handle the server hardware and connections to systemic resources.
Fact of the matter is that the virtual datacenter still needs management from different facets of the IT house.
My view of the virtual datacenter is creation of a system where network, storage, and servers are all managed at a single point. We are seeing this come to fruition in the Cisco UCS, vBlock, and other single SKU solutions. That is a fantastic model. However, it targets a different market.
My dream virtual datacenter manages everything itself.
- Need more storage, just add a hard drive. The datacenter handles data management and availability. Seriously, just walk over and add a hard drive or add another storage node to the rack.
- Need more network bandwidth, hot-add more pNICs. The datacenter handles properly spreading the data across available links, NIC failures, etc…
- Need more compute resources, add a new server to the rack. The datacenter handles joining the server to the available compute resources.
- Need external resources, just point the datacenter towards a public provider and let the datacenter manage the resources.
Creating the foundation to make this work relies on all parties involved allowing the datacenter to configure and manage everything. Storage vendors need to allow the datacenter to handle array configurations and management. Network vendors need to allow the datacenter to configure trunks, link aggregation, bandwidth control, etc… Systems vendors need to allow the datacenter to jump into the boot process, grab the hardware, and configure it automatically.
Pie in the sky, right? Existing technologies seem to allude to more elegant management that would lend itself kindly to such a model. VMware, as the datacenter enabler, would need to step up to the plate and take the initiative and ownership of managing those resources… from RAID configurations to VLAN Trunking on switches.
Seriously… walking up and adding new physical resources or extending to a public provider for more resources and they become magically available would be fantastic.
So… that is my vision for where I would like to see the virtual datacenter. VMware, let me know if you want to talk about this in more detail. I am sure we can work something out!