Before we begin, I want to state right up front that I am not anti-flash, nor am I anti-hardware. I work for DataCore Software, which has spent nearly two decades mastering the ability to exploit hardware capabilities for the sole purpose of driving storage I/O. Our software needs hardware, and hardware needs software to instruct it to do something useful. However, over the last year I have read a lot of commentary about how disk (a.k.a. HDDs or magnetic media) is dead. Colorful metaphors such as “spinning rust” are used to describe the apparent death of the HDD market, but is this really the case?
According to a report from TrendFocus, the number of drives shipped in 2015 declined by 16.9% (to 469 million units). However, the amount of capacity shipped increased by more than 30% (to 538 exabytes, or 538,000 petabytes, or 538,000,000 terabytes). In other words, a lot of HDD capacity.
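To make those figures concrete, here is a quick back-of-the-envelope calculation. It uses only the numbers quoted above and assumes decimal (SI) storage units. Because the report says capacity grew by "more than 30%", the per-drive growth figure is a lower bound, not a number taken from the report itself.

```python
# Figures quoted from the TrendFocus report above.
units_shipped = 469e6        # drives shipped in 2015
capacity_shipped_eb = 538    # exabytes shipped in 2015
unit_decline = 0.169         # 16.9% fewer drives than the prior year
capacity_growth = 0.30       # "more than 30%", so treat this as a floor

# Decimal (SI) units: 1 EB = 1,000 PB = 1,000,000 TB.
capacity_pb = capacity_shipped_eb * 1_000
capacity_tb = capacity_shipped_eb * 1_000_000

# Average capacity of a drive shipped in 2015, in terabytes.
avg_tb_per_drive = capacity_tb / units_shipped

# Lower bound on how much the average drive grew year over year:
# more capacity spread across fewer drives.
per_drive_growth = (1 + capacity_growth) / (1 - unit_decline)

print(f"{capacity_pb:,} PB = {capacity_tb:,} TB")
print(f"average capacity per drive: {avg_tb_per_drive:.2f} TB")
print(f"average drive capacity grew by at least {per_drive_growth - 1:.0%}")
```

Fewer drives are shipping, but each one holds considerably more, which is consistent with the capacity trend described here.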
Please note, however, that this is NEW capacity added to the industry on top of the already mind-blowing amount of existing capacity in the field today (estimated at over 10 zettabytes, or 10,000 exabytes, or, well, you get the idea). Eric Brewer, VP of Infrastructure at Google, recently said,
“YouTube users are uploading one petabyte every day and at current growth rates that they should be uploading 10 petabytes per day by 2021.”
The capacity trend certainly doesn’t show signs of slowing, which is why new and improved ways of increasing HDD density are emerging (such as helium-filled drives, HAMR, and SMR). With these new manufacturing techniques, HDD capacities are expected to reach 20TB+ by 2020.
So, I wouldn’t exactly say disk (HDD) is dead, at least from a capacity demand perspective, but it does raise some interesting questions about the ecosystem of drive technology. Perhaps the conclusion that disk is dead is based on drive performance. There is no doubt a battle is being waged in the industry. On one side we have HDD, on the other SSD (or flash). Both have advantages and disadvantages, but must we choose one or the other? Is it all or nothing?
MOVE TO FLASH NOW OR THE SKY WILL FALL
In addition to the commentary about disk being dead, I have seen an equal amount of commentary about how the industry needs to adopt all-flash tomorrow or the world will come to an end (slight exaggeration perhaps). This is simply an impossible proposition. According to a past Gartner report,
“it will be physically impossible to manufacture a sufficient number of SSDs to replace the existing HDD install base and produce enough to cater for the extra storage growth”
Even displacing 20% of the forecasted growth is a near impossibility. And I will take this one step further: not only is it impossible, it is completely unnecessary. However, none of this implies HDD and SSD cannot coexist in peace; they certainly can. What is needed is exactly what Gartner said in the same report,
“ensure that your choice of system and management software will allow for seamless integration and intelligent tiering of data among disparate devices.”
Gartner made this statement because they know only a small percentage of an organization’s data footprint benefits from residing on high-performance media.
THE SOLUTION TO THE PROBLEM IS SOFTWARE
One of the many things DataCore accomplishes with the hardware it manages is optimizing the placement of data across storage devices with varying performance characteristics. This feature is known as auto-tiering, and DataCore does this automatically across any storage vendor or device type, whether flash- or disk-based.
Over the last six years, DataCore has proven with its auto-tiering capability that only 3-5% of the data within most organizations benefits from high-performance disk (the percentage is even less when you understand how DataCore’s Parallel I/O and cache work, but we will touch on this later). Put another way, 95% of an organization’s I/O demand occurs within 3-5% of the data footprint.
While the 3-5% data range doesn’t radically change from day to day, the data contained within that range does. The job of DataCore’s auto-tiering engine is to ensure the right data is on the right disk at the right time in order to deliver the right performance level at the lowest cost. No need to wait, schedule, or perform any manual steps. By the way, the full name of DataCore’s auto-tiering feature is: fully automated, sub-LUN, real-time, read and write-aware, heterogeneous auto-tiering. Not exactly a marketing-friendly name, but there it is.
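To illustrate the general idea of heat-based tiering, here is a minimal sketch. It is not DataCore's implementation: the function names, the block granularity, and the access counts are all hypothetical, chosen so that 5% of the blocks receive 95% of the I/O. It ranks blocks by how often they were accessed in an interval, places the hottest ones on the fast tier, and re-plans when the hot set moves.

```python
def plan_tiers(access_counts, fast_capacity_blocks):
    """Place the most-accessed blocks on the fast tier (hypothetical helper)."""
    # Sort by descending access count; break ties by block id for determinism.
    ranked = sorted(access_counts, key=lambda b: (-access_counts[b], b))
    fast = set(ranked[:fast_capacity_blocks])
    slow = set(ranked[fast_capacity_blocks:])
    return fast, slow


def io_share(access_counts, blocks):
    """Fraction of all I/O that landed on the given blocks."""
    total = sum(access_counts.values())
    return sum(access_counts[b] for b in blocks) / total


# Interval 1 (made-up numbers): 100 blocks, blocks 0-4 are hot.
interval_1 = {b: 3610 if b < 5 else 10 for b in range(100)}
fast_1, slow_1 = plan_tiers(interval_1, fast_capacity_blocks=5)
print(sorted(fast_1), f"{io_share(interval_1, fast_1):.0%} of I/O")

# Interval 2: the hot set is the same size, but different blocks are hot.
interval_2 = {b: 3610 if 50 <= b < 55 else 10 for b in range(100)}
fast_2, slow_2 = plan_tiers(interval_2, fast_capacity_blocks=5)

promote = fast_2 - fast_1   # blocks to move up to the fast tier
demote = fast_1 - fast_2    # blocks to move back down
print("promote:", sorted(promote), "demote:", sorted(demote))
```

The size of the fast tier stays constant while its contents change, which is the behavior described above. A real engine would also have to weigh the cost of moving data against the expected benefit, which this sketch ignores.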
WAIT A SECOND, I THOUGHT THIS WAS ABOUT DISK, NOT FLASH
While DataCore can use flash technologies like any other disk, it doesn’t require them. To prove the point, I will show you a very simple test I performed to demonstrate the impact just a little bit of software can have on the overall performance of a system. If you need a more comprehensive analysis of DataCore’s performance, please see the Storage Performance Council’s website.
In this test I have a single 2U Dell PowerEdge R730 server. This server has two H730P RAID controllers installed. One RAID controller has five 15k drives attached to it forming a RAID-0 disk group (read and write cache enabled). This RAID-0 volume is presented to Windows and is designated as the R: drive.
The other RAID controller is running in HBA mode (non-RAID mode) with another set of five 15k drives attached to it (no cache enabled). These five drives reside in a DataCore disk pool. A single virtual disk is created from this pool matching the size of the RAID-0 volume coming from the other RAID controller. This virtual disk is presented to Windows and is designated as the S: drive.
The first set of physical disks forming the RAID-0 volume as seen in the OpenManage Server Administrator interface
The second set of physical disks and disk pool as seen from within the DataCore Management Console
The logical volumes R: and S: as seen by the Windows operating system
DRIVERS, START YOUR ENGINES
I am going to run an I/O generator tool from Microsoft called DiskSpd (the successor to SQLIO) against these two volumes simultaneously and compare the results using Windows Performance Monitor. The test parameters for each test are identical: 8K block, 100% random, 80% read, 20% write, running 10 concurrent threads, with 8 outstanding I/Os against a 10GB test file.
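For readers who want to try something similar, the following sketch assembles a DiskSpd argument list from the parameters just listed. It is a reconstruction, not the exact command I ran: the test duration and the test file name are assumptions, and I am assuming the 8 outstanding I/Os apply per thread, which is how DiskSpd's `-o` switch works.

```python
def diskspd_args(target, duration_s=60):
    """Build a DiskSpd command line for the workload described above.

    duration_s is an assumed value; the original duration is not stated.
    """
    return [
        "diskspd.exe",
        "-b8K",             # 8K block size
        "-r",               # random I/O
        "-w20",             # 20% writes, so 80% reads
        "-t10",             # 10 threads per target
        "-o8",              # 8 outstanding I/Os per thread
        "-c10G",            # create a 10GB test file
        f"-d{duration_s}",  # run time in seconds (assumed)
        target,
    ]


# Hypothetical test file names on the two volumes under test.
r_args = diskspd_args(r"R:\testfile.dat")
s_args = diskspd_args(r"S:\testfile.dat")

threads, outstanding_per_thread = 10, 8
total_outstanding = threads * outstanding_per_thread

print(" ".join(r_args))
print(" ".join(s_args))
print("I/Os in flight per volume:", total_outstanding)
```

Under that per-thread assumption, each volume sees up to 80 I/Os in flight at once.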
DiskSpd test parameters for each logical volume
The first command on line 2 is running against the RAID-0 disk (R:) and the second command on line 5 is running against the DataCore virtual disk (S:). In addition to having no cache enabled on the HBA connecting the physical disks presented to DataCore within the pool, the DataCore virtual disk also has its write-cache disabled (or write-through enabled). Only DataCore read cache is enabled here.
Write-cache disabled on the DataCore virtual disk
Performance view of the RAID-0 disk
Performance view of the DataCore virtual disk
As you can see from the performance monitor view, the disk being presented from DataCore is accepting over 26x more I/O per second on average (about 146k IOPS) than the disk from the RAID controller (about 5.4k IOPS) for the exact same test. How is this possible?
This is made possible by DataCore’s read cache and the many I/O optimization techniques DataCore uses to accelerate storage I/O throughout the entire stack. For much more detail on these mechanisms, please see my article on Parallel Storage.
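To put the two Performance Monitor averages in perspective, here is a rough calculation using Little's Law (average latency = I/Os in flight / throughput). It assumes the load generator keeps 10 threads x 8 outstanding I/Os = 80 requests in flight on each volume at all times, so the latencies are estimates derived from the IOPS figures, not measured values.

```python
raid0_iops = 5_400        # average for the RAID-0 volume (R:)
datacore_iops = 146_000   # average for the DataCore virtual disk (S:)

# Assumption: the queue stays full at 10 threads x 8 outstanding I/Os.
in_flight = 10 * 8


def avg_latency_ms(iops, outstanding=in_flight):
    """Little's Law: average time each I/O spends in the system, in ms."""
    return outstanding / iops * 1000


speedup = datacore_iops / raid0_iops

print(f"speedup: {speedup:.1f}x")
print(f"RAID-0 estimated latency:   {avg_latency_ms(raid0_iops):.2f} ms")
print(f"DataCore estimated latency: {avg_latency_ms(datacore_iops):.2f} ms")
```

With the same number of requests in flight, the higher throughput implies each request completes in roughly half a millisecond instead of roughly fifteen.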
In addition to Parallel I/O processing, I am using another nifty feature called Random Write Accelerator. This feature eliminates the seek time associated with random writes (operations which cause lots of actuator arm movement on the HDD). DataCore doesn’t communicate with the underlying disks the same way the application would directly. By the time the I/O reaches the disks in the pool, the I/O pattern is much more orderly and therefore more optimally received by the disks.
So now, as any good engineer would do, I’m going to turn it up a notch and see what this single set of five physical so-called “dead disks” can do. I will now test using five 50GB virtual disks. Remember, these virtual disks are coming from a DataCore disk pool which contains five 15k non-RAID disks. Let’s see what happens.
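The general technique of turning random writes into sequential ones can be sketched with a toy log-structured remapping layer. This is an illustration of the concept only, not how Random Write Accelerator is actually implemented: the class, the block-level mapping, and the "head travel" metric are all hypothetical simplifications.

```python
import random


class SequentialWriteLog:
    """Toy remapping layer: each write lands on the next physical block."""

    def __init__(self):
        self.mapping = {}          # logical block -> latest physical block
        self.next_physical = 0     # append position on the disk
        self.physical_trace = []   # order in which physical blocks are written

    def write(self, logical_block):
        physical = self.next_physical
        self.mapping[logical_block] = physical   # redirect future reads
        self.physical_trace.append(physical)
        self.next_physical += 1
        return physical


def head_travel(trace):
    """Total distance between consecutive block addresses (a seek proxy)."""
    return sum(abs(b - a) for a, b in zip(trace, trace[1:]))


rng = random.Random(42)   # fixed seed so the run is repeatable
logical_writes = [rng.randrange(1_000_000) for _ in range(1000)]

log = SequentialWriteLog()
for block in logical_writes:
    log.write(block)

direct = head_travel(logical_writes)        # writing in place
remapped = head_travel(log.physical_trace)  # writing through the log

print("head travel writing in place:   ", direct)
print("head travel writing through log:", remapped)
```

The application still issues random writes, but the disk sees consecutive addresses. The trade-off in any design like this is that overwritten blocks leave stale copies behind, so space has to be reclaimed later.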
DiskSpd test parameters for five DataCore virtual disks
The commands on lines 8-12 are running against the five DataCore virtual disks. Below are the results of the testing.
Performance view of the five DataCore virtual disks
Note, nothing has changed at the physical disk layer. The change is simply an increase in the number of virtual disks now reading from and writing to the disk pool, which in turn has increased the degree of parallelism in the system. This test shows that for the same physical disks we have achieved greater than a 63x performance increase on average (about 344k IOPS), with bursts well over 400k IOPS. This test is throwing 70,000-80,000 write I/Os per second at physical disks which are only rated to deliver 900 random writes per second combined. This is made possible by sequentializing the random writes before they reach the physical disks, thereby eliminating most of the actuator arm movement on the HDDs. Without adding any flash to the system, the software has effectively returned greater than flash-like performance with only five 15k disks in use.
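The arithmetic behind those figures is easy to check. The sketch below uses only numbers quoted in this article and assumes the 80/20 read/write mix holds across the whole run, so the write rates are derived estimates.

```python
baseline_iops = 5_400       # RAID-0 volume, first test
five_vdisk_iops = 344_000   # average across five DataCore virtual disks
burst_iops = 400_000        # bursts were "well over" this figure
write_fraction = 0.20       # 20% of the workload is writes
rated_random_writes = 900   # combined rating of the five 15k drives

speedup = five_vdisk_iops / baseline_iops
avg_writes = five_vdisk_iops * write_fraction
burst_writes = burst_iops * write_fraction
write_ratio = avg_writes / rated_random_writes
rated_per_drive = rated_random_writes / 5

print(f"speedup over RAID-0: {speedup:.1f}x")
print(f"write I/Os per second: {avg_writes:,.0f} average, "
      f"{burst_writes:,.0f} at the burst level")
print(f"{write_ratio:.0f}x the combined random-write rating "
      f"({rated_per_drive:.0f} per drive)")
```

The average write rate alone works out to roughly 76 times what the five drives are rated to absorb as random writes.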
Another important note: this demonstration is certainly not representative of the most you can get out of a DataCore configuration. On the latest SPC-1 run, where DataCore set the world record for all-out performance, DataCore reached 5.12 million SPC-1 IOPS with only two engines (and the CPUs on those engines were only 50% utilized).
There are two things happening in the storage industry which have caused a lot of confusion. The first is an unawareness of the distinction between I/O parallelization and device parallelization. DataCore has definitively proven its I/O parallelization technique is superior in performance, cost, and efficiency. Flash is a form of device parallelization and can only improve system performance to a point. Device parallelization without I/O parallelization will not take us where the industry is demanding we go (see my article on Parallel Storage).
The second is a narrative being pushed on the industry which says “disk is dead” (likely due to my first concluding point). The demonstration above proves “spinning disk is very much alive”. Someone may argue I’m using a flash-type device in the form of RAM to serve as cache. Yes, RAM is a solid state device (a device electronic in nature), but it is not exotic, has superior performance characteristics, and organizations already have tons of it sitting in very powerful multiprocessor servers within their infrastructures right now. They simply need the right software to unlock its power.
Insert DataCore’s software layer between the disk and the application and immediately unbind the application from traditional storage hardware limitations.