the Problem with Write IOPS - in flash SSDs

by Zsolt Kerekes, editor - December 16, 2009
"Random IOPS" has long been established as a useful concept for modeling server application performance and bottlenecks when using different types of storage.

As long ago as 1993 - the world's 1st NAS company Auspex Systems quoted a figure of "675 NFS IOPS" to illustrate the capability of its NS 3000 SPARC based server.

And in 1995 a RAM SSD manufacturer called CERAM was quoting a figure of "2,000 IOPs" for its SPARC compatible SSD.

Random R/W IOPS had become a very common metric used in the marketing parlance of high end SSD makers by 2003.

And the good thing was... it didn't matter if your background was in rotating disk storage - because the figures being quoted gave a realistic idea of what you could achieve if you used SSDs instead.

In those days - nearly all enterprise SSDs were "RAM SSDs" - and like hard disk arrays - they had symmetric R/W latencies and throughput performance. Therefore vendors didn't have to differentiate between "read IOPS" or "write IOPS" - it was rightly assumed that they were very similar.
SSD IOPS - start of the slippery slope to a devalued metric

In 2007 the SSD IOPS performance picture was starting to get hazy - with some flash SSD makers quoting high IOPS specs which actually exceeded those of entry level RAM SSDs.

This seemed too good to be true - especially as the price difference in those days (between RAM and flash SSDs) was more than 20 to 1 for the same capacity.

So I asked some leading SSD makers from both sides of the fence to contribute short pieces for a very popular article called - RAM SSDs versus Flash SSDs. That article (published in August 2007) listed important factual data about the state of the art. Data based on real products was compared for RAM SSDs, flash SSDs and hard disks.

2 important points were clarified by that article

1 - flash SSDs (generally) had much worse write IOPS than read IOPS - and the way that some flash SSD vendors quoted a "single" IOPS metric was confusing or misleading.

Following publication of this article I adopted a new editorial practice of differentiating whether IOPS figures quoted in vendor press releases and datasheets were really "read", "write" or "combined" (based on an assumed R/W ratio). Before that - some vendors had misleadingly just quoted read IOPS (the best figure) without attaching the important "read" appellation.

2 - for those already experienced with IOPS in HDDs and RAM SSDs who wanted to model and predict flash SSD performance - the key metric to look at was "write IOPS".

The critical effect of write IOPS on overall application performance - was analyzed in a sub-article - by Douglas Dumitru, CTO EasyCo - called Understanding Flash SSD Performance (pdf). In various tables Dumitru pointed out the sensitivity of overall performance to write IOPS for various R/W mixes encountered in real applications.

Back in those days - the ratio of read to write IOPS for commercially available flash SSDs could be anywhere from 10-to-1 to 100-to-1 due to the intrinsic asymmetries in nand flash memory.
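To see why write IOPS dominates, here is a minimal sketch of a blended IOPS calculation. It is not taken from Dumitru's paper. It assumes a simple model in which operations are serviced one at a time, so the average time per operation is the weighted sum of the read and write service times. The function name and the device figures (10,000 read IOPS, 100 write IOPS) are hypothetical, chosen only to match the 100-to-1 end of the range above.

```python
def blended_iops(read_iops: float, write_iops: float, read_fraction: float) -> float:
    """Effective IOPS for a R/W mix, assuming operations are serviced serially."""
    if not 0.0 <= read_fraction <= 1.0:
        raise ValueError("read_fraction must be between 0 and 1")
    write_fraction = 1.0 - read_fraction
    # Average seconds per operation = weighted sum of per-operation service times.
    avg_seconds_per_op = read_fraction / read_iops + write_fraction / write_iops
    return 1.0 / avg_seconds_per_op


# Hypothetical device with a 100-to-1 read/write IOPS ratio.
for read_percent in (100, 90, 70, 50):
    mix = blended_iops(10_000, 100, read_percent / 100)
    print(f"{read_percent}% reads -> about {round(mix)} IOPS")
```

Under these assumptions a workload with just 10% writes falls from 10,000 to roughly 917 IOPS. That is why a datasheet which quotes only the read figure says so little about a mixed workload.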

Quoting "read IOPS" and "write IOPS" clears things up - but not for long

For me as editor - that seemed to settle the issue for a while. As long as stories related to enterprise flash SSDs always included separate figures for read and write IOPS - (or pointed out when they were missing) experienced readers could do useful mental filtering between different types of products.

Personally I found it a useful shortcut to ignore read IOPS data entirely and just focus on write IOPS to rank the fastest products. I assumed that the ratio of R/W IOPS - in the fastest flash SSDs - might still improve in ensuing years - but would hover in the range 5 or 10 to 1. But I was wrong.

Little more than a year later (in November 2008) - Violin Memory narrowed that gap to 2 to 1 in a 2U SLC SSD which had over 200K random Read IOPS and 100K random Write IOPS. And then, in November 2008, Fusion-io started talking about 1 to 1 R/W IOPS in one of its PCIe SSDs.

And this R/W IOPS symmetry soon after made an appearance in 2.5" SSDs - when in April 2009 - SandForce unveiled its SF-1000 family of SSD Processors with 30,000 R/W IOPS.

Where Are We Now?

You might think that - with R/W IOPS symmetry in the fastest flash SSDs - we can now forget all about the underlying stuff that happens inside them. Just plug in the numbers - compare to hard disk arrays or RAM SSDs - and the predicted application speedups will follow as expected.

But that would be wrong too. Because - as users are discovering when they test these newer super flash SSDs in their applications - the IOPS model gives good results in some applications (like video servers) - but is extremely unreliable at predicting speedups in other applications (like traditional database transaction processing).

The way that random write IOPS test software is written can inflate the results for flash SSDs and produce numbers which exaggerate their performance.

For hard disks - writing to a range of addresses which are spatially scattered is a good stress test - because it forces head seeks and rotational delays - and these latencies get included in the results. So random write IOPS for an HDD produces a lower numeric result than repeatedly writing to addresses in close spatial proximity.

For enterprise flash SSDs with small amounts of RAM cache (skinny flash SSDs) the opposite is true. Writing to a range of addresses which are spatially scattered is a bad stress test - because each write effectively goes to an already erased block in the storage pool (assuming active garbage collection and over provisioning). The resultant latency "measured" by the IOPS test is simply the buffer or cache latency - and is much less than the time taken to complete a write-erase cycle to the flash media. In contrast to the situation with HDDs - writing repeatedly to addresses in close spatial proximity in skinny flash SSDs produces a worse result - which can be orders of magnitude worse than that measured on the same device for "random IOPS".

If repeat writes to the same small address range (same flash blocks) are interspersed with read operations (the play it again Sam scenario - which occurs in database tables) the performance outcome varies significantly between fat and skinny flash SSDs. A fat flash SSD may produce results which more closely follow the behavior seen in HDDs and RAM SSDs - as predicted by the write IOPS spec. But performance in most skinny flash SSD designs will collapse. Not only will it be orders of magnitude lower than expected from the write IOPS spec - but it may only be 3x better than an HDD - instead of the 100x or 1,000x predicted by the spec for the product.
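The two access patterns contrasted above can be sketched as simple generators. This is only an illustration of the difference between them, not the method used by any particular benchmark tool. The function names, the 4KB block size, the 50% read fraction and the address ranges are all assumptions made for the example.

```python
import random
from typing import Iterator, Tuple

Op = Tuple[str, int]  # ("R" or "W", byte offset)


def scattered_writes(n_ops: int, device_bytes: int,
                     block: int = 4096, seed: int = 0) -> Iterator[Op]:
    """Classic random write test: block-aligned writes spread over the whole device."""
    rng = random.Random(seed)
    n_blocks = device_bytes // block
    for _ in range(n_ops):
        yield ("W", rng.randrange(n_blocks) * block)


def hot_range_mix(n_ops: int, hot_start: int, hot_bytes: int,
                  read_fraction: float = 0.5,
                  block: int = 4096, seed: int = 0) -> Iterator[Op]:
    """'Play it again Sam' pattern: reads and writes revisiting one small address range."""
    rng = random.Random(seed)
    n_blocks = hot_bytes // block
    for _ in range(n_ops):
        op = "R" if rng.random() < read_fraction else "W"
        yield (op, hot_start + rng.randrange(n_blocks) * block)


# Hypothetical sizes: a 1GB address space and a 64KB hot range.
print(list(scattered_writes(3, device_bytes=1 << 30)))
print(list(hot_range_mix(3, hot_start=0, hot_bytes=64 * 1024)))
```

The first generator is the kind of pattern that flatters a skinny flash SSD. The second revisits the same few blocks, which is the case the write IOPS spec fails to predict.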

The paradox for the potential user - is that 2 different flash SSD designs which have apparently identical throughput, latency and random IOPS specs in their datasheets - can behave in a way that is orders of magnitude different in real applications.

The sensitivity is due to the type of application and the implementation of the RAM cache algorithms in the SSD. None of these differences are explicitly stated in benchmark performance figures.

That's because traditional storage benchmarks and test suites don't model the type of behavior which correlates with common applications - and instead of stressing the weakest parts of the SSD design - they play to the tune of the SSD designer who has focused on key metrics which make the product look good in such benchmark tests.


Random R/W IOPS data for flash SSDs is a poor predictor of likely application performance in some common types of enterprise server applications. Users need to proceed cautiously when shortlisting SSD products based only on paper or benchmark evaluations. New types of SSD benchmark tests are needed which stress the weakest parts of flash SSD designs - instead of recycling storage I/O tests designed for HDDs or RAM SSDs.
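As a rough illustration of what such a test might look like at its simplest, the sketch below replays a list of read and write operations against a file and reports operations per second. It is a toy harness under stated assumptions, not a validated benchmark. The function name and the file sizes are hypothetical. Reads may be served from the operating system's page cache, and a serious test would need direct I/O against the raw device, a much longer run time and a preconditioned drive.

```python
import os
import tempfile
import time


def measure_ops_per_second(path, ops, block=4096):
    """Replay (op, offset) pairs against a file and return operations per second."""
    payload = os.urandom(block)
    count = 0
    fd = os.open(path, os.O_RDWR)
    try:
        start = time.perf_counter()
        for op, offset in ops:
            os.lseek(fd, offset, os.SEEK_SET)
            if op == "W":
                os.write(fd, payload)
                os.fsync(fd)  # ask the OS to push the write to the device
            else:
                os.read(fd, block)
            count += 1
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
    return count / elapsed if elapsed > 0 else float("inf")


if __name__ == "__main__":
    # Hypothetical 1MB scratch file; repeated writes to one block mixed with reads.
    with tempfile.NamedTemporaryFile(delete=False) as scratch:
        scratch.write(b"\0" * (1 << 20))
    try:
        pattern = [("W", 0), ("R", 4096), ("W", 0), ("R", 0)] * 50
        print(f"{measure_ops_per_second(scratch.name, pattern):.0f} ops/s")
    finally:
        os.remove(scratch.name)
```

The point of the design is that the access pattern is an input - so the same harness can be fed both a scattered write pattern and a repeat write pattern and the two results compared on the same device.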

For a unified overview of SSD architecture see - 11 key symmetries in SSD design.
Do you remember that famous misquotation from the movie Casablanca?

"Play it again Sam - as time goes by..."

Repeating write operations in some apps and some flash SSDs can take orders of magnitude longer than predicted by IOPS specifications.

Time does indeed go by - potentially discrediting a long established performance modeling metric.

Zsolt re your article

the Problem with Write IOPS - in flash SSDs
comments by:- Ron Bianchini CEO, Avere Systems
This article raises an excellent point regarding SSD performance, testing and fundamentally, their design.

The first SSD storage devices were very transparent in their design, allowing for very simple tests, like Random R/W, that could clearly highlight the differences between vendors. Modern SSD storage devices are much more complicated and include performance enhancements, like data caches, that can hide any performance deficiencies of the underlying media under certain bounded workloads.

At this point, the SSD storage device must be treated as a system, rather than as a transparent single media type.

Performance is now a function of the underlying storage media and the algorithms used to manage the various media in the system (including any internal caches).

At Avere Systems, we faced this exact problem when trying to evaluate flash SSD media for use in our Avere FXT NAS server. Without a uniform set of benchmarks, we needed to test multiple solutions to see how they performed under load in our system. Only then could we make an informed decision.

We also addressed this problem in terms of the product that we delivered to market. Our Avere FXT is a tiered NAS server that stores data in one of several tiers of media, two of which are SSD - one RAM and one flash based. We based our decision for which media to store data on application access patterns.

Most of our competitors require that the storage administrator decides when different media tiers are used for different applications.

The administrators are required to set policies that direct data to the various media. Administrator-set policies are a losing proposition as performance is completely dependent on workload at the time of the operation, primarily due to the complexity of the design of modern SSDs.

By characterizing the performance of the SSDs in our system and making the decision internally, we can use them optimally for application workload and outperform our competition.

In summary, as the complexity of SSD storage devices increases, so must the sophistication of the testing strategies used to evaluate the components. For Avere, this meant a thorough evaluation and crafting our algorithms to leverage the best of various media.

What is needed for the more general purpose user is a uniform test, or set of tests, that users can use to accurately compare products from the various vendors under typical user load. Only then can end users understand how the various components will perform in their system.
new whitepaper from Diablo looks at the "drawbacks of traditional PCIe SSDs"
Editor:- April 22, 2014 - Diablo Technologies today published a white paper Enterprise Storage Performance: it's all about the Architecture (and not just the interface) (pdf)
graph from the white paper
This paper presents Diablo's perspective - as the creator of Memory Channel Storage (an architecture which supports terabyte class flash SSDs in unmodified DDR-3 DRAM DIMM sockets and runs within the safe electrical power envelopes of those sockets).

The paper looks at the measured performance asymmetries of some specific (but unnamed) PCIe SSDs under various loads and then attempts to generalize those weaknesses as part of an argument that MCS is a better solution than PCIe SSDs - because MCS is more scalable and doesn't suffer from the same kind of architectural bottlenecks.

Editor's comments:- This paper from Diablo reminds me of the kind of comparison between different types of PCIe SSDs (and their sensitivities to different workloads) which was one of the pivotal marketing points of difference on which Virident used to hang its reputation in the years leading up to its acquisition by HGST. (Although as I often reminded readers at the time - Virident wasn't the only company with that kind of array scalability or no-compromise performance.)

Going back to Diablo's white paper - for me - like many vendor-written papers - it's good and bad in different parts.

The best bit is the middle - in which you get a reminder - from measured results that fast SSDs with similar capacities can behave differently.

The worst part in my view is the attempt to link these comparison results to a general conclusion about the merits of MCS versus PCIe SSDs.

Because I think the architecture of the controllers on the flash side of the SSDs plays such an important part - as does the software. And that's still just a small part of the picture.

My considered view is that Diablo's extrapolation towards a general market conclusion from one selected comparison example is not merited by the evidence presented in this article.

And while I am convinced of the benefits of Diablo's MCS architecture - and what it can do differently and better compared to PCIe SSDs - I think this particular paper isn't the best case they could argue.

related articles
"One million IOPS. Yawn! Is that all you've got? In fact, I would argue if the extent of your marketing message is your IOPS you don't have enough marketing talent."
Woody Hutsell - (after more than 10 years of marketing enterprise SSDs) in his first SSD blog (December 20, 2010)
This site is published by ACSL.