the Problem with Write IOPS
- in flash SSDs
Repeating write operations in some apps and
some flash SSDs can take orders of magnitude longer than predicted by IOPS
specs. Time does
indeed go by - potentially discrediting a long established performance
modeling metric.
"Random IOPS" has been long
established as a useful concept for modeling server application performance
and bottlenecks
when using different types of storage.
As long ago as 1993 -
the world's 1st
NAS company
Auspex Systems quoted a
figure of "675 NFS IOPS" to illustrate the capability of its NS 3000
SPARC based server.
And
in 1995
a RAM SSD
manufacturer called CERAM was quoting a figure of "2,000 IOPs" for
its SPARC compatible SSD.
Random R/W IOPS had become a very common
metric used in the marketing parlance of high end SSD makers by
2003.
And the good thing was... it didn't matter if your background was in
rotating disk storage - because the figures being quoted gave a realistic idea
of what you could achieve if you used SSDs instead.
In those days -
nearly all enterprise SSDs were "RAM SSDs" - and like hard disk arrays
- they had symmetric
R/W latencies and throughput performance. Therefore vendors didn't have to
differentiate between "read IOPS" and "write IOPS" - it was
rightly assumed that they were very similar.
SSD IOPS - start of the
slippery slope to a devalued metric
In
2007 the
SSD IOPS performance picture was starting to get hazy - with some
flash SSD makers
quoting high IOPS specs which actually exceeded those of entry level RAM
SSDs.
This seemed too good to be true - especially as the price
difference in those days (between RAM and flash SSDs) was more than 20 to 1 for
the same capacity.
So I asked some leading SSD makers from both sides
of the fence to contribute short pieces for a very popular article called -
RAM SSDs versus
Flash SSDs. That article (published in August 2007) listed important
factual data about the state of the art. Data based on real products was
compared for RAM SSDs, flash SSDs and hard disks.
2 important points
were clarified by that article
1 - flash SSDs (generally) had much
worse write IOPS than read IOPS - and the way that some flash SSD vendors
quoted a "single" IOPS metric was confusing or misleading.
Following
publication of this article StorageSearch.com adopted a new editorial
practice of differentiating whether IOPS figures quoted in vendor press releases
and datasheets were really "read", "write" or "combined"
(based on an assumed R/W ratio). Before that - some vendors had misleadingly
just quoted read IOPS (the best figure) without attaching the important "read"
appellation.
2 - for those already experienced with IOPS in HDDs and
RAM SSDs who wanted to model and predict flash SSD performance - the key metric
to look at was "write IOPS".
The critical effect of write
IOPS on overall application performance - was analyzed in a sub-article - by
Douglas Dumitru, CTO EasyCo - called
Understanding
Flash SSD Performance (pdf). In various tables Dumitru pointed out the
sensitivity of overall performance to write IOPS for various R/W mixes
encountered in real applications.
Back in those days - the ratio of
read to write IOPS for commercially available flash SSDs could be anywhere in
the region from 10-to-1 to 100-to-1 due to the inherent asymmetries in NAND
flash memory.
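To see why write IOPS dominates a mixed workload, here is a minimal sketch. It is not taken from Dumitru's tables. It assumes a simple model in which operations are served one at a time, so the average time per operation is the weighted sum of the read and write service times. The function name `effective_iops` and the 10,000 read IOPS figure are hypothetical, chosen only for illustration.

```python
def effective_iops(read_iops: float, write_iops: float, read_fraction: float) -> float:
    """Blended IOPS for a read/write mix, assuming operations are served serially.

    The average time per operation is the weighted sum of the per-read and
    per-write service times, so the blended rate is a weighted harmonic mean.
    """
    if not 0.0 <= read_fraction <= 1.0:
        raise ValueError("read_fraction must be between 0 and 1")
    write_fraction = 1.0 - read_fraction
    seconds_per_op = read_fraction / read_iops + write_fraction / write_iops
    return 1.0 / seconds_per_op


# Hypothetical devices: identical read IOPS, different read-to-write ratios.
READ_IOPS = 10_000.0
for ratio in (1, 10, 100):
    write_iops = READ_IOPS / ratio
    for read_fraction in (0.9, 0.7, 0.5):
        blended = effective_iops(READ_IOPS, write_iops, read_fraction)
        print(f"R/W ratio {ratio:>3}:1, {read_fraction:.0%} reads -> {blended:,.0f} IOPS")
```

Under this model, even a workload that is 90% reads falls to roughly 9% of the quoted read IOPS when the device has a 100-to-1 R/W ratio. That is why the write figure is the one to rank by.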
Quoting
"read IOPS" and "write IOPS" clears things up - but not for
long
For me as editor - that seemed to settle the issue for a
while. As long as stories related to enterprise flash SSDs always included
separate figures for read and write IOPS - (or pointed out when they were
missing) experienced readers could do useful mental filtering between different
types of products.
Personally I found it a useful shortcut to
ignore read IOPS data entirely and just focus on write IOPS to rank the
fastest products.
I assumed that the ratio of R/W IOPS - in the fastest flash SSDs - might
still improve in ensuing years - but would hover in the range 5 or 10 to 1.
But I was wrong.
Little more than a year later (in
November 2008)
- Violin Memory narrowed
that gap to 2 to 1 in a 2U SLC SSD which had over 200K random Read IOPS and
100K random Write IOPS. And then, in November 2008,
Fusion-io started
talking about 1 to 1 R/W IOPS in one of its
PCIe SSDs.
And
this R/W IOPS symmetry soon after made an appearance in
2.5" SSDs - when
in April 2009 -
SandForce unveiled
its SF-1000 family of SSD
Processors with 30,000 R/W IOPS.
Where Are We Now?
You
might think that - with R/W IOPS symmetry in the fastest flash SSDs we can
now forget all about the underlying stuff that happens inside them. Just plug in
the numbers - compare to hard disk arrays or RAM SSDs - and the predicted
application speedups will follow as expected.
But that would be wrong
too. Because - as users are discovering when they test these newer super flash
SSDs in their applications - the IOPS model gives good results in some
applications (like video servers) - but is extremely unreliable at predicting
speedups in other applications (like traditional database transaction
processing).
The way that random write IOPS test software is written
can inflate the results for flash SSDs and produce
numbers which exaggerate real performance.
For hard disks - writing to a
range of addresses which are spatially scattered is a good stress test -
because it forces head seeks and rotational delays - and these
latencies get included in the results. So random write IOPS for an HDD produces
a lower numeric result than repeatedly writing to addresses in close spatial
proximity.
For enterprise flash SSDs with small amounts of RAM cache
the opposite is true. Writing to a range of addresses which are spatially
scattered is a bad stress test - because each write effectively goes to an
erased block in the storage pool. (Assuming active garbage collection and over
provisioning.) The resultant latency "measured" by the IOPS test is
simply the buffer or cache latency - and is much less than the time taken to
complete a write-erase cycle to the flash media. In contrast to the situation
with HDDs - writing repeatedly to addresses in close spatial proximity in
skinny flash SSDs
produces a worse result - which can be orders of magnitude worse than measured
on the same device for "random IOPS".
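The two access patterns contrasted above can be sketched as offset generators. This is only an illustration of the address streams a test might issue. It performs no I/O, and the 4KB block size, the function names and the device and range sizes are all assumptions, not taken from any particular benchmark tool.

```python
import random
from typing import Iterator

BLOCK_SIZE = 4096  # assumed I/O size in bytes


def scattered_offsets(device_bytes: int, count: int, seed: int = 0) -> Iterator[int]:
    """Block-aligned offsets spread uniformly over the whole device."""
    rng = random.Random(seed)
    total_blocks = device_bytes // BLOCK_SIZE
    for _ in range(count):
        yield rng.randrange(total_blocks) * BLOCK_SIZE


def hot_range_offsets(start: int, range_bytes: int, count: int, seed: int = 0) -> Iterator[int]:
    """Block-aligned offsets that keep revisiting one small address range."""
    rng = random.Random(seed)
    hot_blocks = max(1, range_bytes // BLOCK_SIZE)
    for _ in range(count):
        yield start + rng.randrange(hot_blocks) * BLOCK_SIZE


DEVICE_BYTES = 64 * 1024**3  # hypothetical 64GB device
HOT_BYTES = 1024**2          # hypothetical 1MB hot range

scattered = list(scattered_offsets(DEVICE_BYTES, 10_000))
hot = list(hot_range_offsets(0, HOT_BYTES, 10_000))
print("distinct blocks touched, scattered:", len(set(scattered)))
print("distinct blocks touched, hot range:", len(set(hot)))
```

The scattered stream rarely revisits a block, so nearly every write can land in a fresh erased block. The hot-range stream rewrites the same few hundred blocks over and over - which is the case a conventional "random IOPS" test never exercises.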
If repeat writes to
the same small address range (same flash blocks) are interspersed with read
operations (the play it again Sam scenario - which occurs in database
tables) the performance outcome varies significantly between fat and skinny
flash SSDs. A fat flash SSD may produce results which more closely follow the
behavior seen in HDDs and RAM SSDs - as predicted by the write IOPS spec. But
performance in most skinny flash SSD designs will collapse. Not only will it be
orders of magnitude lower than expected from the write IOPS spec - but it may
only be 3x better than an HDD - instead of 100x or 1,000x - predicted by the
spec for the product.
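A rough sketch of a "play it again Sam" test loop follows. It is not a real benchmark. It assumes a Unix-like system (for `os.pwrite` and `os.pread`), and it uses a scratch file in place of the SSD, so the operating system's cache and file system sit between this code and any flash. A serious test would target the device itself and bypass those layers. The function names and sizes are hypothetical.

```python
import os
import tempfile
import time
from typing import List

BLOCK_SIZE = 4096  # assumed I/O size in bytes


def time_rewrites_with_reads(fd: int, offsets: List[int], reads_per_write: int = 1) -> List[float]:
    """Rewrite each offset in turn, read it back in between, and time every write."""
    payload = os.urandom(BLOCK_SIZE)
    latencies = []
    for offset in offsets:
        started = time.perf_counter()
        os.pwrite(fd, payload, offset)
        os.fsync(fd)  # ask the OS to push the write down to the device
        latencies.append(time.perf_counter() - started)
        for _ in range(reads_per_write):
            os.pread(fd, BLOCK_SIZE, offset)  # interleaved read of the same block
    return latencies


def demo(writes: int = 32, hot_blocks: int = 4) -> List[float]:
    """Run the loop against a scratch file, cycling over a few hot blocks."""
    with tempfile.TemporaryFile() as scratch:
        scratch.truncate(hot_blocks * BLOCK_SIZE)
        offsets = [(i % hot_blocks) * BLOCK_SIZE for i in range(writes)]
        return time_rewrites_with_reads(scratch.fileno(), offsets)


if __name__ == "__main__":
    results = demo()
    print(f"{len(results)} writes, worst latency {max(results) * 1e3:.3f} ms")
```

The point of recording every write latency, rather than one average, is that the collapse described above shows up as a long tail of slow writes once the same blocks have to be rewritten.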
The paradox for the potential user - is that 2
different flash SSD designs which have apparently identical throughput, latency
and random IOPS specs in their datasheets - can behave in a way that is orders
of magnitude different in real applications.
The sensitivity is due to
the type of application and the implementation of the RAM cache algorithms in
the SSD. None of these differences are explicitly stated in benchmark
performance figures.
That's because traditional storage benchmarks
and test suites don't model the type of behavior which correlates with common
applications - and instead of stressing the weakest parts of the SSD design -
they play to the tune of the SSD designer who has focused on key metrics which
make the product look good in such benchmark tests.
Conclusions
Random
R/W IOPS data for flash SSDs is a poor predictor of likely application
performance in some common types of enterprise server applications. Users need
to proceed cautiously when shortlisting SSD products based only on
paper or benchmark evaluations. New types of SSD benchmark tests are needed
which stress the weakest parts of flash SSD designs - instead of recycling
storage I/O tests designed for HDDs or RAM SSDs.
For a unified overview
of SSD architecture see -
11 key symmetries in
SSD design.
comments by:- Ron Bianchini, CEO, Avere Systems
This article raises an
excellent point regarding SSD performance, testing and fundamentally, their
design.
The first SSD storage devices were very transparent in their design,
allowing for very simple tests, like Random R/W, that could clearly highlight
the differences between vendors. Modern SSD storage devices are much more
complicated and include performance enhancements, like data caches, that can
hide any performance deficiencies of the underlying media under certain bounded
workloads.
At this point, the SSD storage device must be treated as a system,
rather than as a transparent single media type. Performance is now a function
of the underlying storage media and the algorithms used to manage the various
media in the system (including any internal caches).
At Avere Systems, we faced this exact problem when trying to evaluate
flash SSD media for use in our Avere
FXT NAS server.
Without a uniform set of benchmarks, we needed to test multiple solutions to see
how they performed under load in our system. Only then could we make an
informed decision.
We also addressed this problem in terms of the product that we
delivered to market. Our Avere FXT is a tiered NAS server that stores data in
one of several tiers of media, two of which are SSD - one RAM and one flash
based. We based our decision for which media to store data on application access
patterns. Most of our competitors require that the storage administrator
decides when different media tiers are used for different applications. The
administrators are required to set policies that direct data to the various
media. Administrator-set policies are a losing proposition as performance is
completely dependent on workload at the time of the operation, primarily due to
the complexity of the design of modern SSDs. By characterizing the performance
of the SSDs in our system and making the decision internally, we can use them
optimally for application workload and outperform our competition.
In
summary, as the complexity of SSD storage devices increases, so must the testing
strategies used to evaluate the components. For Avere, this meant a thorough
evaluation and crafting our algorithms to leverage the best of various media.
What is needed for the more general purpose user is a uniform test, or set of
tests, that users can use to accurately compare products from the various
vendors under typical user load. Only then can end users understand how the
various components will perform in their system.
re multi-million IOPS SSD marketing
Editor:- December 22, 2010 -
Woody Hutsell
- who for a decade led the enterprise SSD marketing business at Texas Memory Systems
- and who recently joined ViON
has recently started a blog about SSDs.
His first article bemoans the
current trend of marketers to
quote
ever higher millions of IOPS - a marketing tactic - which he freely admits
he started back in his days at TMS.
In this entertaining and thought
provoking article Woody says - "One million IOPS. Yawn! Is that all you've
got! In fact, I would argue if the extent of your marketing message is your
IOPS you don't have enough marketing talent." ...read
the article
Write performance versus free capacity in PCIe flash SSDs
In December 2010 - an
article on
SSDperformanceblog.com
discussed the write performance of
Virident's PCIe SSD
- the tachIOn - which
was measured at various percentages full.
The author - Vadim Tkachenko - says "Virident's
card maintains very good throughput level in close to full capacity mode, and
that means you do not need to worry ( or worry less) about space reservation or
formatting card with less space."
In an
earlier
test which Vadim Tkachenko had done on a
DUO PCIe SSD from
Fusion-io - the
results showed a greater dependence of the write performance on the amount of
used capacity - or in effect the amount of
over-provisioning.
the 3 fastest PCIe SSDs?
Are you tied up in
knots trying to shortlist flash SSD accelerators ranked according to
published comparative benchmarks?
You know the sort of thing I mean -
where a magazine compares 10 SSDs or a blogger compares 2 SSDs against each
other. It would be nice to have a shortlist so that you don't have to waste too
much of your own valuable time testing unsuitable candidates wouldn't it?
StorageSearch's long running
fastest SSDs directory
typically indicates 1 main product in each form factor category but those
examples may not be compatible with your own ecosystem.
If so a
new article -
the 3 fastest PCIe
SSDs list (or is it really lists?) may help you cut that Gordian
knot. Hmm... you may be thinking that StorageSearch's editor never gives easy
answers to SSD questions if more complicated ones are available.
But in this case you'd be
wrong. (I didn't say you'd like the answers, though.) ...read the article