storage reliability

storagesearch.com

10 years - "leading the way to the new storage frontier"

Understanding Data Failure Modes in Large Solid State Storage Arrays - call for papers

June 25, 2008 - by Zsolt Kerekes editor of STORAGEsearch.com

Multi-terabyte solid state storage arrays are seeping into the server environment in the same way that RAID systems did back in the early 1990s.

But those RAID pioneers learned that there was a lot more to making a reliable disk array than stuffing a lot of PC hard disks into a box with a fan and a power supply. In the same way, multi-terabyte SSD users will discover that problems which are undetectable or do no harm in small SSDs can lead to serious data corruption risks when those same SSDs are scaled up without the right architecture - and sometimes even with it in place.

I know from the emails I get that many readers think that once they've looked at the single issue of flash endurance - they've covered all the bases for enterprise SSDs.

That's why storagesearch.com is planning to publish a collection of definitive technology articles to help guide the industry through this risky transition process. The articles will be linked from this page starting in the 4th quarter of 2008.

Users with significant storage investments need simple guidelines to help them get the best results from the different types of media they use. That's always been true in the past and will remain so in the future.

A good theoretical understanding of data failure modes is what lies behind the way that mature storage products are designed and managed. But these complex considerations can be translated into simple guides for users as the table below shows.

The new articles will provide users with the theoretical justifications they need when they are faced with the difficult economic choices that come from deploying different types of SSDs (with different cost models) in different applications within their organizations.
The first phase in the SSD market revolution was when users became aware of the potential benefits of SSDs and when these products reached price points many of them could afford.

The next phase will be when enterprise users move away from a technology-focused market (which is what they are being offered by vendors now) towards an application-specific SSD market in which they have to choose which products work best for their own specific deployments.

Users today face the dilemma of paying vastly different prices for products which are superficially similar from the capacity and IOPS point of view - but which may be vastly different in data reliability.

By a "data reliability" problem I don't mean that the SSD has failed - but that some data within the SSD array has been altered or corrupted. (And the array will continue accumulating data corruptions even if you swap in new replacement drives of the same type.)
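To make that distinction concrete, here is a minimal sketch of how altered data can be detected when no drive has reported a failure. It is an illustration only, not a description of any vendor's design: the block size, the function names (`checksum_blocks`, `scrub`) and the use of CRC-32 are all assumptions chosen for brevity. Checksums are recorded when data is written and compared again later.

```python
import zlib

BLOCK_SIZE = 4096  # hypothetical block size, chosen only for this example


def checksum_blocks(data: bytes, block_size: int = BLOCK_SIZE) -> list[int]:
    """Return one CRC-32 value per fixed-size block of data."""
    return [
        zlib.crc32(data[i:i + block_size])
        for i in range(0, len(data), block_size)
    ]


def scrub(data: bytes, expected: list[int], block_size: int = BLOCK_SIZE) -> list[int]:
    """Return the indexes of blocks whose checksum no longer matches."""
    actual = checksum_blocks(data, block_size)
    return [i for i, (a, e) in enumerate(zip(actual, expected)) if a != e]


# Checksums recorded at write time (16 KiB of sample data = 4 blocks).
original = bytes(range(256)) * 64
stored_checksums = checksum_blocks(original)

# Simulate a silent single-bit flip inside block 1; the "drive" still works.
corrupted = bytearray(original)
corrupted[5000] ^= 0x01

bad_blocks = scrub(bytes(corrupted), stored_checksums)
print(bad_blocks)
```

A check like this only detects the altered block; repairing it needs redundancy elsewhere in the array, which is an architectural question.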

The cost of data corruption varies between applications and between businesses.

Balancing risk against cost is a decision users make when they choose a supplier - even if they have not consciously analyzed the issues which matter. And choosing a more expensive supplier doesn't protect the user from being mis-sold the wrong type of product.

Many mistakes will be made by vendors and users.

For the next phase in the SSD market revolution to keep its momentum, users need guidance they can trust to help them navigate the many complex decisions which go beyond performance speedup or power saving considerations.

Megabyte's Simple List of Handling and Managing Storage Media

You've done this before. Here are some examples from past decades.

  • tape - don't forget to rewind the tape often (otherwise bad things happen) and don't cut little bits off just because you've run out of string
  • hard disk drives - don't drop the hard disk on the concrete floor, and don't store spare disks in your MRI scanner or under the magnet in your crane
  • RAM - don't zap the chips when doing memory upgrades (you look good in that nylon shirt BTW)
  • CD, DVD, UDO - wipe the jam from your sticky fingers before inserting optical media into the archive appliance
  • multi terabyte SSDs - buy the right product for the job? Think of solid state storage by application type rather than technology type (RAM SSD vs flash SSD)

That's what the new articles will help us to understand.
By publishing this call for papers I'm inviting technology experts in the industry to contribute new original papers, or let me know about suitable articles they have already written on this subject to which I can direct readers (without needing to sign up to read them).

I will also be contacting candidate authors by email.

In the meantime take a look at existing articles on this theme.
  • Can you trust your flash SSD specs? - the product which you carefully qualified may not be identical to the one that's going into your production line, because the SSD OEM has "improved" it. But the improvement makes another operating parameter - which you deeply care about - unacceptably worse.
  • Flash SSD Data Reliability and Lifetime (pdf) - starting from a description of floating gates and going all the way up to the architecture of a flash SSD, this paper includes good descriptions of data failure modes, including: erase failure, (erase) stress induced long term leakage, disturb faults, and the potential for inadequate error correction code coverage in MLC.
But don't go away with the idea that RAM SSD arrays don't have data corruption modes too. The difference may be that some long established vendors in this part of the market have been designing products which mitigate these risk factors. But that doesn't mean to say that new market entrants know what they should be doing.

Even big oems can make elementary mistakes which cost billions of dollars of lost sales - as I described in my 2001 article Looking Back on Sun's Cache Memory Problem.


STORAGEsearch is published by ACSL