
Why can consumers expect to see more flaky flash SSDs?

The 1st version of this article was published in 2009

4 years later in 2013 (as I predicted) things haven't got much better


Zsolt Kerekes editor
another new season for that depressing consumer SSD saga

Editor:- October 18, 2011 - Like anyone who makes a lot of market predictions I'm delighted if any of them come true - but there are 1 or 2 cases where I would be just as happy to be proved wrong - in particular - on the subject of consumer SSD reliability.

2 years ago I wrote an article called - Why can consumers expect to see more flaky flash SSDs? - which had the sub-headline - "You need to stay vigilant because it's not going to get better anytime soon." (That earlier article is at the bottom of this page.)

I get a lot of emails about this subject - which is threatening to tarnish the reputation of the whole SSD market - and not just the small part which is consumer drives.

Today I got an email from a reader who told me that out of 13 client SSDs he'd bought 7 months ago "4 have died so far."

He gave me a link to a blog on codinghorror.com which illustrates the soul searching and frustration that SSD unreliability is causing to so many thoughtful people who want to get the speedup advantages of SSDs but are rightly anxious about the flaky reputation of consumer SSDs.
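To put the reader's numbers in perspective, here is a minimal Python sketch of the arithmetic: 4 failures out of 13 drives in 7 months, scaled to a yearly rate. The function names are made up for this illustration, and the scaling assumes a constant monthly failure probability, which is an assumption of the sketch and not something the reader's report establishes. With only 13 drives the result is an anecdote, not a statistic.

```python
# Back-of-envelope failure-rate arithmetic for the reader's report:
# 13 drives bought, 4 dead after 7 months. The constant-hazard
# assumption below belongs to this sketch, not to the article.

def cumulative_failure_fraction(failed: int, total: int) -> float:
    """Fraction of the drive population that has failed so far."""
    return failed / total


def annualized_failure_rate(failed: int, total: int, months: float) -> float:
    """Scale an observed failure fraction to a 12 month period.

    Assumes every surviving drive has the same chance of dying in each
    month (a constant hazard), which real SSDs need not follow.
    """
    survival = 1.0 - cumulative_failure_fraction(failed, total)
    return 1.0 - survival ** (12.0 / months)


if __name__ == "__main__":
    observed = cumulative_failure_fraction(4, 13)
    afr = annualized_failure_rate(4, 13, 7)
    print(f"failed after 7 months: {observed:.1%}")
    print(f"annualized (constant-hazard assumption): {afr:.1%}")
```

Run as a script, this reports about 31% failed after 7 months, and roughly 47% when scaled to a year under that assumption.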

I agree this is a lamentable state of affairs - which needs some explanations. This is a tidied up version of what I said.

SSDs aimed at the consumer market are designed to deliver basic functionality at the lowest price. That means the designers (originally due to ignorance – but nowadays with foreknowledge) have to decide what shortcuts they can take in the production process and what design factors they can leave out to reduce the price - compared to a reliable industrial / military / enterprise grade SSD.

There are countless techniques they can use to get the cost down.
  • Shutting off reliability features in the controller. For example the SandForce SF-2200 controller (launched in Feb 2011 and optimized for consumer markets) has an option which enables oems to deliver an SSD with a smaller or larger usable capacity when using exactly the same set of flash chips. The bigger capacity sounds like it's better value for money to the consumer – but they are losing some of the RAISE protection, so the SSD won't survive the failure of an entire flash chip. And that's just the tip of the SSD capacity iceberg.
  • Using cheaper components in the power loss management system. The consequences are that the trigger events to save data may come at the wrong time or that the capacitors don't hold enough charge to maintain reliable operation for vital data saves – because they have drifted out of tolerance. That's before considerations like whether the controller has an intrinsic foolproof auto recovery architecture in the first place.
  • Using no-name cheap flash memory. Flash quality can differ many times over between the best and worst manufacturers. Also if the memory is unknown – the controller parameters may not be set up correctly for it – leading to wrong handling by the controller.
  • Saving time and cost on testing the design. Many consumer SSD products aren't adequately validated before they're shipped. That's why you hear about firmware upgrades and recalls. It's expensive for SSD makers to invest in comprehensive tests before they ship - and some consumer SSD marketers worry that if they delay their launches they run the risk of losing market share.
When companies want to design reliable SSDs – for non consumer markets – there are many additional steps they take in their processes – such as qualifying the memory to see which is the best, allocating more memory to act as hot spares for defective blocks, using better reliability architecture, and burn-in and functional test before shipping.
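
The capacity trade-off described above (the same flash chips sold at a larger or smaller usable capacity, with the difference held back as hidden spare area) can be sketched as simple arithmetic. Everything in this Python example is hypothetical: the 128 GiB of raw flash, the 8-die layout, the two label capacities and the function names are illustrative assumptions, not figures for any real controller or drive, and the "one die's worth of spare" check is only a crude proxy for whether chip-level redundancy could fit.

```python
# Illustrative only: the capacities and die count are made-up numbers,
# not the specification of any real controller or SSD.

GIB = 2**30   # flash chips are sized in binary units
GB = 10**9    # drive labels use decimal units


def spare_bytes(raw_bytes: int, usable_bytes: int) -> int:
    """Flash hidden from the user (spare blocks, redundancy, housekeeping)."""
    if usable_bytes > raw_bytes:
        raise ValueError("usable capacity cannot exceed raw flash")
    return raw_bytes - usable_bytes


def overprovisioning_ratio(raw_bytes: int, usable_bytes: int) -> float:
    """Hidden capacity expressed as a fraction of the usable capacity."""
    return spare_bytes(raw_bytes, usable_bytes) / usable_bytes


def can_reserve_one_die(raw_bytes: int, usable_bytes: int, dies: int) -> bool:
    """True if the hidden capacity is at least one whole die's worth.

    A crude proxy for having room for chip-level redundancy; a real
    controller also needs spare area for bad blocks and garbage collection.
    """
    return spare_bytes(raw_bytes, usable_bytes) >= raw_bytes // dies


if __name__ == "__main__":
    raw = 128 * GIB   # hypothetical drive: 8 dies of 16 GiB each
    for label_gb in (120, 128):
        usable = label_gb * GB
        ratio = overprovisioning_ratio(raw, usable)
        fits = can_reserve_one_die(raw, usable, dies=8)
        print(f"{label_gb} GB label: {ratio:.1%} hidden, "
              f"room for one die of redundancy: {fits}")
```

In this made-up case the 120 GB version hides enough flash to cover one die and the 128 GB version does not, which is the kind of difference a buyer cannot see on the label.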

But even when all those precautions are taken - expensive SLC flash enterprise SSDs can fail too. The difference in the enterprise is that the data is more likely to be backed up and the storage system is likely to be protected by a RAID-like or fail-over architecture which means that life can go on without too much disruption.

If it's any consolation – the hard disk industry was even worse at one time. In 1986 when I was designing a demo RAID controller - most of the brand new drives I got for the project arrived with serious faults.

What is the SSD industry doing to improve the state of the art? You can get an idea of who's doing what in the SSD reliability papers.


Why can consumers expect to see more flaky flash SSDs? - the 2009 version

Editor:- August 10, 2009 - Intel has been in the computer news in recent weeks for suspending further shipments of its new X25-M - SATA 2.5" MLC flash SSD due to a serious design problem.

Potential customers were advised that shipments will resume after what is euphemistically called - a "firmware upgrade".

This isn't the 1st time that Intel has shipped a flaky SSD, and it's not the 1st time that a flash SSD manufacturer has shipped products where the design was incompletely verified (or specified) in the 1st place - requiring a frantic firmware upgrade to make its operational use more satisfactory.

And it won't be the last either. These stories have become commonplace. And because the latest Intel fiasco wasn't a surprise - it didn't rate more than a footnote in these pages.

Newcomers to StorageSearch.com may be shocked by the frequency with which the storage market's reputation is being splatted by the residue from so many unreliable new products. And by that I mean - products whose operation you cannot rely on to be what you reasonably expected - instead of the narrower meaning of "products which are dead on arrival" - or which terminally cease operation due to some form of wear-out, environmental or age related process.

I named storage reliability as 1 of the 3 most important future trends in my state of the storage market article published in 2005. In that article I also predicted that uncorrectable failures in storage systems (due to embedded design assumptions made in earlier generations) could, if not dealt with by drive and interface designers, pose a more serious threat to enterprise computer systems than the Y2K bug in the late 1990s.

It's reasonable to ask - why has the flash SSD market gained such a poor reputation?

I explained why users shouldn't trust published flash SSD benchmarks in an article published a year ago - in which I discussed the technical environment and specific reasons related to performance. But it's clear from readers' emails that many concerned SSD consumers don't have the technical background to understand much of the content in this and the more detailed articles comparing MLC and SLC, RAM and flash SSDs etc. I often have to explain that when it comes to SSDs - StorageSearch.com isn't aimed at the consumer market - but at enterprise users and oem specifiers of SSD technology who - in the course of their vendor qualification - invest a lot of time and resources into learning about SSDs - before making a commitment which could cost them their jobs or business if the choice turns out to be wrong. When we started our intense SSD coverage more than 10 years ago, a typical flash SSD cost $50K and a typical rackmount cost an order of magnitude more.

Why do you have to be so "extremely" careful to understand the internals of a flash SSD?

You probably don't - if all you're doing is buying a single notebook SSD for a single notebook. (Treading "very" carefully - will suffice.) You definitely do need to exercise extreme caution - if you're the person designing SSDs into a new notebook or new SSD storage array, IPTV server, voicemail server, defense appliance or search-engine architecture. That's only a small part of the spectrum of SSD questions I get asked by readers - and why I try to avoid giving simplistic answers. Until now...

Here's a simplistic explanation of why you have to be careful about understanding flash SSD technology before you deploy it in serious apps, and why SSD vendors will continue shipping flaky SSDs and then recalling them or fixing them after the sale for several more years...

Flash SSDs are solid state - but different to processors because... When Intel or AMD or Sun design a new processor - they run test suites on the new design - which encapsulate market knowledge amassed during more than 20 years. These verification suites simulate the kinds of loads which are common and uncommon in the market. When a new processor emerges into the market - the design is already compatible with what the market expects - because the test suite defines the product. Similar test suites have been developed by all hard drive manufacturers - which embody an awareness of all the tricky, quirky ways that operating systems and applications are going to hit the hard drive - and how it is expected to respond.

There are no such industry test suites for flash SSDs. That's because a flash SSD can look like a disk drive in some contexts - and it can look like a processor accelerator in others. The range of applications for SSDs is much greater than for hard drives - and SSDs have many different ways they can implement the same features depending on which market they were designed for. One reason why the flash SSD market has been going wrong with so many new products is that most consumer SSD manufacturers have simply re-used their hard drive test suites to validate their new SSDs. That has put pressure on designers to tweak the SSD controllers to make their products look good in benchmarks (and look more saleable). But the test suites don't test the weaknesses of the SSDs - only the strengths.

There are 2 general exceptions.

Flash SSDs designed specifically for the industrial oem market tend to be better designed - because these vendors know their products will be hammered by lengthy customer evaluations before being deployed. Many industrial flash SSD oems are used to testing their products with the entire code of their customers' embedded products. Over many years they have built up experience of the weak parts of their products - and either adapted the firmware in their SSDs - or suggested ways in which their customers can change their software to work better with their SSDs.

Rackmount flash SSD arrays designed for enterprise server acceleration include a span of products and companies - whose applications experience ranges from nil to decades. But I can offer a simple solution to shortlisting an SSD supplier here.

In the world's first comprehensive survey of What SSD Users Want - instigated by StorageSearch.com in 2004 - we posed the question - "What would make it easier for you to buy SSD technology and remove doubts and risks which currently act as roadblocks?" - The top 2 factors quoted in replies were performance guarantees and try before you buy. Some enterprise SSD oems seeing this market feedback were quick to adapt these concepts to the way they did business.

My advice? - If you're planning to make a big enterprise SSD purchase - tell suppliers that you'll only consider their products if they offer you a money back performance guarantee (which they can easily do if they have enough experience with your type of application) or ask them if they will let you "try before you buy" - (if your application environment is unusual and outside the scope of their speedup models).

How can consumers and SMBs navigate around SSD landmines?

If you're a consumer or small business looking at a modest spend on flash SSDs it's probably unrealistic for you to invest the resources to learn about this technology and safely qualify products for yourself. As I've already said above - you can't trust magazine reviews either. They should just be regarded as an indication of what is possible - rather than a guarantee of what you'll see in practice.

My advice is - talk to a specialist SSD reseller.

I know of fewer than 10 SSD VARs worldwide who have been focusing exclusively on SSDs for consumers as their primary business for many years - but I'm not an expert on VARs. There may be more.

It's counterproductive for SSD VARs to recommend products which are difficult to get hold of - or which have high return rates. Tell them what is important to you - and ask what they recommend. Ask them how long they've been in the SSD market too. If it's less than 2 years - go somewhere else. One way you can independently verify what they say - is to check web references to their SSD activities - including for example their website listings in past years in the Internet Archive.

Will it get easier to navigate the SSD market in future?

Yes. Sure. But that could be another 5 years in the future. I think the SSD market will get a lot more complicated and confusing before it gets any simpler.

To help you understand what's going on and see the future clearly I hope you'll come back to StorageSearch.com as we continue our long term mission - of "leading the way to the new storage frontier".

SSD Review exposes how rebranded memory can adulterate consumer SSDs
Editor:- February 18, 2013 - the SSD Review recently published an in-depth article which shows how the memory chips in consumer SSDs - which appear to come from one source - may actually have come from somewhere else.

The article - by Les Tokar, Editor-in-Chief of the SSD Review - reads at times like a gripping detective story - and looks into the murky topic of remarking and rebranding flash chips - which can lead to adulteration and quality problems in the memory supply chain - all in pursuit of getting the lowest manufacturing cost.

These problems and risks have been well known in expert SSD circles but Les Tokar's new exposé brings this shadow world into vivid focus.
.
..."Not every manufacturer takes product quality seriously.
When an SSD manufacturer tries to downgrade Nand Flash to
lower the price and impress consumers, they also pass on the risk
of data loss to consumers."
...Email from Renice Technology (September 2011) warning about
buying SSDs from oems which don't test and qualify the quality
and compatibility of their raw flash suppliers.
.
"If Intel's SSD design business was a horse - it would have been shot a long time ago and put out of its misery..."
...Editor commenting to a reader (July 2011) about reports of yet another flaky design problem with Intel SSDs - this time related to power cycling.
.
Surviving SSD sudden power loss
Why should you care what happens in an SSD when the power goes down?

This important design feature - which barely rates a mention in most SSD datasheets and press releases - has a strong impact on SSD data integrity and operational reliability.

This article will help you understand why some SSDs (which work perfectly well in one type of application) might fail in others... even when the changes in the operational environment appear to be negligible.
SSD power down architectures and characteristics - If you thought endurance was the end of the SSD reliability story - think again.
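
As a rough illustration of why hold-up capacitors that drift out of tolerance matter, the following Python sketch estimates how long a capacitor bank can keep an SSD powered after the supply drops. It uses the standard capacitor energy formula E = ½·C·(V_start² − V_min²). All component values, the 5 ms flush time and the function names are hypothetical assumptions chosen for the example, not taken from any real design.

```python
# Hold-up time estimate for a power-loss capacitor bank.
# Every number below is a made-up example value, not a real SSD spec.

def usable_energy_joules(capacitance_f: float, v_start: float, v_min: float) -> float:
    """Energy released as the capacitor discharges from v_start to v_min."""
    return 0.5 * capacitance_f * (v_start**2 - v_min**2)


def holdup_time_s(capacitance_f: float, v_start: float, v_min: float,
                  load_w: float, efficiency: float = 1.0) -> float:
    """Seconds the load can run from the capacitor alone.

    Assumes a constant power draw and a fixed converter efficiency,
    which simplifies what real power-loss circuits do.
    """
    energy = usable_energy_joules(capacitance_f, v_start, v_min)
    return energy * efficiency / load_w


if __name__ == "__main__":
    flush_time_s = 0.005        # hypothetical time needed to save vital data
    nominal_c = 0.002           # 2000 uF when new (hypothetical)
    aged_c = nominal_c * 0.6    # same bank after losing 40% of its capacitance

    for label, c in (("nominal", nominal_c), ("aged", aged_c)):
        t = holdup_time_s(c, v_start=5.0, v_min=3.0, load_w=2.0, efficiency=0.85)
        verdict = "enough" if t >= flush_time_s else "NOT enough"
        print(f"{label}: {t * 1000:.2f} ms hold-up, {verdict} for a "
              f"{flush_time_s * 1000:.0f} ms flush")
```

With these example values the new capacitor bank covers the flush with some margin, while the aged one falls short, even though nothing else in the design has changed.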
.
pushing the SSD testing rock farther up the hill
Editor:- August 25, 2010 - I'm mostly resistant to the idea of rehashing recent news stories - but yesterday while talking about new SSD technologies a reader asked me to take another look at SNIA's SSD performance testing guidelines - which I reported on a month ago.

I said I had been surprised it took ORGs like SNIA so long to look at these issues - because I had been aware of "Halo effects" in flash SSD benchmarks for years - and commented - "But I guess member led ORGs have a built in lag factor and only move at the speed of the slowest exec members."

The reader - Neal Ekker - whom I knew from his time at Texas Memory Systems - put up a spirited defense for this particular ORG's opus and said...

"...We've all known about the fishy-ness of SSD performance claims for years. But I'd like to draw attention to what an impressive accomplishment the SNIA SSS PTS represents, no matter its technical merits or ramifications. I watched it happen, and I can tell you it was an amazing POLITICAL achievement. And I don't mean that in a negative way. Any time there's more than one person in a room, there's politics. For a collection of engineers representing both their own egos and the interests of their employers to finally agree on even this rather bare-bones beginning standard was just remarkable to observe. I can't begin to give enough credit to some of the chief movers and shakers."

Neal Ekker added - "This is why I want more attention focused on the SSS PTS right now, so we don't lose momentum entirely. There's still plenty of work to be done. We need additional companies and fresh faces and energies to step up and push this rock a little farther up the hill."

Editor's comments:- During the majority of the SSS PTS development Neal Ekker served as the SNIA SSSI Education Committee Chair. He's now a for-hire independent SSD marketing consultant.
.
flash SSD capacity - the iceberg syndrome
Have you ever wondered how the amount of flash inside a flash SSD compares to the capacity shown on the invoice?

What you see isn't always what you get.
There can be huge variations in different designs as vendors leverage invisible internal capacity to tweak key performance and reliability parameters.
.
SSD Data Recovery Concepts and Methodologies
It's hard enough understanding the design of any single SSD. And there are so many different designs in the market.
If you've ever wondered what it looks like at the other end of the SSD supply chain - when a user has a damaged SSD which contains priceless data with no usable backup - this article - written by Jeremy Brock, President, A+ Perfect Computers - who is one of a rare new breed of SSD recovery experts - will give you some idea.

StorageSearch.com is published by ACSL