
Size matters!

Big versus Small in SSD controller architecture

by Zsolt Kerekes, editor - - June 24, 2011

The memory chip count ceiling (and / or layers count ceiling - see sidebar) around which the SSD controller IP is optimized - predetermines the efficiency of achieving system-wide goals like cost, performance and reliability.

All enterprise flash SSDs (and those built using other types of non volatile memories too) can be said to belong to one of two fundamental architectural groups which I call - "big" and "small".

(Who said my SSD articles were complicated?)

For simplicity you can think about the dividing line between these as follows.
  • small SSD architecture - the controller is designed to work with a small number of flash memory chips - 10 or less.

    It will work with more too - but the dice have already been rolled to predict everything else which will follow on from the "small" architecture decision.
  • big SSD architecture - the SSD design has been optimized for a typical installed set of maybe 1,000 memory chips.

    It could be a lot less - maybe even as few as a hundred - or it can be more (many thousands of chips).

    Once again - the "big" architecture design decision sets the pattern for everything else about the SSD.

    In big designs it's common for the vendors to talk about having multiple SSD controllers.

    In such designs the controllers don't have to be identical.

    It's not unusual for there to be a hierarchy of controllers - with some having more responsibility for low level data movement and others (with more awareness of the total population of flash) doing different things with that big picture intelligence.
What are the characteristics of the 2 SSD types above?

physical size

Not surprisingly "small" architecture SSD controllers are the essential building blocks for physically small SSDs - ranging from SSDs on a chip - up through to 2.5" and 3.5" form factors. Due to their small footprint they are the most flexible to deploy. And they don't stay in those slots as singles. They also can appear in arrays on cards like PCIe SSDs and in rackmount SSD systems too.

Meanwhile - most - but not all - large architecture SSDs started life in rackmounts and (some) worked their way down into PCIe SSDs too.

The table below gives you some examples of the best known companies in each of these segments.
SSD vendor examples
big vs small flash controller architecture
small: Seagate (SandForce acquisition - 1st and 2nd generation controllers), and most controllers used in consumer, industrial and military SSDs
big: Baidu's SDF Software-Defined Flash, BiTMICRO (TALINO), EMC (DSSD and XtremIO acquisitions), HGST (Skyera and Virident Systems acquisitions), IBM (Texas Memory Systems acquisition), Mangstor, Radian, SanDisk (Fusion-io acquisition), Violin Memory
It's interesting that companies which sit together in the "big" set here fall into disjoint sets when you view them through the legacy vs new dynasty classification.

And they fall into different sets again when you look at the RAM flash cache classifications - where SandForce and Fusion-io come together in the "skinny" category.

Understanding those differences tells you everything you need to know about the strengths and weaknesses of these products in different markets and applications.

Of course most of you don't know enough about computer architecture, performance optimization, flash memory physics, reliability and the SSD market to model how those factors interact - and there's no reason why you should. But without understanding those things you take a big gamble every time you choose an SSD supplier or company to invest in.

Fortunately - the law of large numbers means that if enough people make similar choices - then those become the right decision - because the market is a good filter to all new design projects and business plans.

But if you're still trying to figure out why the big vs small model can be useful - here's one example.

In the small model - the controller designer does the best job s/he can to optimize the performance and reliability of the individual SSD. That's all which can be done. Because it's sold as a single unit and has to work on its own.

When another designer comes along and puts a bunch of these small SSDs into an array (for example in a COTS rackmount) then the small architecture SSDs become a component inside someone else's (usually small but next level up) controller. (I call this next level small - because most RAID architectures are optimized around 10 drives rather than 1,000 - which takes us into cloud territory).

Before I lose you - the disadvantage of stacking up arrays of small architecture SSDs (compared to large architecture) is this. In theory and in practice large architecture SSDs are faster and more reliable than similarly sized arrays assembled from small SSDs.

Despite that - the small architecture arrays can sometimes cost less - because the small SSDs work in other applications too (which the large SSDs can't address) therefore bringing the unit costs and risks down.

big SSD controller architecture is not the same as "big topology" or "big data"

SSD topology is independent of SSD controller architecture. For example Big Topology - can be
  • having hundreds of PCIe SSD cards in clusters - as supported at the chip level by PLX Technology. But PLX's PCIe fabric chips are used by leading SSD companies in both the small and large controller architecture camps.
Big Data can be implemented around either type of SSD controller. But I showed earlier in this article why big data can cost more if it is implemented with small controller architecture modules - due to the inefficiencies in using raw flash memory resources.

If you're not very technical and still trying to grasp the important differences I've defined in this article - here's an analogy you may find useful. A collection of short stories doesn't work the same way as a novel.

I'll leave it there for now.
selected reader comments and responses to the above article

from Woody Hutsell - I think there are subsequently two types of big implementations, ones where RAID and cache are centralized and ones where RAID and cache are decentralized.

The RamSan-500 is an example of a big implementation with centralized RAID and cache, whereas the RamSan-630 is an example of one with distributed RAID.

The benefit of a distributed RAID solution is that it keeps a RAID controller from becoming the system bottleneck and offers a huge boost to performance. The disadvantage to the distributed RAID is that the device ends up having different reliability characteristics and architecture implications than the centralized version people are used to (from typical storage arrays).

auto tiering SSDs
the SSD Heresies
SSD controller chips
SSD Jargon Explained
SSD Reliability Papers
SSDs - the big market picture
this way to the petabyte SSD
SSD's past phantom demons
Imprinting the brain of the SSD
Surviving SSD sudden power loss
RAM Cache Ratios in flash SSDs
a new way of looking at Enterprise SSDs
flash SSD capacity - the iceberg syndrome
Are MLC SSDs Safe in Enterprise Apps?
the Problem with Write IOPS - in flash SSDs
SSD Myths and Legends - "write endurance"
Data Integrity Challenges in flash SSD Design
Market Trends in the Rackmount SSD Market
an introduction to SSD Data Recovery Concepts
the future of enterprise data storage (circa 2020)
RAM SSDs versus Flash SSDs - which is Best?
principles of bad block management in flash SSDs
How Bad is - Choosing the Wrong SSD Supplier?
Can you trust flash SSD performance benchmarks?
Clarifying SSD Pricing - where does all the money go?
Why Consumers Can Expect More Flaky Flash SSDs!
how will Memory Channel flash storage impact PCIe SSDs?
Calling for an End to Unrealistic SSD vs HDD IOPS Comparisons
Surviving SSD sudden power loss
Why should you care what happens in an SSD when the power goes down?

This important design feature - which barely rates a mention in most SSD datasheets and press releases - has a strong impact on SSD data integrity and operational reliability.

This article will help you understand why some SSDs (which work perfectly well in one type of application) might fail in others... even when the changes in the operational environment appear to be negligible.
SSD power down architectures and characteristics
If you thought endurance was the end of the SSD reliability story - think again.

Megabyte was worried about the random access time in his new bulk storage multi-petabyte SSD solution.
7 years later...

Layer-Interleaved RAID in 3D nand
stretches reach of big controller architecture
Editor:- In August 2018 - I saw a research paper which got me thinking that maybe I had to widen my definition of where the dividing line sits between big and small SSD controller architecture.

The trigger was LI-RAID which leverages a system view of layers in tall 3D nand. In reality it's not so much a change of definition - because the ideas of "big" and "small" remain the same. But it's an example which doesn't exactly fit into the "number of chips" notion which I used in my original 2D nand article.

Read more about this in 3d nand and new dimensions in SSD controller architecture (research exploits layer based differences)

This is for those of you who know in your bones that to get the SSD you want - you need to design your own controller.
aspects of SSD design - processors used in SSDs

Anything you could do before in a single server - can now be done on tens, hundreds or thousands of servers - without having to throw away your legacy applications.
the Top SSD Companies - 2014 Q3

RAID is still mostly small controller architecture too
When it comes to hard drives, RAID (the traditional kind) was better than what it replaced in the 1980s. But if you started again with the internet connectivity and processors we have today and designed big arrays of disks from first principles, you wouldn't end up with just the RAID systems you see today.

That's because RAID is small controller architecture too.

It optimizes over maybe 5 to 10 disks.

If, instead, you optimize reliability etc over 100 disks then you get better efficiency. That's why Google designed its own disk management system. The cost savings at the million plus disk level are worthwhile.
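
The efficiency point can be illustrated with some simple arithmetic. The sketch below is my own illustration rather than anything taken from a real product. It assumes an idealized redundancy scheme in which a group can survive the loss of as many disks as it has redundancy units, and the group sizes (8+2 and 92+8) are hypothetical numbers picked only to contrast a 10 disk group with a 100 disk group. It ignores rebuild traffic, processing cost and correlated failures.

```python
from fractions import Fraction


def redundancy_overhead(data_units: int, redundancy_units: int) -> Fraction:
    """Return the fraction of raw capacity spent on redundancy in one group."""
    total_units = data_units + redundancy_units
    return Fraction(redundancy_units, total_units)


# Hypothetical layouts, chosen only for illustration (not vendor figures).
# Small group: 10 disks, 2 of them redundancy (dual parity RAID style).
small_group = redundancy_overhead(data_units=8, redundancy_units=2)
# Big group: 100 disks, 8 of them redundancy (idealized erasure code style).
big_group = redundancy_overhead(data_units=92, redundancy_units=8)

print(f"10 disk group:  {float(small_group):.0%} of raw capacity is redundancy, survives 2 disk losses")
print(f"100 disk group: {float(big_group):.0%} of raw capacity is redundancy, survives 8 disk losses")
```

Under those assumptions the larger group spends a smaller share of its raw capacity on redundancy while tolerating more simultaneous disk losses. That is the kind of gain which becomes available when the design optimizes over a bigger population.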

The cloud thinking alternative to RAID arrays is also what you see in systems from Amplidata and Pivot3.

The same principles apply to SSDs. SandForce's original controllers optimized around a population of less than 10 flash chips whereas Violin optimized around 100 plus.


more SSD design related articles

Adaptive R/W and DSP in flash SSD IP

Symmetries and consequences in SSD design

factors which influence and limit flash SSD performance

Do you remember the movie Back to the Future?

If you could go back in time (not to the 1950s but to the 1970s, 1980s, 1990s and early 2000s) and take with you (not the idea of skateboarding) but instead a truckload of modern memory chips and SSDs (along with compatible adapters) what impact would that have?

That analysis leads onto the consideration of "back from the future" memory ideas and what their viability could be today.
are we ready for infinitely faster RAM?

flash SSD capacity - the iceberg syndrome
Have you ever wondered how the amount of flash inside a flash SSD compares to the capacity shown on the invoice?

What you see isn't always what you get.
There can be huge variations in different designs as vendors leverage invisible internal capacity to tweak key performance and reliability parameters.
. is published by ACSL