|
by Zsolt Kerekes,
editor - StorageSearch.com
- June 24, 2011
The memory chip count ceiling (and/or layer count ceiling - see sidebar)
around which the SSD controller IP is optimized predetermines the
efficiency of achieving system-wide goals like cost, performance and
reliability. |
|
All enterprise flash SSDs (and those built using other types of
non-volatile memory too) can be said to belong to one of two fundamental
architectural groups which I call "big" and "small".
(Who said my SSD articles were complicated?)
For simplicity you can think about the dividing line between these as follows.
- small SSD architecture - the controller is designed to work with a
small number of flash memory chips - 10 or fewer.
It will work with more too - but the die is already cast: everything else
follows on from the "small" architecture decision.
- big SSD architecture - the SSD design has been optimized for a
typical installed set of maybe 1,000 memory chips.
It could be a lot fewer - maybe even as few as a hundred - or it can be more
(many thousands of chips).
Once again - the "big" architecture design decision
sets the pattern for everything else about the SSD.
In big designs it's common for the vendors to talk about having multiple
SSD controllers. In such designs the controllers don't have to be identical.
It's not unusual for there to be a hierarchy of controllers - with
some having more responsibility for low level data movement and others (with
more awareness of the total population of flash) doing different things with
that big picture intelligence.
What are the characteristics of the 2 SSD types above?
physical size
Not surprisingly, "small" architecture SSD controllers are the essential
building blocks for physically small SSDs - ranging from SSDs on a chip up
through to 2.5" and 3.5" form factors.
Due to their small footprint they are the most flexible to deploy. And they
don't stay in those slots as singles. They can also appear in arrays on cards
like PCIe SSDs and in rackmount SSD systems.
Meanwhile - most - but not all - large architecture SSDs
started life in rackmounts and (some) worked their way down into PCIe SSDs too.
The
table below gives you some examples of the best known companies in each of these
segments. |
SSD vendor examples: big vs small flash controller architecture |
small | Seagate (SandForce acquisition, 1st and 2nd generation controllers), and most controllers used in consumer, industrial and military SSDs |
big | Baidu's SDF Software-Defined Flash (pdf), BiTMICRO (TALINO), EMC (DSSD and XtremIO acquisitions), HGST (Skyera and Virident Systems acquisitions), IBM (Texas Memory Systems acquisition), Mangstor, Radian, SanDisk (Fusion-io acquisition), Violin Memory
| |
It's interesting to see that companies which share the "big" set here fall
into disjoint sets when you view them from the legacy vs new dynasty
classification.
And they fall into different sets again when you look at the RAM flash cache
classifications - where SandForce and Fusion-io come together in the "skinny"
category.
Understanding those differences tells you everything you
need to know about the strengths and weaknesses of these products in
different markets and applications.
Of course most of you don't know
enough about computer architecture, performance optimization, flash memory
physics, reliability and the SSD market to model how those factors interact -
and there's no reason why you should. But without understanding those things
you take a big gamble
every time you choose an SSD supplier or company to invest in.
Fortunately
- the law of large numbers means that if enough people make similar choices -
then those become the
right decision - because the market is a good filter to all new design
projects and business plans.
But if you're still trying to figure out
why the big vs small model can be useful - here's one example.
In the small model the controller designer does the best job s/he can to
optimize the performance and reliability of the individual SSD. That's all that
can be done, because it's sold as a single unit and has to work on its own.
When
another designer comes along and puts a bunch of these small SSDs into an array
(for example in a COTS
rackmount) then the small architecture SSDs become a component inside
someone else's (usually small but next level up) controller. (I call this next
level small - because most RAID
architectures are optimized around 10 drives rather than 1,000 - which takes us
into cloud territory).
Before I lose you - the disadvantage of stacking up arrays of small
architecture SSDs (compared to large architecture) is this.
In theory and in practice, large architecture SSDs are faster and
more reliable than similarly sized arrays assembled from small SSDs.
Despite
that - the small architecture arrays can sometimes cost less - because the small
SSDs work in other applications too (which the large SSDs can't address)
therefore bringing the unit costs and risks down.
big SSD controller architecture is not the same as "big topology" or
"big data"
SSD topology is independent of SSD controller architecture. For example,
Big Topology can mean having hundreds of PCIe SSD cards in clusters - as
supported at the chip level by PLX Technology.
But PLX's PCIe fabric chips are used by leading SSD companies in both the small
and large controller architecture camps.
Big Data can be implemented
around either type of SSD controller. But I've shown earlier in this article -
why big data can cost more if it is implemented by small controller
architecture modules - due to the inefficiencies in using raw flash memory
resources.
If you're not very technical and still trying to grasp the
important differences I've defined in this article - here's an analogy you may
find useful. A collection of short stories doesn't work the same way as a novel.
I'll
leave it there for now. | |
.. |
selected reader comments
and responses to the above article |
from
Woody Hutsell
- I think there are subsequently two types of big implementations, ones where
RAID and
cache are
centralized and ones where RAID and cache are decentralized.
The RamSan-500 is an example of a big implementation with centralized RAID
and cache, whereas the RamSan-630 is an example of one with distributed RAID.
The benefit of a distributed RAID
solution is that it keeps a RAID controller from becoming the system bottleneck
and offers a huge boost to performance. The disadvantage to the distributed
RAID is that the device ends up having different reliability characteristics and
architecture implications than the centralized version people are used to (from
typical storage arrays).
| |
Surviving SSD
sudden power loss |
Why should you care
what happens in an SSD when the power goes down?
This important design
feature - which barely rates a mention in most SSD datasheets and press releases
- has a strong impact on
SSD data integrity
and operational
reliability.
This article will help you understand why some SSDs (which work perfectly
well in one type of application) might fail in others... even when the
changes in the operational environment appear to be negligible. |
7 years
later...
Layer-Interleaved RAID in 3D nand stretches reach of
big controller architecture
|
Editor:- In August 2018 - I saw a research paper
which got me thinking that maybe I had to widen my definition of where the
dividing line sits between big and small SSD controller architecture.
The trigger was LI-RAID, which leverages a system view of layers in tall 3D
nand. In reality it's not so much a change of definition - because the ideas
of "big" and "small" remain the same. But it's an example which doesn't
have to fit exactly into the "number of chips" notion which I used in
my original 2D nand article.
Read more about this in
3d nand and new dimensions in SSD controller architecture (research exploits
layer based differences) | | |
Anything you could do
before in a single server - can now be done on tens, hundreds or thousands of
servers - without having to throw away your legacy applications.
|
the Top SSD Companies -
2014 Q3 | | |
. |
RAID is
still mostly small controller architecture too |
When it comes to hard drives, RAID (the traditional kind) was better than
what it replaced in the 1980s. But if you started again with the internet
connectivity and processors we have today, and designed big arrays of disks
from first principles, you wouldn't end up with just the RAID systems you
see today.
That's because RAID is small controller
architecture too.
It optimizes over maybe 5 to 10 disks.
If you instead optimize reliability and so on over 100 disks, you get better
efficiency. That's why Google designed its own disk management system. The cost
savings at the million plus disk level are worthwhile.
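To put rough numbers on that efficiency argument, here is a minimal sketch in Python. The group sizes and redundancy counts are illustrative assumptions of mine (they don't describe Google's system or any product named in this article), and the function name `usable_fraction` is hypothetical. It only compares how much raw capacity is left for data when redundancy is spread over a small group versus a wide one.

```python
def usable_fraction(group_size: int, redundancy_units: int) -> float:
    """Fraction of raw capacity left for user data when every protection
    group of `group_size` devices gives up `redundancy_units` devices'
    worth of space to parity or other redundancy."""
    if not 0 <= redundancy_units < group_size:
        raise ValueError("redundancy_units must be >= 0 and < group_size")
    return (group_size - redundancy_units) / group_size


# Assumed example: a traditional small group of 10 disks with 2 disks'
# worth of parity (dual-parity style protection).
small_group = usable_fraction(group_size=10, redundancy_units=2)

# Assumed example: a wide group of 100 disks with 4 disks' worth of
# redundancy. It survives more simultaneous failures per group, yet
# spends a smaller share of raw capacity on protection.
wide_group = usable_fraction(group_size=100, redundancy_units=4)

print(f"small group usable capacity: {small_group:.0%}")
print(f"wide group usable capacity:  {wide_group:.0%}")
```

With these assumed numbers the small group keeps 80% of raw capacity for data and the wide group keeps 96%. Capacity is only one side of it: a wider group also changes rebuild traffic and failure behavior, which this sketch doesn't model.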
The cloud
thinking alternative to RAID arrays is also what you see in systems from
Amplidata and
Pivot3.
The
same principles apply to SSDs.
SandForce's original
controllers optimized around a population of less than 10 flash chips
whereas
Violin optimized around
100 plus.
| | |
Do you remember the movie
Back to the Future?
If you could go back in time (not to the 1950s but
to the 1970s, 1980s, 1990s and early 2000s) and take with you not the idea
of skateboarding but a truckload of modern memory chips and SSDs
(along with compatible adapters) - what impact would that have?
That
analysis leads onto the consideration of "back from the future"
memory ideas and what their viability could be today. |
are we ready for
infinitely faster RAM?
| | |