Memory Channel Storage SSDs
will the new concept fly? - should you book a seat yet?
by Zsolt Kerekes, editor - April 29, 2013
Recently we've
been hearing more details about the implementation plans for a new type of
fast enterprise SSD.
Memory channel storage was devised by Diablo Technologies.
While in stealth mode Diablo had hinted it was planning to use its
assets and expertise in DRAM interface chips and datacomms to design a new
type of enterprise SSD which would couple the capacity cost advantages of
flash with the low latency characteristics of a RAM "bus" to
create a new type of fast storage device which would be an order of magnitude
faster at accelerating apps than
good old PCIe SSDs.
To make a product like this viable will require world leading flash
controller and flash management skills - with particular emphasis on
experience in enterprise SSD markets.
Diablo last week announced that
it has chosen SMART
Storage Systems
to exclusively supply the necessary flash SSD expertise.
The 2
companies will work together to bring the first products to market in the coming
year. The products will be sold by SMART.
At this point you may well
ask...
What does a
datacomms
rooted company like Diablo (because that's where the founders came from)
know about SSDs?
We'll have to see as details emerge later. However,
in past conversations I've had with people in enterprise SSD array pioneer
Texas Memory Systems
(acquired by IBM) - TMS
had characterized their big controller SSD architecture as being primarily "a
router" which could move data reliably from a SAN into a memory array with
low latency, fast throughput and good
data integrity.
And another place we've been seeing datacomms concepts entering the
SSD market recently is in the field of adaptive DSP and ECC techniques - most
explicitly in the nomenclature of controller and IP products from
DensBits - whose "memory
modem" approach uses the philosophical idea that charge based R/W errors in
flash cells can be modeled and managed in similar ways to signal noise in
comms circuits.
Who's SMART?
(Diablo's flash partner).
If you're not already familiar with SMART -
a quick summary is to say they are a
Top 20 SSD Company,
which is a leading supplier in the
SAS SSD market, and has
its own adaptive
R/W flash ECC IP and its own enterprise SSD controllers. SMART doesn't
have a PCIe SSD product line. So you might think that was a strange partner
choice - but I'll be returning to this question later in this article.
I
spoke recently about the market opportunity and technical challenges posed by
memory channel SSDs with John Scaramuzzo,
President of SMART.
A small part of the article which follows here -
is based on that discussion. But most of the article below is based on my own
thoughts - inspired by the kind of questions which I know many of you would
ask. These speculations are based on my own analysis of the
SSD market.
what's different between this and past flash SSDs in DRAM packages?
John
and I briefly discussed why you shouldn't confuse memory channel SSDs with past
attempts by memory makers to place flash SSDs in DRAM style DIMM packages.
The
two best known companies who have gone down that route are
Viking and
Micron - in whose DRAM
compatible flash products the flash is used as a backup for DRAM which is
triggered into operation by a
drop in the
power supply. There have also been DIMM flash SSDs which were mounted
on the motherboard in a DRAM package - but where the data was connected via
another interface such as SATA
- and not through the DRAM bus.
Another phrase which SMART likes to use
in relation to the Diablo technology is "ultra low latency" (ULL) SSD. John
told me the purpose of the ull SSDs was to provide latency an order of
magnitude lower than existing PCIe SSDs.
ULL SSDs - PCIe SSD killers or collaborative co-workers?
One question I asked gave me
a picture of how the PCIe SSD market might look - even if we rushed ahead and
assumed that ULL SSDs were successful beyond the wildest dreams of
their creators (which as we'll see later - stretches a lot of assumptions).
I
asked - how many inches in physical distance can the memory channel SSD bus be
routed? And specifically - can it hop off the motherboard?
John said -
no that's not the idea. The current concept is to operate the same as a DRAM
bus.
So I could immediately see 2 levels of segmentation and
co-existence with PCIe SSDs - which can occur - even if we assume that every
fast server has memory channel SSDs installed.
- 1st level of market co-existence:- is high performance server clusters -
where PCIe SSDs effectively provide the next level of SAN-like data connectivity
and fault
tolerance functionality using PCIe fabric technology.
PCIe fabric
- sharing data between different physical servers at low latency - is
something which PCIe SSDs can do which DRAM memory technology can't (yet).
One
way to think about this is - what's the best way for one memory channel storage
accelerated server to talk to another? The answer is via a switched PCIe SSD
fabric.
- 2nd level of market co-existence - is entry level to mid-range server
performance.
For example - in cost sensitive server markets like
iSCSI SSDs.
That's
because the memory management technology needed to implement a Diablo-style ull
SSD will be more expensive to implement than entry level fast-enough PCIe SSDs -
which don't need such fast SSD
controllers and don't need RAM caches or nvRAM caches - each of which
escalates upwards the floor price of the ull SSD.
The cheapest
fast-enough PCIe SSDs will always be cheaper than the cheapest memory channel
SSDs. That's my analysis of what would happen - even if the new
technology was wildly successful.
John also confirmed that he saw these
as different markets. And for example - at some future date - when there's an
adequate set of industry standards for PCIe SSDs - you can still expect that
SMART might bring to market its own family of PCIe SSDs.
For many of
you - that's the key message from this article - and you don't need to read any
further. I'll say it simply here...
Memory channel SSDs will not
remove the need or market desirability of PCIe SSDs.
In extreme
cases it's possible to imagine servers in which both types of SSDs are operating
(in different functional roles) at the same time. And in my view - the market
opportunity for PCIe SSDs will remain much bigger - because they can be adapted
more economically for a wider spectrum of performance.
This brings us
to the next 2 big questions and problems about bringing ull SSDs to market -
flash technology and controller technology
flash technology and ull SSDs
Without giving too many details away John
told me that getting the memory technology optimized for the low latency and
high IOPS of ull SSDs was going to be a tough challenge.
If you look at
the endurance
problem for a very fast SSD of this type - the demand pushes you in the
direction of SLC like characteristics.
The latency can only be met by
true DRAM or nvRAM like memory.
John told me - they weren't in this
venture with Diablo only to design a product which worked. The product had to be
affordable too. Therefore it has to be flash rich.
If I step back and
speculate what performance might be needed in this new type of SSD - it's way
beyond what SMART implements in its SAS SSDs.
On the other hand -
adaptive R/W flash IP (of the type which SMART has) can be tweaked to deliver
SLC-like performance.
And if you look at the
rackmount SSD market
- Skyera shows you
can get very high system performance and reliability by using adaptive R/W
alongside nvRAM cache.
In fact - if you look at a technology roadmap
for future flash memory and SSDs - it would be a mistake to launch the new
memory channel SSDs using an SLC based implementation - because apart from the
cost penalties - SLC doesn't have a geometry scalable future. When you shrink
it - it picks up all the baggage of problems of MLC. So MLC and adaptive R/W is
the only way to go. (That's true for
industrial SSDs
too - and not just in the
enterprise flash
market.)
My guess is that Diablo must have realized that when they
were looking at the long range memory future. And that's one of the reasons
they picked SMART.
SMART is currently one of the leading companies
which has both adaptive R/W technology and enterprise SSD market
experience but doesn't yet have the inconvenient distraction of having a
PCIe SSD product line.
SMART's controller performance problem
SMART's
current SSD controller technology (as publicly revealed) can simply be
described as fast enough for the purposes of
2.5" SSDs - which
it was designed for - but nowhere near fast enough compared say to fast PCIe
SSDs such as those from
Fusion-io,
Virident or
Micron.
So
how's SMART going to fill the controller gap to deliver fast ull SSDs?
This
is my speculation here and not based on anything said to me by anyone in SMART.
But there are several ways you can look at this problem....
Scaling and bundling the technology which already exists.
For example
OCZ was the first company
which demonstrated that you can design very fast throughput PCIe SSDs by using
arrays of controllers which had been designed for 2.5" SSDs. In OCZ's case
their original PCIe SSDs included arrays of
SandForce controllers -
and are now available using arrays of their own OCZ controllers.
This
approach (using multiple Guardian controllers) would be feasible for SMART to
use in new ull SSD too. It would need a
fat RAM cache
to deal with latency and endurance peaks. But it's not the most
efficient solution.
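As a rough illustration of the "array of controllers" idea, the sketch below stripes logical block addresses across several small controllers so that sequential traffic is shared between them. It is not based on anything published by OCZ or SMART. The function name `route_block`, the controller count and the stripe size are all assumptions invented for the example.

```python
def route_block(lba, num_controllers=4, stripe_blocks=8):
    """Map a logical block address to (controller index, local block address).

    Hypothetical round-robin striping: consecutive stripes of
    `stripe_blocks` blocks are handed to consecutive controllers.
    The defaults are illustrative values, not vendor figures.
    """
    if lba < 0:
        raise ValueError("lba must be non-negative")
    stripe = lba // stripe_blocks              # which stripe the block falls in
    offset = lba % stripe_blocks               # position inside that stripe
    controller = stripe % num_controllers      # stripes rotate across controllers
    local_stripe = stripe // num_controllers   # stripe index as seen by that controller
    return controller, local_stripe * stripe_blocks + offset


if __name__ == "__main__":
    for lba in (0, 7, 8, 31, 32):
        print(lba, route_block(lba))
```

Striping like this raises aggregate throughput, but each individual request still pays the latency of one small controller - which is one way to see why a fat RAM cache would be needed in front of such an array.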
When you get into the fast enterprise SSD club you start seeing
controller designs which have similar
characteristics.
These
big controller
designs tend to split the controller design into different partitions - an
intelligence and host / apps leaning side and a distributed flash management
I/O side.
My guess is that new ull products from SMART could be
eASIC or FPGA heavy implementations rather than simply arrays of their Guardian
controllers. That's because when you own the algorithms and IP - you aren't tied
to past implementations of how you packaged it. You can implement the functions
in any way that makes best sense for the market you're going for.
It
doesn't really matter which way SMART does it as long as it's fast enough and
is affordable.
The host interface and high end memory controller /
CPU IP side of things is where it would be natural to assume that Diablo will
be contributing their own IP.
how successful will memory channel SSDs / ull SSDs be?
This is where we can return to some important
strategic speculation.
First of all - in looking at what problem these
new SSDs might solve which isn't being solved today - an important thing to
realize is that the competitive environment isn't the same as when Fusion-io
started shipping PCIe SSDs back in
2007.
The
new ull SSDs have to fight for a viable toehold in an already pre-existing and
very sophisticated market
for enterprise SSDs.
And even though the state of SSD software
leaves many
questions open to debate, the importance of
software in the SSD market
as a sales accelerator is well established and well understood - even if mostly
by the high sums that such companies have been
acquired for -
relative to their modest revenue achievements as standalone ISVs.
You
can get some interesting insights into this by using the mental trick I call
SSD market
boundaries analysis and rephrasing some of the assumptions about the new
technology to get these sanity checks.
- if Fusion-io was
the flash technology partner for Diablo (instead of SMART - and forgetting about
the very material facts that
FIO doesn't admit
publicly to having the same flash technology and also has a
very different approach to
everyone else when it comes to SSD controller design) what could Fusion-io
deliver with a memory channel connected SSD that it can't already deliver with a
PCIe SSD?
FIO has demonstrated in market shipping products that
shrinking down latency requires more than hardware design.
You
need to remove layers of hard drive related junk which is hard coded into many
layers of enterprise systems software. But when you do this - sometimes by
introducing new side-stepping APIs - you can get apps performance which is 10x
faster than raw speedup which you get from simply running legacy software faster
on SSDs.
This is something which academic researchers at the
Non Volatile Systems Lab UCSD have also
written about.
Fusion-io is already well down this learning curve
- and if they had the benefits of a lower latency to memory technology - my
guess is they would be best placed to further exploit it. Having said that,
however, until we know what the Diablo/SMART memory channel controller latency
delivers - we can't be sure how it will compare to what FIO already delivers
with the worse latency of PCIe but better latency of a flash translation layer
in which the apps server is the same CPU that manages the flash.
- if SMART were
simply to come to market with a new fast PCIe SSD (instead of a new type of
SSD) it would struggle and probably fail to establish a market for such a
product. If and when SMART does introduce its own PCIe SSD my guess is it would
be a fast-enough product optimized for cost and efficiency rather than speed.
But that's another story.
Without a software base - simply having a
lower latency PCIe SSD wouldn't get you many customers today. The
memory channel SSD concept - if the product is fast enough and affordable - will
find a market if it can exceed the kind of server cost / performance which
you can only get today by using Fusion-io's PCIe SSDs with Fusion-io's APIs -
but in a business model which is independent of Fusion-io.
One of the
problems for the Diablo / SMART ull SSD market is the lack of a sophisticated
SSD ecosystem. That's a problem which faces any new type of SSD. My guess is
that a critical part of Diablo's business model will have to be invested in
showing how the new SSD type can side step those difficult "SSD" type
integration questions - by delivering a product which looks to applications
transparently like a bigger cheaper type of DRAM.
The idea of selling
flash as a memory tier has long been on the agenda of leading PCIe SSD
companies like
Fusion-io and
Virident - but
the physics of flash and the characteristics of the interface have meant
that it's not been possible to do this without a lot of software being
introduced into the mix. And the enterprise market is traditionally cautious
about relying on single source solutions.
So - who are the first
customers for these new ull SSDs likely to be?
Until the first
products come to market we have to guess at their possible characteristics.
If you've been following my flow until now - my guess is the first
memory channel SSDs will need to be fast SSDs with apps acceleration in a
similar class to using Fusion-io PCIe SSDs (and APIs).
But the memory
channel SSDs will have a completely different software architecture. The
possibilities range from a very light software support set, in which the new SSDs
are initialized in the server to look like a big RAM, to some SSD-like
software in which the modules look like a traditional SAS drive.
Or
the details may be completely different.
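To make the contrast between those two software models concrete, here is a minimal sketch which uses an ordinary temporary file as a stand-in for the storage device. It says nothing about how Diablo or SMART will really present their modules. The function names and the 4096 byte block size are assumptions for illustration only. One function updates a byte the drive-like way (read a block, patch it, write it back) and the other updates a byte the RAM-like way (map the file and store directly).

```python
import mmap
import os
import tempfile

BLOCK = 4096  # assumed block size for the drive-like path


def block_style_update(path, offset, value):
    """Drive-like access: read a whole block, patch one byte, write it back."""
    start = (offset // BLOCK) * BLOCK
    with open(path, "r+b") as f:
        f.seek(start)
        block = bytearray(f.read(BLOCK))
        block[offset - start] = value
        f.seek(start)
        f.write(block)


def memory_style_update(path, offset, value):
    """RAM-like access: map the file and store a single byte in place."""
    with open(path, "r+b") as f:
        with mmap.mmap(f.fileno(), 0) as m:
            m[offset] = value
            m.flush()


def demo():
    """Apply one update of each style to a scratch file and read both back."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, bytes(2 * BLOCK))  # two zero-filled blocks
        os.close(fd)
        block_style_update(path, 10, 0xAA)
        memory_style_update(path, BLOCK + 5, 0xBB)
        with open(path, "rb") as f:
            data = f.read()
        return data[10], data[BLOCK + 5]
    finally:
        os.remove(path)


if __name__ == "__main__":
    print(demo())
```

Both paths end up with the same data. The difference is how much software sits between the application and the byte it wants to change - which is the point at issue in the speculation above.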
Who would be the ideal
customers for such products? - It's going to be very tech savvy customers for
whom pushing the boundaries of performance is worth the risk of investing in new
technology because they use or sell thousands of servers. So it will be
dark matter SSD users,
SSD appliance makers and traditional server oems.
conclusion
It's
always fun to speculate about the likely impact of new SSD technologies and
changes in market directions.
As the above article is mostly based on
speculating about a single new product line it's almost certain that in many
detailed respects it will be wrong.
Nevertheless I think that the
safe conclusion - which is almost where I started - is that even if memory
channel / ull SSDs are successful in the market there are many ways to see
how in the next 2-3 years they can still coexist with several different types
of PCIe SSDs.