Texas Memory Systems
- talks about the design of fast high availability SSDs by Zsolt Kerekes,
editor - StorageSearch.com
- December 8, 2011 |
I've been talking to
TMS every month for
over 10 years - and I've been writing about their memory appliances since the
early 1990s - so you might think that I would have run out of things to say by
now.
But when they launched their first tier 1 storage SSD
this month (see
more details below) I spoke to
Holly Frost, CEO
and Dan Scheel,
President of Texas Memory Systems, and Erik Eyberg, Sr. Analyst -
about their new SSD, what they think about what's going on in the SSD
market, and the philosophy that steers the design of their SSDs.
I
started by saying that I've been writing about their memory systems since the
early 1990s - I think the earliest model I knew about was the SAM 2000 - which
was a multi-ported shared access memory system - which was used to
interconnect different computers at memory speeds.
Holly Frost said -
in some ways you could say they were doing the same sort of things now -
enabling servers to share data at memory speeds. The difference now was
they did things in chips - which used to take up whole boards - and the data
speeds today are thousands of times faster.
What's new about TMS's
new SSD?
You can see the details elsewhere on this page - but Holly
Frost told me that his policy is that a new SSD has to include at least 2 new
things to make it worth launching.
In this case - the 2 things are:-
- the reliability architecture - with no single point of failure, and
- hot swappable flash modules. (This is a new feature for TMS - but not for
the industry.)
will new
RamSan rattle Violin? |
Editor:- December 6, 2011 -
Texas Memory Systems
today
announced
imminent availability of the
RamSan-720
- a 4 port (FC/IB) 1U
rackmount SSD
which provides 10TB of usable 2D (FPGA implemented)
RAID protected and hot
swappable - SLC
capacity with 100/25 microseconds R/W latency (with all protections in
place) delivering 400K IOPS (4KB), 5GB/s throughput - with no single point of
failure (at $20K/TB approx list).
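As a rough sanity check on the quoted figures, the arithmetic below works out the throughput implied by the 4KB IOPS number and the approximate list price of a full unit. It is only a back-of-envelope sketch: it assumes "4KB" means 4,096 bytes, and the function names are made up for illustration. The result (about 1.6GB/s at 4KB) is lower than the 5GB/s headline throughput, so it seems likely the throughput figure refers to larger transfers - but that is an inference, not something TMS stated.

```python
# Back-of-envelope arithmetic using only the figures quoted in the announcement.
IOPS = 400_000              # quoted 4KB IOPS
BLOCK_BYTES = 4 * 1024      # assumption: "4KB" means 4,096 bytes
USABLE_TB = 10              # quoted usable capacity
PRICE_PER_TB_USD = 20_000   # quoted approximate list price per TB


def small_block_throughput_gb_s(iops: int, block_bytes: int) -> float:
    """Throughput implied by an IOPS figure, in decimal GB/s."""
    return iops * block_bytes / 1e9


def approx_list_price_usd(usable_tb: float, price_per_tb: float) -> float:
    """Approximate list price of a unit at a given price per usable TB."""
    return usable_tb * price_per_tb


if __name__ == "__main__":
    gb_s = small_block_throughput_gb_s(IOPS, BLOCK_BYTES)
    price = approx_list_price_usd(USABLE_TB, PRICE_PER_TB_USD)
    print(f"4KB IOPS imply about {gb_s:.2f} GB/s")
    print(f"Approximate list price for {USABLE_TB}TB: ${price:,.0f}")
```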
The new SSD uses a
regular RAM cache
flash architecture which in the event of
sudden power
loss has an ultra reliable battery array which holds up the SSD power for 30
seconds while automatically backing
up all data in flight and translation tables to nonvolatile flash storage. On
power up - the SSD is ready for full speed operation in less than a minute.
Aimed
at HA tier 1 storage markets - the RamSan-720 consumes only 300-400 W - which
makes it practical for high end users to install nearly 1/2
petabyte of SSD
storage in a single cabinet - without having to worry about the secondary
reliability and
data integrity
risks which can arise from high temperature build-ups in such
enclosures.
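The "nearly 1/2 petabyte" figure can be checked with simple arithmetic. The sketch below assumes a standard 42U cabinet filled entirely with 1U RamSan-720 units (the cabinet size is my assumption, not a TMS figure) and uses the quoted 10TB and 300-400 W per unit.

```python
# Cabinet-level capacity and power, from the per-unit figures quoted above.
RACK_UNITS = 42               # assumption: 42U cabinet, every slot holds a 1U RamSan-720
TB_PER_UNIT = 10              # quoted usable capacity per unit
WATTS_LOW, WATTS_HIGH = 300, 400   # quoted power range per unit


def cabinet_capacity_tb(units: int, tb_per_unit: float) -> float:
    """Total usable capacity of a cabinet in TB."""
    return units * tb_per_unit


def cabinet_power_kw(units: int, watts_per_unit: float) -> float:
    """Total power draw of a cabinet in kW."""
    return units * watts_per_unit / 1000


if __name__ == "__main__":
    capacity = cabinet_capacity_tb(RACK_UNITS, TB_PER_UNIT)
    low = cabinet_power_kw(RACK_UNITS, WATTS_LOW)
    high = cabinet_power_kw(RACK_UNITS, WATTS_HIGH)
    print(f"Capacity: {capacity:.0f} TB ({capacity / 1000:.2f} PB)")
    print(f"Power: {low:.1f} to {high:.1f} kW")
```

Under those assumptions a full cabinet holds 420TB, which squares with "nearly 1/2 petabyte", at somewhere between 12.6 and 16.8 kW.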
Editor's comments:- I've
been talking to TMS every month for over 10 years - and I've been writing
about their memory appliances since the early 1990s - so you might think that I
would have run out of things to say by now. When I saw the preliminary specs
for the new RS-720 - the features which jumped out at me were:-
- the low R/W latency for this class of no-SPOF product - which is 2x as
good as the next fastest product I know - the 6000 series from
Violin - and several
times faster than some other tier 1 SSD vendors such as
Kaminario and
Huawei Symantec
- the high storage density - over 3x better than
Violin delivers in SLC -
and close to the usable RAIDed capacity that a
Fusion-io 1U server
can deliver in MLC when using Octal.
A few days ago I spoke to
Holly Frost, CEO
and Dan Scheel,
President of Texas Memory Systems about their new SSD, what they think about
what's going on in the SSD market, and the philosophy that steers the design
of their SSDs. In an hour-long discussion I learned enough new stuff to write
several new articles. So instead of condensing it down here into a couple of
bullet points - I'm going to give you the benefit of what I learned in a
new article tomorrow called - "StorageSearch talks SSD with Holly Frost."
Going
back to my headline - will new RamSan rattle Violin? |
I'm sure that Violin would
say that this simply validates what they are doing (and shipping) already - and
that the enterprise SSD market is big enough for all vendors in this category
to keep
growing at a healthy clip. It makes you wonder how much a company like TMS
might be worth too...
why do people need a new
SSD like this one? |
Dan Scheel said that in the
past - customers had used the company's SSDs to accelerate a single function in
their company.
SSDs had been expensive - and fast SSDs were used in
apps where the raw speed translated directly to a business benefit like
greater revenue. Managing the fault tolerant architecture in these systems was
done using a variety of (slow) methods - but it was more important to keep the
apps running fast - when running - than to guarantee that fail-over would be
just as fast. If there was a failure - and a reduction in speed while failover
occurred - the impact was to a single isolated business function.
The
high availability features in the RamSan-720 mean it can go into any kind of
application that a customer might have - and because it is reliable (with
overkill HA) - the customer didn't have to worry about what jobs the SSD was
used on. It would be an economic choice which could be used for a wide range of
functions which required high performance.
So it was a new market for
SSDs - going into places which had traditionally used tier 1 HDD storage.
But
now the SSDs were just as simple to deploy, faster, cheaper and used less power
and less physical space. | | |
pricing strategy |
In the past TMS was viewed as a
supplier of the fastest SSDs - but also the most expensive.
I
said the pricing of the RamSan-720 looked aggressive
compared to other
modern SSDs (not just dinosaur HDD systems).
Holly Frost said that
because they're using FPGAs to replace processors, memory etc found in some
other systems based around servers (I guess a prime example would be the server
based Kaminario) that
means most of the recurring system cost is the cost of the memory chips. TMS's
variable stripe and big architecture gives them advantages here too - because
they don't need to start with so much raw capacity to deliver usable protected
capacity. (Despite the fact they also include hot spares too - which I forgot to
mention above.)
Anyway - the end result is - a product that's smaller,
uses less power and costs less to make. So they plan to use their cost
advantages to be more competitive. | | |
what's happening with RAM
SSDs? |
TMS is still selling them - and
when they look at the numbers - they are still holding up.
A year ago
TMS told me that flash
SSDs were then 70% of their business - and nowadays flash is where the
growth is.
Holly Frost said that's why most of their product
announcements in recent years had been flash based SSDs.
But he agreed
that the RAM SSD market
would never go completely away.
There were still some apps where having
the lowest latency mattered - and customers could still get faster read latency
through a rackmount RAM SSD than from a bus based flash SSD.
Spook
stuff? - I said.
Holly Frost said - "I could tell you but then
I'd have to kill you." | | |
how much is Texas Memory
Systems worth? |
This has been a
year floodlit by SSD company IPOs and acquisitions - and there's been a lot
of discussion in some quarters about how much any particular SSD company might
be worth.
I revealed to TMS that I had several years ago been asked how
much I thought TMS would cost to acquire - and at the time I told the reader who
asked me I doubted if TMS would be interested.
"The difficult
thing today" - I said "would be knowing what value to place on an
SSD company that isn't owned by VCs and which might actually be making some
money?"
That got a laugh but wasn't what they wanted to talk to
me about. So I didn't pursue it. But I can picture some of my institutional
readers right now - setting up spreadsheets which add a fraction of
Violin to another
fraction of Fusion-io.
I'm guessing that's how these sums are done.
...Later:-
within a year of this interview TMS did in fact join the list of
acquired SSD
companies - having been bought by
IBM | | |
the SSD market is an
exciting place to be working right now |
Holly Frost said the
thing he likes about the SSD market is that it's not a boring subject.
He admires the ingenuity of competitors - and the different solutions that
SSD companies have created to solve similar applications problems.
There
are so many interesting new developments being done in the industry with new
architectures and technologies that he can't imagine working anywhere else.
I think that's a sentiment with which I for one - and many of you
too - can easily agree. | | |
Here's a sprinkling of the topics
discussed when TMS spoke to the mouse site. | |
the design philosophy
behind fastest SSDs |
Holly Frost solves many
complex controller problems with custom designed chips (ASICs or FPGAs) instead
of leveraging standard processors and
software.
He
said the results are generally faster operation, lower power consumption and
less space taken up to do the job.
He said you can do a lot with FPGAs
today - and they have the advantage (unlike ASICs) of fast turn around time
from redesign to ship. So sometimes TMS launches a product with one design - and
if they think of a better way to do things a bit later - they can simply
reprogram the FPGAs. | | |
low power consumption is a
reliability feature |
Holly Frost said flash
endurance declines at high operating temperatures - so the design choices they
made in the RamSan-720 - using their design expertise to put functions in
chips (FPGAs) instead of the "easy way" of using industry standard
processors and Linux software - not only gives them space and performance
advantages - but means they use less power too.
The RamSan-720 uses
approximately a quarter of the watts of the 6216 from
Violin - for the same
SLC capacity.
That means customer cabinets populated by RamSan-720
systems run cooler and potentially have better MTBF. | | |
2D RAID in the RamSan-720 |
There are 2 RAID systems in
this SSD - which connect the dual ported flash modules to the backplane.
- RAID-5 across the flash chips within each module (using the same variable
stripe technology which was introduced in previous 7th generation RamSans) and
- RAID across the flash modules too
It's the RAIDing across the
flash modules - which necessitated the design of a new FPGA chip, because any
other way of adding RAID controller functions at this point would have added too
much latency.
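To make the two-dimensional idea concrete, here is a deliberately simplified sketch using plain XOR parity: one parity block across the chips inside each module, and one parity module across the modules. This is not TMS's implementation - their variable stripe scheme and FPGA data path are not described here, and the exact RAID level used across modules is not stated - so treat the function names and the layout as illustrative assumptions.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR a list of equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def add_chip_parity(chips):
    """Dimension 1: append a parity block computed across one module's chips."""
    return chips + [xor_blocks(chips)]


def build_parity_module(modules):
    """Dimension 2: build a parity module across modules, position by position."""
    return [xor_blocks(position) for position in zip(*modules)]


def rebuild(survivors):
    """Recover the one missing block of a parity group from its survivors."""
    return xor_blocks(survivors)


if __name__ == "__main__":
    # Three hypothetical modules, each holding two data "chips".
    data = [[b"AAAA", b"BBBB"], [b"CCCC", b"DDDD"], [b"EEEE", b"FFFF"]]
    modules = [add_chip_parity(chips) for chips in data]
    parity_module = build_parity_module(modules)

    # A single chip fails: rebuild it inside its own module.
    lost_chip = rebuild([modules[0][0], modules[0][2]])
    print("rebuilt chip:", lost_chip)

    # A whole module is pulled: rebuild it from the other modules plus parity.
    lost_module = [
        rebuild([modules[0][i], modules[2][i], parity_module[i]])
        for i in range(len(parity_module))
    ]
    print("rebuilt module matches:", lost_module == modules[1])
```

The point of the second dimension is visible in the last step: an entire module can be removed and its contents reconstructed from the others, which is what makes hot swapping a module possible without losing data.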
TMS has got a pretty picture which shows the elegant
simplicity of this - which you can see below. |
 | | |
no single point of failure
- except... |
I suggested that the SPOF
concept in this (and competing systems) was in some ways misleading.
I
said - "I bet if you immersed the whole rack in a tub of water - it would
fail. And that's a single point of failure."
Holly Frost agreed
that SPOF doesn't cover disasters like that.
But that reminded him of
something that happened with a customer in the early 1990s.
They
returned one of TMS's rackmount SSDs and asked for it to be replaced under
warranty.
It was unusual so he was curious to see what they would find
when they opened the rack up.
He sniffed around - expecting to get a
whiff of burnt components (fast chips ran very hot in those days).
There
was a funny smell. It was strong and damp - but not a living smell.
It
smelled like seawater.
As it happens - the customer was the Navy.
Holly
Frost said - when salt water gets into a system like this - it's not going to be
reliable even if you change the faulty chips.
So TMS replaced the unit
(under warranty).
Returning to my suggestion about the tub of water -
Holly said - take care to unplug the cables first. | | |
is there a cross-over
between customers of TMS rackmount and PCIe SSDs? |
That's not the way customers
are thinking about it right now.
Customers think - I have a performance
problem here - how can I best fix that?
Some customers prefer SSD
cards, while others will choose SSD racks - for what look - from the outside - to
be very similar situations.
But TMS has observed that the same customer
may buy both types of SSD. That seemed to support my own view that selling
more of one doesn't
decrease the eventual market size of the other. | | |
controller in the
RamSan-720 |
The host interface and router
to the flash array uses the same 2 stage (large architecture)
SSD controller architecture as other recent 7th generation RamSan products -
like the
RamSan-710.
But
in addition there's a multi-port low latency
RAID controller which sits
between these - as shown in the schematic elsewhere in this article. | | |
batteries and sudden SSD
power loss |
Erik Eyberg said he'd
read my article on RAM
cache flash architectures and told me that their new SSD has gigabytes of
RAM cache - which puts it in the "regular" category. The RAM cache
speeds up random writes, helps with flash endurance etc.
That led on to
the subject of coping with
sudden SSD
power loss. The new SSD has an ultra reliable battery array which holds up
the SSD power for 30 seconds while automatically
backing up all data in
flight and translation tables to nonvolatile flash storage. On power up - the
SSD is ready for full speed operation in less than a minute.
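The 30 second hold-up figure gives a feel for the size of the job the batteries have to do. The sketch below estimates the energy needed and the flush bandwidth required to empty the RAM cache in that window. Two assumptions are mine: that the unit draws its normal 300-400 W during hold-up (it may well draw less), and the cache sizes, which are hypothetical because TMS only said "gigabytes".

```python
# Rough sizing of the hold-up window described above.
HOLDUP_SECONDS = 30                 # quoted hold-up time
WATTS_LOW, WATTS_HIGH = 300, 400    # assumption: normal operating power during hold-up
HYPOTHETICAL_CACHE_GB = (2, 4, 8)   # hypothetical sizes; the real cache size is not stated


def holdup_energy_wh(watts: float, seconds: float) -> float:
    """Energy the battery array must deliver, in watt-hours."""
    return watts * seconds / 3600


def required_flush_mb_s(cache_gb: float, seconds: float) -> float:
    """Sustained write rate (decimal MB/s) needed to flush the cache in time."""
    return cache_gb * 1000 / seconds


if __name__ == "__main__":
    low = holdup_energy_wh(WATTS_LOW, HOLDUP_SECONDS)
    high = holdup_energy_wh(WATTS_HIGH, HOLDUP_SECONDS)
    print(f"Hold-up energy: {low:.1f} to {high:.1f} Wh")
    for cache_gb in HYPOTHETICAL_CACHE_GB:
        rate = required_flush_mb_s(cache_gb, HOLDUP_SECONDS)
        print(f"{cache_gb} GB cache needs about {rate:.0f} MB/s to flush")
```

On these numbers the energy involved is small (a few watt-hours) and the flush rate is a small fraction of the array's normal throughput - which suggests the hard part is battery reliability rather than capacity.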
I said
that because of the long experience TMS had - shipping RAM based SSDs - they
probably knew some good battery suppliers.
At that point Holly Frost
said - yes - he had learned more about batteries and how they can fail than he
ever thought he wanted to know.
The batteries they use now have
technology which was designed to meet the reliability needs of NASA. | | |