Texas Memory Systems - talks about
the design of fast high availability SSDs

Megabyte spent another pleasant day
shooting the breeze about SSDs.

Here's a sprinkling of the topics discussed when TMS spoke to the mouse site.
the design philosophy behind fastest SSDs
Holly Frost solves many complex controller problems with custom designed chips (ASICs or FPGAs) instead of leveraging standard processors and software.

He said the results are generally faster operation, lower power consumption and less space taken up to do the job.

He said you can do a lot with FPGAs today - and they have the advantage (unlike ASICs) of fast turn around time from redesign to ship. So sometimes TMS launches a product with one design - and if they think of a better way to do things a bit later - they can simply reprogram the FPGAs.
low power consumption is a reliability feature
Holly Frost said flash endurance declines at high operating temperatures - so the design choices they made in the RamSan-720 - using their design expertise to put functions in chips (FPGAs) instead of the "easy way" of using industry standard processors and Linux software - not only gives them space and performance advantages - but means they use less power too.

The RamSan-720 uses approximately one quarter of the power of the 6216 from Violin - for the same SLC capacity.

That means customer cabinets populated by RamSan-720 systems run cooler and potentially have better MTBF.
2D RAID in the RamSan-720
There are 2 RAID systems in this SSD - which connect the dual ported flash modules to the backplane.
  • RAID-5 across the flash chips within each module (using the same variable stripe technology which was introduced in previous 7th generation RamSans) and
  • RAID across the flash modules too
It's the RAIDing across the flash modules which necessitated the design of a new FPGA chip - because any other way of adding RAID controller functions at this point would have added too much latency.

TMS has got a pretty picture which shows the elegant simplicity of this - which you can see below.
2D RAID scheme in the RamSan-720
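To make the 2 dimensions concrete, here is a minimal sketch of the general idea - parity inside each module plus parity across modules. It is an illustration only, not TMS's FPGA design: the grid layout, the block sizes and every function name are hypothetical, plain XOR parity stands in for both levels, and variable striping and parity rotation are left out.

```python
from functools import reduce

def xor_blocks(blocks):
    """Byte-wise XOR of equal-length byte strings."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

def build_2d_parity(grid):
    """grid[m][c] is the data block on chip c of module m (hypothetical layout).

    Returns (row_parity, col_parity):
      row_parity[m] protects the chips inside module m,
      col_parity[c] protects chip position c across all modules.
    """
    row_parity = [xor_blocks(row) for row in grid]
    col_parity = [xor_blocks(col) for col in zip(*grid)]
    return row_parity, col_parity

def rebuild_from_row(grid, row_parity, m, c):
    """Recover block (m, c) from the other chips in the same module."""
    survivors = [blk for i, blk in enumerate(grid[m]) if i != c]
    return xor_blocks(survivors + [row_parity[m]])

def rebuild_from_column(grid, col_parity, m, c):
    """Recover block (m, c) from the same chip position in the other modules."""
    survivors = [grid[i][c] for i in range(len(grid)) if i != m]
    return xor_blocks(survivors + [col_parity[c]])

# Toy layout: 3 modules x 4 chips, 4-byte blocks.
grid = [[bytes([m * 16 + c] * 4) for c in range(4)] for m in range(3)]
row_parity, col_parity = build_2d_parity(grid)

# A single failed chip can be rebuilt inside its own module...
assert rebuild_from_row(grid, row_parity, 1, 2) == grid[1][2]
# ...and a whole missing module can be rebuilt chip by chip from the others.
assert all(rebuild_from_column(grid, col_parity, 1, c) == grid[1][c]
           for c in range(4))
```

The second assertion is the case that matters for hot swappable modules - when a whole module is pulled its own internal parity goes with it, so only the cross-module dimension can supply the data.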
no single point of failure - except...
I suggested that the SPOF concept in this (and competing systems) was in some ways misleading. I said - "I bet if you immersed the whole rack in a tub of water - it would fail. And that's a single point of failure."

Holly Frost agreed that SPOF doesn't cover disasters like that.

But that reminded him of something that happened with a customer in the early 1990s. They returned one of TMS's rackmount SSDs and asked for it to be replaced under warranty.

It was unusual so he was curious to see what they would find when they opened the rack up.

He sniffed around - expecting to get a whiff of burnt components (fast chips ran very hot in those days).

There was a funny smell. It was strong and damp - but not a living smell.

It smelled like seawater.

As it happens - the customer was the Navy.

Holly Frost said - when salt water gets into a system like this - it's not going to be reliable even if you change the faulty chips.

So TMS replaced the unit (under warranty).

Returning to my suggestion about the tub of water - Holly said - take care to unplug the cables first.
is there a cross-over between customers of TMS rackmount and PCIe SSDs?
That's not the way customers are thinking about it right now.

Customers think - I have a performance problem here - how can I best fix that?

Some customers prefer SSD cards, while others will choose SSD racks - for what look, from the outside, to be very similar situations.

But TMS has observed that the same customer may buy both types of SSD. That seemed to support my own view that selling more of one doesn't decrease the eventual market size of the other.
controller in the RamSan-720
The host interface and router to the flash array use the same 2 stage (large architecture) SSD controller design as other recent 7th generation RamSan products - like the RamSan-710.

But in addition there's a multi-port low latency RAID controller which sits between these - as shown in the schematic elsewhere in this article.
batteries and sudden SSD power loss
Erik Eyberg said he'd read my article on RAM cache flash architectures and told me that their new SSD has gigabytes of RAM cache - which puts it in the "regular" category. The RAM cache speeds up random writes, helps with flash endurance etc.

That led onto the subject of coping with sudden SSD power loss. The new SSD has an ultra reliable battery array which holds up the SSD power for 30 seconds while automatically backing up all data in flight and translation tables to nonvolatile flash storage. On power up - the SSD is ready for full speed operation in less than a minute.
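The 30 second hold-up figure implies a budget - everything dirty in RAM has to reach flash before the batteries give out. The sketch below is back-of-envelope arithmetic only. TMS told me the cache is "gigabytes" but gave no exact size, so the cache sizes, the 50% safety margin and the function name are all my assumptions.

```python
def min_flush_bandwidth_gb_per_s(dirty_gb, holdup_s, margin=0.5):
    """Sustained write bandwidth (GB/s) needed to flush `dirty_gb` of RAM
    to flash using only a fraction `margin` of the battery hold-up time."""
    if holdup_s <= 0 or not 0 < margin <= 1:
        raise ValueError("holdup_s must be > 0 and margin must be in (0, 1]")
    return dirty_gb / (holdup_s * margin)

HOLDUP_S = 30  # hold-up time quoted by TMS

# Hypothetical cache sizes - the article only says "gigabytes".
for dirty_gb in (2, 8, 16):
    need = min_flush_bandwidth_gb_per_s(dirty_gb, HOLDUP_S)
    print(f"{dirty_gb:>2} GB dirty -> at least {need:.2f} GB/s to flash")
```

Even the largest of these made-up cache sizes needs only about 1GB/s of sustained flash writes under this margin - so on these assumptions a 30 second window looks comfortable rather than tight.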

I said that because of the long experience TMS had - shipping RAM based SSDs - they probably knew some good battery suppliers.

At that point Holly Frost said - yes - he had learned more about batteries and how they can fail than he ever thought he wanted to know. The batteries they use now have technology which was designed to meet the reliability needs of NASA.
Editor:- December 8, 2011 - I've been talking to TMS every month for over 10 years - and I've been writing about their memory appliances since the early 1990s - so you might think that I would have run out of things to say by now.

But when they launched their first tier 1 storage SSD this month (see more details below) I spoke to Holly Frost, CEO and Dan Scheel, President of Texas Memory Systems, and Erik Eyberg, Sr. Analyst - about their new SSD, what they think about what's going on in the SSD market, and the philosophy that steers the design of their SSDs.

I started by saying that I've been writing about their memory systems since the early 1990s - I think the earliest model I knew about was the SAM 2000 - which was a multi-ported shared access memory system - which was used to interconnect different computers at memory speeds.

Holly Frost said - in some ways you could say they were doing the same sort of things now - enabling servers to share data at memory speeds. The difference now was they did things in chips - which used to take up whole boards - and the data speeds today are thousands of times faster.

What's new about TMS's new SSD?

You can see the details elsewhere on this page - but Holly Frost told me that his policy is that a new SSD has to include at least 2 new things to make it worth launching.

In this case - the 2 things are:-
  • the reliability architecture - with no single point of failure, and
  • hot swappable flash modules. (This is a new feature for TMS - but not for the industry.)
will new RamSan rattle Violin?
Editor:- December 6, 2011 - Texas Memory Systems today announced imminent availability of the RamSan-720 - a 4 port (FC/IB) 1U rackmount SSD which provides 10TB of usable 2D (FPGA implemented) RAID protected and hot swappable - SLC capacity with 100/25 microseconds R/W latency (with all protections in place) delivering 400K IOPS (4KB), 5GB/s throughput - with no single point of failure (at $20K/TB approx list).
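The headline numbers can be cross-checked with a little arithmetic. This is a rough sketch and not vendor data - it assumes "4KB" means 4,096 bytes, that the IOPS and latency figures can be combined using Little's law (outstanding I/Os = IOPS x latency), and that list price scales linearly with capacity. The constant and function names are mine.

```python
IOPS = 400_000
BLOCK_BYTES = 4096          # assumption: "4KB" means 4096 bytes
READ_LATENCY_US = 100
WRITE_LATENCY_US = 25
USABLE_TB = 10
LIST_USD_PER_TB = 20_000    # "approx list" in the announcement

def payload_gb_per_s(iops, block_bytes):
    """Data moved per second at the quoted small-block IOPS, in GB/s."""
    return iops * block_bytes / 1e9

def outstanding_ios(iops, latency_us):
    """Little's law: average number of I/Os in flight."""
    return iops * latency_us / 1e6

def list_price_usd(usable_tb, usd_per_tb):
    """Approximate list price, assuming linear pricing."""
    return usable_tb * usd_per_tb

print(f"4KB payload:   {payload_gb_per_s(IOPS, BLOCK_BYTES):.2f} GB/s")
print(f"reads queued:  {outstanding_ios(IOPS, READ_LATENCY_US):.0f}")
print(f"writes queued: {outstanding_ios(IOPS, WRITE_LATENCY_US):.0f}")
print(f"list price:    ${list_price_usd(USABLE_TB, LIST_USD_PER_TB):,}")
```

On these assumptions the 4KB workload moves well under the quoted 5GB/s - and the system needs only a few tens of requests in flight to reach its rated IOPS.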

The new SSD uses a regular RAM cache flash architecture which in the event of sudden power loss has an ultra reliable battery array which holds up the SSD power for 30 seconds while automatically backing up all data in flight and translation tables to nonvolatile flash storage. On power up - the SSD is ready for full speed operation in less than a minute.

Aimed at HA tier 1 storage markets - the RamSan-720 consumes only 300-400 W - which makes it practical for high end users to install nearly 1/2 petabyte of SSD storage in a single cabinet - without having to worry about the secondary reliability and data integrity risks which can arise from high temperature build-ups in such enclosures.
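The "nearly 1/2 petabyte" claim is easy to sanity check. The sketch below assumes a standard 42U cabinet filled entirely with 1U RamSan-720 units - the announcement doesn't say what cabinet size TMS had in mind, and a real installation would lose some rack units to switches and power distribution. The function name is hypothetical.

```python
def cabinet_totals(units, tb_per_unit=10, watts_range=(300, 400)):
    """Return (usable TB, low kW, high kW) for `units` 1U systems,
    using the per-unit capacity and power range from the announcement."""
    capacity_tb = units * tb_per_unit
    low_kw, high_kw = (units * w / 1000 for w in watts_range)
    return capacity_tb, low_kw, high_kw

# Assumption: a fully populated 42U cabinet.
capacity_tb, low_kw, high_kw = cabinet_totals(42)
print(f"{capacity_tb} TB usable, {low_kw:.1f}-{high_kw:.1f} kW")
```

Filling all 42 slots gives 420TB usable - which squares with "nearly 1/2 petabyte" - for somewhere between 12.6 and 16.8kW of power.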

Editor's comments:- I've been talking to TMS every month for over 10 years - and I've been writing about their memory appliances since the early 1990s - so you might think that I would have run out of things to say by now. When I saw the preliminary specs for the new RS-720 - the features which jumped out at me were:-
  • the low R/W latency for this class of no-SPOF product. Which is 2x as good as the next fastest product I know - the 6000 series from Violin - and several times faster than some other tier 1 SSD vendors such as Kaminario and Huawei Symantec
  • the high storage density - over 3x better than Violin delivers in SLC - and close to the usable RAIDed capacity that a Fusion-io 1U server can deliver in MLC when using Octal.
A few days ago I spoke to Holly Frost, CEO and Dan Scheel, President of Texas Memory Systems about their new SSD, what they think about what's going on in the SSD market, and the philosophy that steers the design of their SSDs. In an hour long discussion I learned enough new stuff to write several new articles. So instead of condensing it down here into a couple of bullet points - I'm going to give you the benefit of what I learned in a new article tomorrow called - "StorageSearch talks SSD with Holly Frost."

Going back to my headline - will new RamSan rattle Violin?
(image: the RamSan-720 from Texas Memory Systems - tier 1, 1U rackmount SSD - no single point of failure - lowest latency, highest density 1U FC SLC SSD)
I'm sure that Violin would say that this simply validates what they are doing (and shipping) already - and that the enterprise SSD market is big enough for all vendors in this category to keep growing at a healthy clip. It makes you wonder how much a company like TMS might be worth too...
why do people need a new SSD like this one?
Dan Scheel said that in the past - customers had used the company's SSDs to accelerate a single function in their company.

SSDs had been expensive - and fast SSDs were used in apps where the raw speed translated directly to a business benefit like greater revenue. Managing the fault tolerant architecture in these systems was done using a variety of (slow) methods - but it was more important to keep the apps running fast - when running - than to guarantee that fail-over would be just as fast. If there was a failure - and a reduction in speed while failover occurred - the impact was to a single isolated business function.

The high availability features in the RamSan-720 mean it can go into any kind of application that a customer might have - and because it is reliable (with overkill HA) - the customer doesn't have to worry about what jobs the SSD is used on. It would be an economic choice which could be used for a wide range of functions which required high performance.

So it was a new market for SSDs - going into places which had traditionally used tier 1 HDD storage.

But now the SSDs were just as simple to deploy, faster, cheaper and used less power and less physical space.
pricing strategy
In the past TMS was viewed as a supplier of the fastest SSDs - but also the most expensive as well.

I said the pricing of the RamSan-720 looked aggressive compared to other modern SSDs (not just dinosaur HDD systems).

Holly Frost said that because they're using FPGAs to replace the processors, memory etc found in some other systems based around servers (I guess a prime example would be the server based Kaminario) most of the recurring system cost is the cost of the memory chips. TMS's variable stripe and big architecture gives them advantages here too - because they don't need to start with so much raw capacity to deliver usable protected capacity. (Despite the fact they also include hot spares - which I forgot to mention above.)

Anyway - the end result is - a product that's smaller, uses less power and costs less to make. So they plan to use their cost advantages to be more competitive.
what's happening with RAM SSDs?
TMS is still selling them - and when they look at the numbers - they are still holding up.

A year ago TMS told me that flash SSDs were then 70% of their business - and nowadays flash is where the growth is.

Holly Frost said that's why most of their product announcements in recent years had been flash based SSDs.

But he agreed that the RAM SSD market would never go completely away.

There were still some apps where having the lowest latency mattered - and customers could still get faster read latency through a rackmount RAM SSD than from a bus based flash SSD.

Spook stuff? - I said.

Holly Frost said - "I could tell you but then I'd have to kill you."
how much is Texas Memory Systems worth?
This has been a year floodlit by SSD company IPOs and acquisitions - and there's been a lot of discussion in some quarters about how much any particular SSD company might be worth.

I revealed to TMS that I had several years ago been asked how much I thought TMS would cost to acquire - and at the time I told the reader who asked me I doubted if TMS would be interested.

"The difficult thing today" - I said "would be knowing what value to place on an SSD company that isn't owned by VCs and which might actually be making some money?"

That got a laugh but wasn't what they wanted to talk to me about. So I didn't pursue it. But I can picture some of my institutional readers right now - setting up spreadsheets which add a fraction of Violin to another fraction of Fusion-io. I'm guessing that's how these sums are done.

...Later:- within a year of this interview TMS did in fact join the list of acquired SSD companies - having been bought by IBM.
the SSD market is an exciting place to be working right now
Holly Frost said the thing he likes about the SSD market is that it's not a boring subject.

He admires the ingenuity of competitors - and the different solutions that SSD companies have created to solve similar applications problems.

There are so many interesting new developments being done in the industry with new architectures and technologies that he can't imagine working anywhere else.

I think that's a sentiment with which I for one - and many of you too - can easily agree.
