SCM DIMM wars - article by StorageSearch.com


DIMM wars in SSD servers

how significant is Diablo's Memory1 for the enterprise data ecosystem?

by Zsolt Kerekes, editor - StorageSearch.com - August 13, 2015

Diablo has launched an aggressive assault on the enterprise server RAM market with a new DRAM compatible emulation memory module (up to 256GB each) called "Memory1" which can functionally and competitively replace up to 90% of DDR4 DRAM - with higher density, lower cost flash.

Here's some of what Diablo said in its press release:-

"The revolutionary technology packs 4x the capacity of the largest DRAM modules, delivering greater capability on fewer servers and lowering datacenter costs by up to 70%.

"The same system memory slots that now hold 128 or 384 gigabytes of DRAM memory can house up to 4TB of Memory1 and process data-intensive applications that were previously beyond reach. These efficiencies are driven by Diablo's breakthrough in memory design, which replaces expensive DRAM with low cost, high capacity flash, while delivering the performance an enterprise demands.

"Memory1 is deployed into standard DDR4 DIMM slots and is compatible with standard motherboards, servers, operating systems and applications. Memory1 is shipping now to select customers and will be broadly available this fall."

"Memory1 brings the low cost and high capacity of flash to large-scale enterprise and datacenter customers. It is ideal for environments that require large memory footprints per server for workloads, such as big data analytics and complex Web applications. The average Memory1 use case enables a 4 to 1 server reduction, and one customer use case requires 90% fewer servers. Diablo will initially focus on delivering Memory1 to cloud and hyperscale datacenters, which stand to see significant economic benefits because of their scale."

Editor's comments:- I spoke to Diablo about Memory1 in July - but I wasn't able to write about it sooner because the launch date fell while I was on vacation.

But as I said to Diablo - when you've got a technology story like this - likely to have a big impact on SSD history - a delay of a few days or weeks in the retelling of the story doesn't really matter.

My gut feel assessment is this.

Diablo's Memory1 is the most significant development in the enterprise flash market in the past 3 years (since the widespread impact of adaptive R/W and DSP controller technology). It is also the most significant development in the enterprise application server market in the past 8 years (since the migration of most enterprise SSD installed capacity away from RAM to flash and - in particular - the launch of the first flash based PCIe SSDs by Fusion-io in September 2007).

But let me clarify something important.

Memory1 is not an SSD.

Memory1 is a volatile memory which replaces DRAM with a flash array.

You'll still need some real DRAM in your system. Diablo currently recommends a ratio of no more than 10 to 1 (Memory1 to RAM). And there are OS limitations too. Currently support is limited to Linux although there are plans to support other major platforms too.
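To put rough numbers on that ratio, here is a minimal sketch in Python. It assumes the 4TB system figure and the 256GB module size quoted above, treats 1TB as 1,024GB, and infers a 64GB largest DRAM module from the "4x the capacity" claim in the press release. The function names are hypothetical and are not anything Diablo has published.

```python
import math


def min_dram_gb(memory1_gb: float, max_ratio: float = 10.0) -> float:
    """Smallest DRAM capacity (GB) that keeps Memory1:DRAM at or below max_ratio."""
    if memory1_gb < 0 or max_ratio <= 0:
        raise ValueError("capacity must be >= 0 and ratio must be > 0")
    return memory1_gb / max_ratio


def module_count(capacity_gb: float, module_gb: float) -> int:
    """Number of whole modules needed to reach at least capacity_gb."""
    if module_gb <= 0:
        raise ValueError("module size must be > 0")
    return math.ceil(capacity_gb / module_gb)


if __name__ == "__main__":
    memory1_gb = 4 * 1024   # 4TB of Memory1 (press release figure, 1TB = 1,024GB assumed)
    memory1_module_gb = 256  # largest Memory1 module (from the article)
    dram_module_gb = 64      # assumption: inferred from "4x the largest DRAM modules"

    dram_gb = min_dram_gb(memory1_gb)  # 10 to 1 ratio recommended by Diablo
    print("Memory1 modules:", module_count(memory1_gb, memory1_module_gb))
    print("Minimum DRAM (GB):", dram_gb)
    print("DRAM modules:", module_count(dram_gb, dram_module_gb))
```

On these assumptions a 4TB Memory1 configuration would still want roughly 410GB of real DRAM alongside it, so the DRAM does not disappear from the bill of materials - it shrinks.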

the big question - how does flash replace RAM?

Diablo didn't want to say much about how they use the flash.

Diablo told me - "We are not disclosing the flash technology, only that we are using widely available NAND flash. We employ several technologies to ensure endurance, an example is that we deliver volatile storage, which is less taxing on the media than persistent storage."

Diablo declined to reveal how much overprovisioning they use inside the flash array. But they did answer my question about warranty - 3 years.

A year later I learned the answers and published them in SSD aspects of Diablo's Memory1 and DMX - which explains how their DRAM controller and flash controller can reliably replace DRAM with flash that has a conventional DWPD rating.

This brings us back to the question.

what is Enterprise DRAM?

At a top level - DRAM - seen from the applications server point of view - has complex and changing latency characteristics.

This is because traditionally it is the cumulative effect (as experienced by the software) of a hierarchy of latencies which includes:-

memory caches on the CPU chip,

caches external to the CPU chips,

real DRAM (the kind of modules you see plugged into the motherboard) and

emulated DRAM (the virtual memory which gets swapped in and out of HDDs and SSDs).
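The hierarchy above can be sketched as a simple weighted average. This is an illustrative model only: the tier latencies and the fraction of accesses served by each tier are made-up round numbers, not measurements of any real system, and the names are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class Tier:
    name: str
    latency_ns: float     # assumed latency when an access is served by this tier
    hit_fraction: float   # assumed fraction of ALL accesses served by this tier


def average_latency_ns(tiers: list[Tier]) -> float:
    """Average latency seen by software across the whole hierarchy."""
    total = sum(t.hit_fraction for t in tiers)
    if not math.isclose(total, 1.0, abs_tol=1e-9):
        raise ValueError("hit fractions must sum to 1.0")
    return sum(t.latency_ns * t.hit_fraction for t in tiers)


# Hypothetical figures chosen only to show the shape of the calculation.
EXAMPLE_TIERS = [
    Tier("on-chip CPU cache", 2, 0.90),
    Tier("external cache", 20, 0.06),
    Tier("real DRAM", 100, 0.039),
    Tier("virtual memory swapped to SSD", 100_000, 0.001),
]

if __name__ == "__main__":
    for tier in EXAMPLE_TIERS:
        share = tier.latency_ns * tier.hit_fraction
        print(f"{tier.name}: contributes {share:.1f} ns")
    print(f"average: {average_latency_ns(EXAMPLE_TIERS):.1f} ns")
```

With these assumed figures the rarely used swap tier contributes most of the average, which is why the speed of the storage behind virtual memory matters so much to what applications experience as DRAM latency.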

You've known all that stuff since forever. But changes have been coming.

In the past year or so - in articles which I have collected in "are you ready to rethink enterprise DRAM architecture?" - it's been clear that researchers have been evaluating the memory and server experience and have realized that - depending on the application software - it's possible to get similar performance effects to traditional DRAM and CPU sets in other ways.

The challenges for the DRAM market come from 2 factors:-

DRAM latency at the hardware level (as seen by local CPUs) is worse than it was in the year 2000.

There are many reasons for this - the chief being that enterprise memory systems have been optimized for density and cost rather than speed. And the growing complexity of multi processor cache designs means that worst case latencies are much longer than you might think. See more here - latency loving reasons for fading out DRAM in the virtual memory slider mix.

DRAM latency - as seen by applications - is really the "virtual memory" latency which arises from the cumulative impact of several different system behaviors. These include upstream storage, cache properties and the software stacks. In effect all legacy applications have been "trained" during the past decade or so to operate with a DRAM latency whose statistical properties are good enough rather than guaranteed.

During the first 5 years or so of the PCIe SSD market it became clear that making storage latency faster (than the HDD models assumed in rotating storage architecture) could enable trade offs in memory size in which DRAM could be replaced by flash within certain operating bands and achieve similar application performance at much lower operating costs.

Memory1 operates in the DRAM interface so it can theoretically provide better performance than PCIe SSD based flash memory emulators.

What we've got here is a new memory type which will take business away from traditional DRAM and CPUs (as a special case extrapolation of the CPU-SSD equivalency reasons which I told you about in 2003). Memory1 will also take business away from the 3D PCM memory which Intel and Micron have been talking about - for the simple reason that flash is a cheaper and more mature technology.

This new technology from Diablo is more significant than what we had been led to expect from their previews a year ago.

It doesn't replace memory channel SSDs or hybrid DIMMs - it's a new product type - flash as RAM - which could lead to a market as significant in revenue as the whole PCIe SSD market.

I'm guessing then that Diablo will stay the #1 most researched company in the SSD ecosystem for a while as a result.

Competitors to Diablo's Memory1?

As I said above - many other SSD companies have been eyeing exactly the same applications problems which Diablo's Memory1 is designed to satisfy:- lowering the cost of server populations which rely on access to extreme amounts of low latency RAM in order to fulfill their business purposes.

(...Later:- You can see a growing double digit list of competitors in the sidebar article on the right.)

While there are currently no exact form factor matches to Memory1 which give similar memory density - there are other ways to get similar results. And if you regard the server box as the socket footprint (which it really is - if you're planning to buy 10K or 100K servers) then DIMM socket compatibility - while convenient - is not essential.

My preliminary list of potential hot competitors in the Diablo Memory1 market space goes like this.

Flash based solutions.

In the current state of the memory market - NAND flash is the cost leader. Alternative approaches include:-

software which remaps big RAM memory into flash SSDs.

For example:- SanDisk's ZetaScale. This operates with a wide range of SSD interfaces - so users have a lot of freedom to optimize their latency / cost profile. At the extreme limits Memory1 may be faster in latency terms than most PCIe SSDs - but my guess is that some kind of future ioDrive like product (possibly optimized to strip away non volatility) would give very similar performance.

Other SSD software companies which until recently have been in stealth mode are also coming into the latency lowering flash market.

in-situ processing - combined with PCIe SSD platform.

Companies which have been talking about this since last year include:- NxGn Data, Memblaze and others. And earlier this year further validation of the concept came from researchers at MIT who showed that flash can come close to RAM rich server performance by using clunky app-specific FPGAs combined with flash.

Netlist? - This company which was embroiled for over a year in legal actions aimed at Diablo's first product - the MCS SSDs - is one of several in the hybrid DIMM market which might make a future pitch in this market. Having said that there's a world of difference between these DIMM compatible RAM SSDs and Memory1 - where the RAM which is used is actually in another socket supplied by someone else. Other companies in this category include Micron. But due to the market risks of shooting its other products in the foot I think Micron will prefer to position its response to this application in different ways first (with higher priced solutions).

non flash / nvm solutions

These will probably cost more than the flash based competitors.

Already mentioned above - high density 3D PCM from Intel and Micron. Details at the present time are sketchy. But it's reasonable to assume this will initially be offered at much higher cost than flash solutions - because it's a new technology and unlike Memory1 - a key characteristic it offers is non volatility.

fabric / flash / nvm alternatives

Another contender which converges with all the technologies above is PMC, whose NVM Express over RDMA includes elements of PCIe fabric but interoperates with the company's low density RAM SSD and flash SSD products.

Together these can be combined to build low cost low latency fabrics for small server clusters at a starting price point probably below PLX and much lower (scale) than A3CUBE.

Currently these fabric solutions enable users to extend the DRAM space at low microsecond latencies outside the rack rather than enabling more RAM in the same rack. But I think they could also be viable technology launchpads for future PCIe connected flash memory similar to Diablo's Memory1.


no one owns the SSD DIMM wars market

Although Diablo was the first to ship high capacity products in the DIMM SSD / SCM (storage class memory) accelerator market - legal wranglings delayed adoption by over a year - which enabled competitors to grab attention for "soon to be" similar alternatives. By the close of 2015 - 9 companies had announced significant product plans in this market.

By Q3 2016 - over 20 companies were known to be working on DIMM wars related products and software.


the road to DIMM wars

DIMM wars in SSD servers is closely related to several multi-year technology themes in the SSD accelerator market.

From an SSD history perspective it can be viewed as being the successor to 2 earlier centers of focus in the modern era of SSDs which can be summarized in these broad periods:-

1999 to 2007 - the dominant focus and center of gravity was FC SAN SSD accelerators.

That was the pragmatic place for SSD accelerator leaders to focus - because it was relatively easy to integrate in legacy installations which had been conceived by users with pre-SSD era thinking.

Furthermore FC SAN users - as a market - were representative of a user base which was focused on performance and which had more money and technical resources to engineer systems which were performance oriented than other segments in the storage market.

That period was especially complex for users because it overlapped with the transition within the enterprise SSD market from dominance by RAM SSDs to flash based SSDs - which introduced many technology signals to the market which were easily capable of being misinterpreted by vendors and users alike - due to the fact that in the previous 2 decades there had been relatively few disruptive technology changes in the enterprise server market.

And further complicating factors were that FC SAN SSD accelerators were being marketed by small vendors directly to SSD illiterate end users and bypassing the existing hostile distribution channels of the incumbent SSD ignorant server and storage heavyweights.

And all this was being done before the existence of an SSD software market - which would later automate and simplify SSD integration and therefore accelerate sales.

2007 to 2014 - the dominant focus and center of gravity was PCIe SSD accelerators.

It was the rapid and domino effect-like adoption of 3rd party PCIe SSD accelerators by every mainstream server manufacturer - which had been predicted years earlier as an inevitable consequence of CPU-SSD equivalency - which changed the way that SSDs were perceived from being an alienlike technology promoted by industry outsiders to being a necessary option which was expected to be supported inside every new server product line.

What we've been seeing in the enterprise PCIe SSD market in recent years is that most vendors are focusing on making these products more affordable rather than trying to push the limits of performance.

In the performance enhancement region of the market - the emerging changes have been towards getting better application performance by better integration with software rather than speedier flash or interface speeds.

These efforts include:- application optimized stacks, incremental standards such as NVMe, 3rd generation controller architecture (multi-level partitioned flash movements) and in-situ processing.

2014 to 2018 - the new dominant focus and center of gravity in large memory systems will be DIMM wars - which revisit long held assumptions such as - what is memory? - where's the best place for it to go? - and how much memory of each latency is best to have in each location? (Given that multiple distinct memory and storage latencies have found viable and enduring economic and architectural product classification silos already.)

These questions are being asked due to growing confidence from the SSD market experience that application architectures and the design of data processing engines are not (as once seemed) set in stone. They can be changed - if it makes sense from a business perspective. And although the investment costs are huge - the enterprise market has already learned to adapt to many changes caused by flash and to many adaptations within the evolving flash architecture experience itself. So that opens the door for other memory types too.

The territory for future DIMM wars includes already established emerging technologies (memory channel SSDs, NV DIMMs / hybrid DIMMs) but also new products such as ultra high capacity DIMMs, nonflash NVDIMMs, memory fabric expansion (by PCIe and IB) and probably big memory fabricated onboard CPU chips (as an alternative to simply adding more DRAM cache or more cores).

A big incentive for server makers to engage in DIMM SSD wars is that integration with motherboard resources and firmware promises better lock-in at the customer level because traditionally end users have been less willing to exercise their freedoms to choose 3rd party competitive suppliers for DIMM components.

It may be that the DIMM SSD could become a stickier component for server makers than PCIe SSDs. It's certainly the case that many of the early generations of DIMM enhanced SSD products have required special cables or power or BIOS customizations by server makers to support their operation.
