|
"RAID
was one of the unsung heroes of the digital economy in the past 20 years.
Without RAID - we wouldn't have the big data (and reliability) that we have
gotten used to today. But don't get too hung up on learning the internal
details of traditional RAID. With today's fast links and processors there are
many other new ways to stripe data across drives and memory chips too - with
lower latency and better efficiency." -
Zsolt Kerekes,
editor |
RAID is
an acronym for
Redundant Array of Inexpensive Disks. |
In the mid 1980's when
this term first entered public awareness, you could buy 2 types of
disk drives, either low
cost drives such as used in the average PC, or high speed high performance
mainframe drives as used by Quantel
in its digital video effects systems.
The huge market for PC disks soon became the leading edge technology
drivers for disk storage and overtook the larger minicomputer and mainframe
form factor disks in speed,
reliability and
capacity.
By the late 1990s RAID systems using PC form factor disks
had become the most common form of bulk storage in enterprise servers and even
some (Unix) mainframes.
Today the original RAID concept remains valid even though hard disks
have changed form factors many times in the past 20 years (8", 5.25",
3.5", 2.5", 1.8" and below 1") and the concept may be useful
in the near future when the "disks" in the array could actually be
flash solid state
disks and not traditional
hard disks.
You
can create a virtual disk array which looks electronically just like a bigger
ordinary disk, by attaching a bunch of disks working in parallel and connected
to a
RAID controller
interface.
The combined system can be programmed to provide desirable
characteristics such as faster data throughput
(for example a 4 disk wide system could have a data throughput capability 4
times faster than a single disk).
RAID can also provide
fault
tolerance, because redundant disks can be added into the array and the
data split up in such a way with redundant error bits that there is no loss of
data if any single disk fails (or if 2 disks fail in some RAID
configurations) - provided the dead disk(s) is replaced and the data rebuilt
before the next failure occurs.
RAID doesn't always result in an
application speedup. It can slow down the access time in some types of
application in which the data sets are small and randomly located - because the
latency of the RAID controller is additional to the disk's own access time. ...from
Megabyte's Storage
Dictionary | |
| . |
An SSD ASAP is something
which you install between 2 storage systems which have vastly different latency
and capacity.
Historically the slow end was legacy RAID built from arrays of
rotating rust platters.
But in more modern systems both ends can
be SSDs. |
Need SSD Acceleration
ASAP? Say "hello" to SSD ASAPs | | |
. |
| Nibbles Re:
RAID History |
IBM received the first patent
for a disk array subsystem in
1978, and
co-sponsored the research by the University of California at Berkeley that led
to the initial definition of RAID levels in 1987.
IBM launched the
first modern style RAID systems in
1990
It wasn't until the late 1990's that RAID technology became a "must-have"
building block in commercial Unix servers.
In the future RAID
technology will be already integrated in most home PC's and entertainment
systems, because home users don't do backups, but they will have large digital
entertainment libraries which won't fit neatly onto a single disk. | |
|
| . |
|
Nibble:- RAID and Me
I first came
across the concept of RAID in 1986 when I
joined a company
called Databasix as their hardware engineering manager.
Databasix,
based in Newbury in the UK, was founded by John Golding from the
nucleus of an earlier company he had been in called Microconsultants.
Microconsultants
had spawned rich sister company Quantel.
Peter Michael, who had been the business and technical brains in both these
earlier companies, gave Databasix a very generous start in life from what I
could see by the pool of talented people and money which had mostly been already
spent when I joined the 70 people or so in the new "start-up".
Among
the many things on my to-do list was to build a working RAID controller and RAID
array demonstration system.
"I don't know much about hard drives"
- I said. I could afford to be honest - because these guys had already seen the
worst when they first met me. My VC backed networked data acquisition company
was going bust and they had been a potential buyer.
"There's not
much too it." My boss said. "Just read the manuals that come with the
disk drives. We want to see if RAID will give us fast real-time disks at a cost
that's significantly less than the video disks from Japan used by our sister
company Quantel."
"OK" I said. "I don't know much
about RAID either."
"Nobody does. Here's a bunch of
articles. They tell you all you need to know. We'd like the demonstration ready
in 3 weeks."
"That sounds like a very short time to me."
"We've
already ordered the disks to save time. You order whatever chips you need, and
use some of the software guys to help on this."
From memory, I
think I got the demo deadline pushed out to about 4 to 5 weeks.
We
also had an Artificial Intelligence demo being worked on at the same time by
about 50 software engineers, and a parallel computing demo, but the RAID
functionality was the "must have" thing which could not be easily
dropped from the sales plan.
We did build a working 4 drive RAID
system. One of our biggest problems had been the high rate of Dead On Arrival
disk drives. That caused a lot of problems which we initially blamed on the
software. Hard drives were a lot more sensitive in those days and could be
killed just by putting them down on your desk.
But by then I knew a
lot more about disks and realised that the Inexpensive Disks we used in our
demonstrator weren't anywhere near as fast as they could have been, because they
were the wrong standard. Then, as now, there were many interface standards for
disk drives. If you're going to build a fast system then you might as well use
fast building blocks. Dataquest was telling me that we should probably be using
SCSI instead.
As my imposed wish lists started to pile up and
commercial reality started hitting my new employer, I decided that it would be a
heck of lot easier to partner with a disk controller company which was already
down this part of the curve, and later we became a beta site for dozens of
manufacturers of processor cards, array processors, hardware interfaces, memory
and disk drives as we tried to make a business out of selling the technology
curve to military buyers almost before it was really there. That was great fun,
but a different story.
It was about 10 years after that before RAID
systems next appeared in my life - when RAID companies like DEC (acquired by
Compaq and now part of HP) and Data General (acquired by EMC) started promoting
their RAID systems to readers of my Sun foused
SPARC Directory.
Nearly 12 years after my first acquaintance with RAID it became one
of the first 4 product categories here on STORAGEsearch.com. And although the
interface patterns have changed over the years, from DAS SCSI, then
Fibre-channel SAN, then Ethernet NAS, and then iSCSI, the ideas inside the box
have remained the same.
When you get to be an old guy like me, it's a
lot easier if some of the new stuff which hits your brain, is actually a rehash
of old stuff.
...more about Databasix.
I didn't want to interrupt
the narrative flow above - but the core Directors at Databasix when I joined
were:- John Golding, Andrew Bruce, Ray Potter and Dan Boxshall.
| | |
..... |
"New SSD enhanced hardware and software fabrics will have the same
effect on how you come to view a single server - as RAID did on the limitations
of a single hard drive." above quote from -
the Top SSD Companies
- 2014 Q3
See also:-
after AFAs -
what's the next box? (2017)
And for a philosphical detour (covering
wider ground than the narrow intersection of RAID with SSDs) - how about
looking at the
enterprise flash story... could it have been simplified? |
| ..... |
| RAID, RAIC,
RAISE etc - RAID's Many Aliases |
The long established word "RAID"
has been stretched by marketers into many other disguises to make their
products sound better.
The RAID system concept itself is simple, being
a box of disk drives with self healing properties which can be run in parallel
for faster data throughput. But that's not sexy enough for most storage product
marketers. So we now have the following refinements.
- RAID connected by SCSI, Firewire or IDE can be called a
DAS (Directly Attached Storage). The RAID has to be connected by
something, but "DAS" sounds more modern, and indicates that you chose
this method of connection in preference to all the others.
- RAID connected by Fibre-channel can be called a SAN (Storage Area
Network). That sounds better already.
- RAID connected by Ethernet can be called a NAS (Network Attached
Storage). Latterly the term IP-SAN has been used to add new freshness to
this word-washing.
- RAID connected by both Fibre-channel and Ethernet can be called a NUS
(Network Unified Storage).
- Even better than plain old vanilla NUS,apparently, is
SUS, or Scalable Unified Storage, coined by the short lived startup
Broadband Storage
- More likely to endure than either NUS or SUS, is market research company
Gartner's term FAS
for Fabric Attached Storage which also lumps NAS and SAN together.
- LI-RAID (Layer-Interleaved RAID) is a concept for 3D nand flash
which exploits the different quality of blocks. Revealed in a
research
paper presented June 2018.
- RAIN (Redundant Array of Independent Nodes) is an
Adaptec creation
- MAID (Massive Array of Idle Disks) is a whimsical term from
COPAN Systems used for
disk to disk backup systems.
- RAIGE - (RAID Across Independent Gigabit
Ethernet) is a creation of Pivot3
although some of concepts sound similar to how
Google implements its
internal storage infrastructure.
- RAISE - (Redundant Array of Independent Silicon Elements) is a
term invented by SandForce.
Their SSD controller uses this protection scheme inside
1.8" and
2.5" SSDs.
- RAIC - redundant array of independent chips - was coined in the
1990s by Solid Data
Systems
- DVRAID - is a proprietary RAID technology from
ATTO Technology that is "optimized
for digital content creation environments that require protection in the event
of a disk failure without the performance penalty traditionally seen with parity
RAID."
- SAID - Self-maintaining Array of Identical Disks - a possibly
overambitious term from Atrato
- RAIDn
- is an algorithm launched in 2003 by InoStor. It never achieved wide
currency.
- Finally, a RAID not connected to anything at all
can be called a
LUS (Lonely Unloved Storage)...
No, I just made this one up.
But you can see the basic principle at work here. And no doubt there will be
other terms later for RAID connected by the Internet or Infiniband. See also:- Megabyte's
Storage Glossary which
includes definitions of the many other strange terms which appear from time to
time in these pages. | | |
| ..... |
iSCSI /
NAS /
SAN SATA Raids the
Datacenter (2004) A
Storage Architecture Guide 0 to 60 -
choosing the right RAID What do
enterprise SSD users want? RAID
Levels Outlive Their Usefulness (2001) Where are we now
with SSD software? How fast can your SSD
run backwards? Which
RAID Manufacturers will Survive? (2000) Market Trends in the
Rackmount SSD Market 10 Ten Tips for a
Successful RAID Implementation
RAID
- editor mentions on StorageSearch.com RAID
- what it is, why we need it and how it works (pdf) Using Solid State Disks
to Boost Legacy RAID Performance (2004) |
RAID news |
new thinking in SSD
controller techniques reveals "layer aware" properties exploitable in
3D nand flash
Editor:- August 28, 2018 - A new twist using
RAID ideas in
SSD controllers has
surfaced recently in a research paper -
Improving
3D NAND Flash Memory Lifetime by Tolerating Early Retention Loss and Process
Variation (pdf) by Yixin Luo and Saugata Ghose (Carnegie Mellon
University), Yu Cai (SK Hynix), Erich F. Haratsch (Seagate Technology) and
Onur Mutlu (ETH Zürich) - which was presented at the SIGMETRICS
conference in June 2018.
The authors say that in tall 3D nand (30 layers and upwards) the raw
error rate in blocks in the middle layers are significantly worse (6x) compared
to the top layer. Therefore to enable more
reliable and
faster SSDs using 3D nand for enterprise applications they propose a new type
of RAID which pairs together the best predicted half of a RAID word with the
worst predicted half from another chip in the same SSD.
This new RAID
concept starts to be feasible in a very small population of chips - unlike
traditional 2D nand schemes which need more chips to be installed in the SSD.
The
new RAID is called Layer-Interleaved RAID (LI-RAID) - which the authors
say "improves reliability by changing how pages are grouped under the RAID
error recovery technique. LI-RAID uses information about layer-to-layer process
variation to reduce the likelihood that the RAID recovery of a group could fail
significantly earlier during the flash lifetime than the recovery of other
groups." ...
read the article (pdf)
Editor's comments:- the new RAID is
just one of many gems in this research paper. Others being the discovery that
remanence in 3D nand includes a significant short term charge loss (in the first
few minutes after writes), and also that an endurance based characterization of
a small part of each chip can be used to predict an optimized layer dependent
threshold read voltage for all the layers in the chip. I've
discussed
the significance of adding the concept of "layers" to "number
of raw chips" to the thinking in SSD controller design in my recent
home
page blog.
Seagate will acquire Dot Hill
Editor:- August 18,
2015 - Seagate
today
announced
it will acquire Dot
Hill Systems in an all-cash transaction valued at $9.75 per share, or a
total of approximately $694 million.
SuperCloud rebuilds RAID 20x faster with CoreRise PCIe SSD
Editor:-
July 3, 2015 - CoreRise
today noted some
record breaking performance results from one of its customers -
SuperCloud (a well known Chinese
cloud server manufacture) based on a configuration with
CoreRise's PCIe
SSDs in a 4U server with 2x 56Gbs
InfiniBand ports.
Among other things SuperCloud said its lab results showed that
RAID rebuilding was 20x
faster than without the SSD - using a RAID5 configuration of 6D+1P. While RAID
throughput was 10 to 14GB/s and 1 to 1.5 million 4KB
IOPS.
Nimble video discusses 5 9's in 5,000 systems
Editor:-
February 21, 2015 - Nimble
Storage recently disclosed (in a
sponsored video fronted
by ESG) that its
customer deployed rackmount storage systems are achieving better than
5
9's uptime - 99.999%
availability.
This
has been attained in a field population of 5,000 arrays representing 1,750
years of system run time thanks to a combination of factors including the crowd
sourced intelligence of its
InfoSight
management system which can alert users to potential down time events so they
can take evasive action before bad things happen.
Editor's
comments:- While useful in telling us how many systems Nimble has sold it's
less useful as an indicator of availability given that the average run time
across the population is about 4 months.
It would be more impressive
if they could repeat the disclosure in a few years time and selectively extract
the up-time of systems over different run times, upto 1 year, 1 to 2 years etc.
If indeed Nimble is still in a position to do so, and if it would
still meaningful given the
consolidation
in hardware and software which lies ahead for the enterprise SSD market may
mean that vendors will be using the same hardware.
InnoDisk spins off rackmount SSD business into AccelStor
Editor:-
November 27, 2014 - InnoDisk
today spun off the business unit associated with developing and marketing its
FlexiArray systems (1U InfiniBand / iSCSI 10GbE enterprise flash arrays) into a
new company called AccelStor.
Dell uses Avago's 12Gb/s SAS chips in new RAID systems
Editor:-
September 10, 2014 - Avago Technologies
today
announced
that Dell has
selected Avago's 12Gb/s SAS technology (recently acquired from
LSI) for use in
RAID controllers in Dell's new PowerEdge Servers. See also:-
SAS SSDs,
storage glue chips
rackmount SSDs - new reports from Evaluator Group
Editor:-
September 24, 2014 - Evaluator Group
today
announced
it's expanding its comparison report coverage (priced from around $2,750 for
IT end-users) related to
rackmount SSD and
hybrid array
vendors.
The latest addition to EV's research area are product
analyses for 15 vendors, including:
Cisco,
EMC,
HDS,
HP,
IBM,
Kaminario,
NetApp,
Nimble,
Nimbus,
Pure Storage,
SanDisk,
SolidFire,
Tegile,
Tintri and
Violin.
"Over
the next 3 years Evaluator Group expects Solid State Storage Systems to be the
architecture adopted for primary storage," said Camberley Bates,
Managing Partner & Analyst at Evaluator Group. "Performance to reduce
latency and improve consistency,
along with reliability
and efficiency
functionality
will drive this change. It is important IT end users understand the trade-offs
of design and technical implementation to best suit their needs."
Using
the
Solid
State Evaluation Guide to understand the critical technology characteristics
EV says IT end users can clearly identify their requirements and priorities. The
Solid State Comparison Matrix allows for side-by-side comparison of product
specifications and capabilities. Evaluator Group guides IT end users through the
process with product reviews and expertise on managing and conducting a Proof of
Concept. Evaluator Group Solid State Storage Systems coverage includes products
specifically designed to exploit the characteristics of all solid state
deployment.
What will you be getting?
EV is offering a
free
evaluation copy of their report for the IBM FlashSystem to people who
sign up for it.
Editor's comments:- with so many different
architectural roles
for enterprise SSDs and
different
user preferences - it's unrealistic to suppose that any simple side by side
product comparisons will suit all permutations of user needs. But having said
that - any reliable information which assists
user education and
comprehension into SSD arrays is a good thing.
Some flash array
vendors - realizing the futility of expecting that users will understand what
their products do and how they will interact with the
bottlenecks
and demands of user installations and workloads - have instead opted to
side-step these delay laden hard user selection quandries -
which are exaggerated by
the concerns of getting it wrong - by instead offering new risk
delineated pricing models - as described in my article -
Exiting
the Astrological Age of Enterprise SSD Pricing.
See also:-
storage market research,
90%
of enterprise SSD companies have no good reasons to survive
Old 3ware RAID 10 kills Intel SSD array performance
Editor:-
February 11, 2013 - It's not really news that anything which wastes latency and
raw CPU performance in the host interface and related controllers - also
prevents and wastes the possible performance gain from SSDs. But sometimes
users are the victim of circumstances outside their control.
This
point is vividly demonstrated in the article -
How
slow can SSD be or why is testing a new server performance important? -
which describes the problems someone had when using SSDs in an array with a
slow controller designed for hard drives which was the default offered by
their ISP company.
Amplidata gets $8 million C funding for - (don't call it RAID)
Editor:- February 29, 2012 - apparently you can still raise
VC funding for
HDD based storage array
companies.
Amplidata
today said it got
$8
million in a Series C round of funding for its multi-petabyte scalable,
efficiently data protecting and healing
BitDynamics technology.
The
idea is to implement the cheapest possible and easily expandable bulk
storage capacity - what they call
unbreakable
object storage for petabyte-scale unstructured data (pdf) - so the storage
nodes are stuffed with low power hard drives. But the controller racks can be
internally accelerated by SSDs.
SSDs accelerate disaster recovery
Editor:- September
24, 2010 - an update to Intel's SSD Bookmarks
- published today on StorageSearch.com
- includes links to a case study in which RAID rebuild times for a real-time
education server were reduced from 12 hours to 40 minutes, while response times
became 25x faster. ...read the articles | |
|
|
|
| ..................................................................................................................... |
| |
..... |
 |
Megabyte
found that tying lots of barrels together to cross the data stream worked
well. And if one of them got punctured, the raft didn't sink. | |
| .. |
In 1998
StorageSearch.com's RAID page (this page you're seeing now) listed 51
manufacturers.
Later that grew to hundreds. Although most of them have
gone away.
In
2009 I said that the modern era of SSDs had signed the death warrant for
hardware RAID controllers.
The cutting edge of storage performance hardware controller wizardry
moved from SAN to PCIe
SSDs (and fabrics) then lingered a while in the
memory channel
and everthing will change again due to new
memory
defined software.
See also:-
SSD controllers and IP | | |
| ..... |
 |
| ..... |
| SSDs in
RAID's past and future |
SSDs have been used in
RAID configurations since the
1990s
when all enterprise SSDs were RAM SSDs.
In the early phases of the
enterprise flash market a
seminal
white paper (pdf) (2007) based on lab measurements showed that SSD
IOPS (and
throughput) could
scale
in arrays - using traditional RAID controllers designed for HDDs.
And
that was an approach taken by many storage vendors to ease their way into the
emerging rackmount SSD market.
Disadvantages of that technique,
however, were that system latency was wasted by slow controllers and also
the extra parity writes in fault tolerant RAID schemes worsened the reliability
of the flash arrays by placing greater burdens on
endurance.
In
more modern products - vendors have adopted new approaches for arrays of flash
which take better cognisance of flash's characteristics. For example:-
- big
controller architecture
- lower latency RAID management
- write amplification reduction (by software stacks, new types of RAID
algorithm and use of big front end RAM caches)
| | |
| .. |
 |
| .. |
| "The performance
impact from RAID rebuilds becomes compounded with long rebuild times incurred by
mutli-terabyte drives. Since traditional RAID rebuilds entirely into a new spare
drive, there is a massive bottleneck of the write speed of that single drive
combined with the read bottleneck of the few other drives in the RAID set." |
| Dave Wright,
CEO - SolidFire
- in his blog -
Say farewell
to RAID storage (March 2013) | | |
| .. |
| "When it comes to
hard drives RAID was better than what it replaced in the 1980s but if you
started again with the internet connectivity and processors we have today and
started to design big arrays of disks from first principles then you wouldn't
see just the RAID systems you see today. That's because RAID is small
controller architecture..." |
| Editor:-
in the article - Why
size matters in SSD controller architecture (June 2011) | | |
| ..... |
| "What we need to
do now is recognize that in very large systems, with lots of components
failures are no longer a rare anomalous event - they are a frequent normal
condition - and we need to quote performance and availability and
reliability assuming that there is always some repair going on and we need to
acknowledge that all these redundancy schemes reduce the probability of data
loss they don't eliminate data loss..." |
| Garth Gibson,
founder and chief scientist,
Panasas
Long Live RAID
(video) (May 2012) | | |
| . |
| "The
misapplication of consumer grade SSD technology to enterprise applications was
once a real problem for the reputation of the SSD market. One organization
had installed RAID systems using Intel SSDs in a high performance environment.
About half the SSDs had "burned out" after a year. Worse than that -
when the customer investigated more closely they found that some SSDs had failed
in a way which had not been detected by the RAID controllers." |
| Fusion-io's CEO
talks SSD - with StorageSearch.com (2010) | | |
| .. |
| "...Cached RAID
solutions are starting to run out of gas in high-performance,
transaction-intensive applications: while the performance demands on e-business
infrastructures are being driven to new levels by exploding demand,
unpredictable peak loads, and an increasingly impatient population of on-line
customers. It's time for another architectural innovation." |
| Michael Casey, Solid Data Systems
(November 2000) - in his article -
Solid State
File-Caching for Performance and Scalability | | |
| . |
|
|
| . |
are you looking for an elusive shiny needle
- in an over towering RAID haystack?
My lists of RAID
systems vendors started with just a handful of companies in 1991 and grew to
hundreds of companies.
When lists are that large they are boring. And
RAID is now a commodity. So I've zapped the RAID oems list which used to be on
this page.
Instead - you'll get better results using site search -
with whatever is your favorite search-engine.
| | | |