ECC Technologies, Inc. was founded and incorporated in 1990
to provide ECC (error correction) technology to electronic storage and
communications industries worldwide. The company is located in Minnetonka,
Minnesota, a suburb of the Twin Cities of Minneapolis-St. Paul.
We supply our customers with solutions not only to conventional error
correction needs, but also to the challenging needs of high-speed,
wide-bandwidth, fault-tolerant systems. These solutions can be in the form of
hardware or software designs, all customizable to fit customers' applications.
Other services we provide are assistance with error correction need analysis,
assistance with testing, and support throughout the implementation of a design.
ECC Technologies pioneered byte-parallel Reed-Solomon error correction
(PRS ECC) technology and was granted a patent on it May 19, 1998. The company
also pioneered two PRS RAID schemes, RAID X and RAID Y, which are capable of
emulating all other RAID schemes.
The company's founder, Mr. Philip E. White, has more than 20 years of
experience in successfully applying advanced coding theory concepts to the
design of data storage and transmission products. He began his study of
error-correcting codes in 1973 and has consulted with or received instruction
from leading authorities in the world on the subject, including Dr. Elwyn
Berlekamp, Dr. Robert McEliece, Dr. R.T. Chien, Dr. Edward Weldon and Mr. Neal
Glover. He has more than 25 years of experience as an electronics design
engineer in the research, advanced concepts and product development divisions
of Control Data, Seagate, Unisys and Ciprico. He was the first person in the
disk industry to implement a powerful Reed-Solomon error-correcting code in a
disk controller.
In November 2007 - ECC Tech announced it was offering to license or sell
its US patent related to high reliability flash SSD architecture.
See also:- ECC Technologies - editor mentions on STORAGEsearch.com
Editor's comments:- February 2010 - storage reliability is not the kind of
subject which gets people excited, unlike faster or cheaper products. That's
because - in a fault tolerant design - nothing bad happens.
But leading thinkers in the SSD industry that I've spoken to seem to
agree on one thing... As SSDs get faster - it's even more essential to look at
the design aspects where events in the physical world can cause data errors.
Vendors can lose their markets due to badly or incompetently designed products.
Here's a lesson from server market history.
In 2001 - Sun Microsystems lost its reputation for reliability.
The article Unsafe At Any Speed? - was a contemporary exposé of an easily
avoidable design mistake which affected tens of thousands of high end SPARC
servers. That was a pivotal point for most Sun customers. If they couldn't
trust Sun, and Sun's SPARC servers were slower and more expensive than
alternatives - then why were users risking their own operations by relying on
this flaky outfit? That's when many die-hard SPARC user sites decided that the
bitter taste of migration planning was better than the risk of being poisoned.
There are hundreds of articles about SSDs on StorageSearch.com.
Here, below, are some examples.
- RAM Cache Ratios in flash SSDs - it's important to know the underlying RAM
cache architecture - even if you're happy with the R/W and IOPS performance.
- 2010 - 1st Fizz in the SSD Bubble? - even the dogs in the street know this is
going to be a multibillion dollar market. Greed will play as big a part as
technology in shaping the SSD year ahead.
- the pros and cons of using SSD ASAPs - auto tuning SSD appliances are a new
category of SSD which entered the market in the 2nd half of 2009 to accelerate
servers without needing human tune-ups. How can you tell if they are right for
you? And how well do they work?
- the Problem with Write IOPS - in flash SSDs - long established as a useful
performance modeling metric - this article explains why some specs are
exaggerated when applied to flash SSDs - or predict the wrong results for many
common applications.
Reliable SSDs are Rocket Science
Editor:- February 15, 2010 - NASA's Solar Dynamics Observatory, launched last
week, uses an SSD error correction architecture designed by ECC Technologies.
Phil White, inventor of this scheme, says - "You can think of the SDO spacecraft
as containing a parallel-transfer, fault-tolerant SSD that uses DRAM chips
instead of NAND Flash chips. It uses exactly the same PRS ECC that I have
proposed for use in solid state disks. All of the data collected by SDO is
encoded by the PRS encoder, stored in the SSD and decoded by the PRS decoder.
Multiple DRAM chips can fail with no loss of data or performance."
Here are some more "SSDs in space" links.
Error Correction in MLC Flash SSD Arrays
Editor:- October 28, 2009 - ECC Technologies has published a new article which
examines data reliability issues in RAID systems using MLC flash.
In his survey of RAID and error correction related to SSDs the author
Phil White said he thinks that "MLC NAND Flash memories should
implement nonbinary error-correcting codes such as Reed-Solomon (RS) codes so
that all of the bits from one cell are in one symbol. The communications
industry has been doing that for decades, but the Flash industry has been
implementing a scheme that forces the bits from one cell to be in separate
records (pages) so that one cell failure can cause multiple binary symbol
failures which seems illogical."
I asked him to expand on this for our readers.
In reply - Phil said he doesn't think that
most NAND Flash (SSD) companies have a high level of expertise in the field of
error-correcting codes.
"Many of the NAND Flash controllers
that are out in the market now have ECC Tek's ECC designs in them. None of the
controller companies who have come to us have any idea how to implement binary
BCH encoders and decoders in hardware. I doubt if any of the Flash
manufacturers have that expertise either."
"For years the
Flash manufacturers implemented a simple binary scheme that corrected only 1 bit
in a page. I don't have evidence to prove this, but I believe the NAND Flash
manufacturers simply decided to extend their original scheme to correct N bits
instead of 1 bit to handle higher error rate devices. I also believe that
they implemented a scheme for MLC NAND Flash to "randomize" the errors
when a cell fails.
"Consider 4-bits/cell. When a cell fails,
0-4 bits may be in error. In order to keep using binary error-correcting codes
that only correct bits, they designed the chips so that all of the bits from
that cell are in different pages.
"To the best of my knowledge, they never considered using RS
codes so that all of the bits from one cell are in one RS symbol. For example,
assume a RS code with 12-bit symbols. Each RS symbol can hold the data from
three 4-bit cells, and if those three cells happen to fail, it will only corrupt
one RS symbol. RS codes can correct t "symbol" errors and s "symbol"
erasures as long as 2t+s is less than or equal to R where R is the number of "symbols"
of redundancy. The most natural and powerful thing to do is to put all of the
bits from one cell in one RS symbol." ...read the article
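To make Phil's argument a little more concrete, here is a minimal Python sketch of the two layouts he contrasts and of the 2t+s bound he quotes. It is only an illustration under stated assumptions: the function names, the "cell c goes in symbol c // 3" packing and the "bit b of each cell goes in page b" mapping are hypothetical simplifications invented for this example, not the layout of any real flash chip or of ECC Tek's designs.

```python
# Illustrative sketch only. The layouts and names below are assumptions made
# for this example; they are not taken from any real flash chip or ECC Tek design.

BITS_PER_CELL = 4                                  # "Consider 4-bits/cell"
SYMBOL_BITS = 12                                   # "a RS code with 12-bit symbols"
CELLS_PER_SYMBOL = SYMBOL_BITS // BITS_PER_CELL    # three cells fit in one symbol


def rs_symbols_hit(failed_cells):
    """Cell-per-symbol layout: cell c is stored in RS symbol c // CELLS_PER_SYMBOL."""
    return {cell // CELLS_PER_SYMBOL for cell in failed_cells}


def page_bits_at_risk(failed_cells):
    """Bit-per-page layout: bit b of each cell is stored in page b.

    Returns every (page, cell) bit position that a failure could corrupt.
    This is the worst case; the quote notes that 0-4 bits may actually flip.
    """
    return {(page, cell) for cell in failed_cells for page in range(BITS_PER_CELL)}


def rs_can_correct(t, s, r):
    """True when t symbol errors plus s symbol erasures satisfy 2t + s <= R."""
    return 2 * t + s <= r


failed = [0, 1, 2]                       # three neighbouring cells fail
print(len(rs_symbols_hit(failed)))       # corrupted RS symbols
print(len(page_bits_at_risk(failed)))    # bit positions at risk across pages
print(rs_can_correct(t=1, s=0, r=2))     # one symbol error, two redundant symbols
```

In this toy model the three failed cells land in a single RS symbol, which a code with just two symbols of redundancy can correct, while the bit-per-page layout spreads the same failure over up to 12 bit positions in four different pages.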
See also:- Data Integrity Challenges in flash SSD Design - a recently published
article by SandForce.