Rugged & Reliable Data Storage flash SSDs for military apps



Rugged & Reliable Data Storage: flash SSDs overview

Editor's intro:- Not all solid state disks are created equal, and it's important to understand the wear-out and stress factors which may limit the operational life of these products when used in rugged environments. However, when used correctly they may typically outlast a hard drive by a factor of about 3x. In smaller storage capacities the SSD takes up less space, is lighter, uses less power, and runs faster and quieter... and may in fact cost less than the alternative.

by Ofer Tsur, Marketing Manager SSD Business Unit, M-Systems

(this classic article was published here in May 2002)

Reliable data storage is a major concern for engineers designing military systems such as radar and sonar systems, servers, data recorders, tactical computers, moving maps, fire-control systems, airborne reconnaissance systems and others, all of which operate under harsh environmental conditions in tracked and wheeled vehicles, aircraft and ships. Rotating mechanical disks are not suitable as a rugged and reliable data storage solution for military applications. Mechanical disks do provide very large storage capacity (more than 70GB) at a low price (less than $150), but they have reliability limitations.

F-SSDs go where winchester disks fear to tread

Solid-state flash disks operate over the industrial temperature range of -40°C to +85°C, with a storage temperature range of -55°C to +95°C.

The rotating hard disk mechanism is based on spinning platters and head arms which read and write information from and to the platters, and this mechanical design affects data integrity under harsh environmental conditions. The operating temperature range of mechanical disks is limited to +5°C to +55°C, so they cannot be used in extreme temperature conditions.

Most military applications require an operating temperature range of -40°C to +85°C, known as the industrial temperature range. Another drawback is the low level of shock and vibration that mechanical disks can withstand while operating. Mechanical disks tolerate a maximum operating shock of 125G for 2.5" IDE/ATA disks (common in laptops) and up to 65G for SCSI disks (common in servers and desktops). The operating vibration limit of rotating mechanical disks is up to 1G. Neither the shock nor the vibration figures of mechanical disks meet the MIL-STD requirements for tracked and wheeled vehicles, aircraft and ships. To overcome the limited reliability of mechanical disks, other solutions have been introduced to the marketplace.

Ruggedizing rotating mechanical disks

Ruggedized mechanical disk solutions are based on sealing a rotating mechanical disk in a rigid cartridge. The sealed aluminum housing encloses the entire disk mechanism, including the electronics, protecting it from high humidity and changes in altitude. Advanced sealed cartridges include an embedded closed-loop servo system, which automatically compensates for temperature variation, ensuring reliable head positioning over the entire operating temperature range. Ruggedized mechanical disk solutions provide the high capacity available in mechanical disks (70GB) with fast sustained read/write rates of up to 40MB/s.

The cost of ruggedizing mechanical disks ranges from hundreds of dollars up to thousands of dollars, depending on the degree of ruggedization required. For high capacity disks of more than 30GB it can be a cost-effective solution, but for smaller capacities the cost of ruggedization is too high per MB/GB. A sealed cartridge improves the shock, vibration and temperature figures of the mechanical disk, but additional factors need to be considered. Sealing a rotating disk doubles or even triples the size of the unit, and the sealed cartridge adds considerable weight. Larger unit size and extra weight are tough constraints in airborne applications such as helicopters and fighters, where every additional pound and every additional square millimeter carries a high cost premium.

Solid State Flash Disk has no moving parts, which enables it to comply with MIL-STD 810 for shock and vibration.

Solid-State Flash Disk solution

Solid State Disks (SSDs) have no moving parts, which eliminates seek time, latency and other electro-mechanical delays inherent in conventional disk drives. Based on flash technology, solid-state flash disks are becoming a common form of data storage in military and airborne systems, telecommunication infrastructure and factory automation systems, as they offer significantly higher reliability than traditional mechanical disks and are maintenance free.

Flash is a non-volatile memory, in contrast to volatile memories such as DRAM, SRAM and SDRAM. Like a mechanical disk, non-volatile flash memory retains its data when power is off. Flash memory is becoming an ideal solution for replacing mechanical disks when reliability is a key requirement. To provide a true "drop-in replacement" for mechanical disks, solid-state flash disks (F-SSDs) have the same dimensions, the same mounting holes and the same interfaces as mechanical disks.

Today F-SSD solutions are most common in two form factors: 2.5" (laptop size disk) with an IDE/ATA interface, and 3.5" (desktop size disk) with Narrow SCSI and Wide SCSI interfaces.

As space can be a constraint in several military applications, F-SSDs are offered in trimmed casings: the 2.5" F-SSD is available down to 8mm case height and the 3.5" SCSI/Wide SCSI F-SSD is available down to 17mm case height. The space inside the flash disk determines the maximum capacity that can be provided. The maximum capacity that fits in today's 8mm high 2.5" form factor is 1GB, while more than 4GB fits in the standard 17mm high 2.5" casing. 10GB is the maximum capacity that fits today in a 3.5" case with a standard 1-inch case height. The flash industry has its own version of Moore's law, in which the capacity of flash components within the same footprint doubles every 12 months. This makes it possible to double the capacity of the F-SSD every year in the same case size, and also reduces flash costs.

The added value of Solid-State Flash Disk

Solid State Disks (SSDs) have no moving parts, and can therefore provide data integrity under harsh environmental conditions. Solid State Flash Disks (F-SSDs) can operate under shock of 1500G and vibration of 16G, complying with MIL-STD 810C&E. F-SSDs also have no altitude limitations, and are being deployed in fighters and helicopters. F-SSDs operate in extreme humidity conditions of 5% to 95%, ideal for Navy shipboard conditions. The wide operating temperature range is one of the major added values of the F-SSD, which can operate over the industrial range of -40°C to +85°C with a storage temperature range of -55°C to +95°C.

Maintaining military systems can be very costly, especially in airborne, shipboard and remote installations. In addition, the maintenance process may take systems out of operation, which cannot be tolerated in mission-critical applications. Solid-state flash disks provide up to 10 times the MTBF of mechanical disks, with an MTBF of more than 1 million hours. Solid-state flash disks provide data retention of more than 10 years vs. the 3 years of mechanical hard disks, which enables M-Systems, a flash disk manufacturer, to provide a 5-year warranty as standard.
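To put the MTBF figures above in perspective, the following minimal sketch converts an MTBF into an approximate annual failure rate. It assumes a constant failure rate (an exponential lifetime model) and continuous operation, neither of which is stated in the article. The 100,000 hour figure for mechanical disks is only implied by the "up to 10 times" comparison, and the function name is illustrative.

```python
import math

HOURS_PER_YEAR = 8760  # 24 hours * 365 days of continuous operation


def annual_failure_rate(mtbf_hours: float) -> float:
    """Probability that a unit fails within one year of continuous use,
    assuming a constant failure rate (exponential lifetime model)."""
    return 1.0 - math.exp(-HOURS_PER_YEAR / mtbf_hours)


# 1,000,000 h is the F-SSD figure quoted in the article; 100,000 h is the
# mechanical disk figure implied by "up to 10 times" (an assumption).
for label, mtbf in [("F-SSD", 1_000_000), ("mechanical disk", 100_000)]:
    print(f"{label}: {annual_failure_rate(mtbf):.2%} per year")
```

Under these assumptions the difference is roughly one failure per hundred units per year against roughly one per twelve, which is why MTBF matters so much for remote installations.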

Power consumption can be a major constraint in applications that are not connected to an external power supply. For example, portable devices such as tactical computers, laptops and portable communication equipment run on batteries. Unlike mechanical disks, which consume a lot of power to spin up the motor that drives the platters, solid-state flash disks have no moving parts. Consequently the power consumption of solid-state disks is about one fourth that of rotating hard disks.

Securing confidential data is essential, as the damage that can be caused when it falls into the "wrong hands" is devastating. Deleting files from a mechanical disk does not actually erase the data: only the File Allocation Table (FAT) is updated, and the data still resides on the disk. To truly erase the data, mechanical disks must be formatted, which may take one hour or more... time that may be much too long during an emergency. In addition, even a formatted disk can be restored and data can eventually be retrieved. Some solid-state flash disk designs enable users to erase the entire disk in typically 5 seconds. This special feature, known as Fast/Quick Security Erase, can be triggered via a software interrupt or a hardware interrupt and is one of the solid-state flash disk's most attractive added values.

The breakthrough in Solid-State Flash Disks

Limited capacity and very high flash costs held designers back from using solid-state flash disks in the past. Flash manufacturers such as Toshiba and Samsung have improved their processes over the past 3 years, using less silicon, reducing costs and increasing flash capacity.

Solid-state flash disks now support the Ultra Wide SCSI interface with an 80-pin SCA-2 connector or a 68-pin connector, with sustained read and write rates of more than 20MB/s. The 80-pin SCA connector also supports hot swapping.

In 1999 F-SSDs sold at $5 per MB ($5,000 for a 1GB F-SSD), in 2000 at $3 per MB ($3,000 for 1GB) and in 2001 at $2 per MB ($2,000 for 1GB). This price trend continues: today the cost of an F-SSD is less than $1.50 per MB, and the future continues to look bright.

In the last year there has been a breakthrough in solid-state flash disk performance. Until 2000 only slow F-SSDs were available, with Narrow SCSI interfaces providing sustained read/write rates of up to 3MB/s. In 2001 Ultra Wide SCSI solid-state flash disks were introduced, providing sustained read/write rates of more than 20MB/s. This enabled users to store video on solid-state flash disks, as video applications require a sustained read/write rate of at least 6MB/s. As NAND flash costs continue to head downwards, and manufacturers of flash based solid-state disks continue to introduce higher capacities and faster devices, it is only a matter of time until the F-SSD is resident in every mission-critical application that requires reliable data storage.

Choosing the right Solid-State Flash Disk

Beware, choosing the right F-SSD solution isn't that simple! There's much more to it than just choosing the right storage interface, disk capacity and performance. NAND flash technology has inherent technology limitations and as mission-critical applications require top reliability, F-SSD must be accompanied by special mechanisms to overcome these potential problems.

Flash is not directly re-writable, meaning that a data bit must first be erased before it can be written again. Flash is erased in blocks (a typical block size is 4 to 64 Kbytes), which are much larger than disk sectors (512 bytes). In addition, flash has a limited number of erase cycles, about 300,000 depending on the process. Flash has no such limit on read operations.
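The mismatch between erase blocks and disk sectors can be illustrated with a little arithmetic. The sketch below uses the block sizes, sector size and erase-cycle limit quoted above. The update rate of 10 sector updates per second is a purely hypothetical workload, and the calculation assumes a naive controller that erases the same block on every sector update.

```python
SECTOR_BYTES = 512            # disk sector size quoted in the article
ERASE_CYCLE_LIMIT = 300_000   # approximate erase-cycle limit quoted in the article


def sectors_per_block(block_bytes: int) -> int:
    """Number of 512-byte sectors that share one erase block."""
    return block_bytes // SECTOR_BYTES


def hours_until_block_wears_out(updates_per_second: float) -> float:
    """Hours until a single block reaches its erase limit if every sector
    update forces an erase of that same block (naive, no wear leveling)."""
    return ERASE_CYCLE_LIMIT / updates_per_second / 3600


for block_kb in (4, 64):
    print(f"{block_kb} KB block holds {sectors_per_block(block_kb * 1024)} sectors")

# Hypothetical workload: 10 updates per second to the same disk location.
print(f"block worn out after {hours_until_block_wears_out(10):.1f} hours")
```

With these numbers a single heavily rewritten location could wear out within hours, which is why the endurance mechanisms discussed in this article matter.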

...Later:- see also the March 2007 article- SSD Myths and Legends - "write endurance"

Flash manufacturers such as Toshiba and Samsung continue to improve their processes by using less silicon to reduce costs. NAND flash media may accumulate up to 2% bad blocks during manufacturing, and additional bad blocks during operational use. Flash manufacturers guarantee less than 0.04% of bad blocks out of the total available. The growth in the number of accumulated bad blocks over time is also a factor. Flash accumulates bad blocks during write/erase operations as electrons are trapped in an oxide layer and create internal electrical stress due to the voltage difference between different blocks inside the flash chip. The probability of a stress failure caused by this voltage difference increases when a single block is erased and written over and over again while the rest of the blocks are untouched. The stress caused by uneven erase/write activity among the flash blocks increases the number of bad blocks accumulated.

Enhancing Solid-State Flash Disk Endurance

Some F-SSD manufacturers incorporate methods to enhance SSD endurance. The most common technique is known as "Erase before write" or "Counter wear leveling". This method implements a counter for every flash block, counting the number of erase/write cycles. Every time a new write operation is executed, the block where the data will be stored is erased first and its counter is updated. When a counter reaches the flash erase cycle limit, the block is marked as a bad block. As a result the data in that block can still be read an unlimited number of times, but the block cannot be used for additional write operations.

When the application tries to write to a block which has already reached its erase cycle limit, the new data is stored in a different block taken from a pool of spare blocks, and a pointer is assigned to the location of the new data. The "erase before write" algorithm wears out the flash intensively over time, especially as the whole erasable block is erased every time, even though only part of the data in that block is being updated. If the application forces write operations to the same disk location over and over again, those blocks will eventually reach the erase cycle limit, and over time the total disk capacity available for write operations will decrease. More advanced methods of enhancing F-SSD endurance and overcoming these flash limitations are known by the names TrueFFS® (True Flash File System), "virtual mapping", "Dynamic Wear Leveling" and "Garbage Collection". M-Systems uses these techniques in its Fast Flash Disks (FFD).
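The simpler "erase before write" scheme described above can be modeled in a few lines. This is a toy sketch, not any manufacturer's implementation: the class name `CounterWearFlash`, the one-payload-per-block layout and the tiny erase limit used in the usage example are all assumptions made for illustration.

```python
class CounterWearFlash:
    """Toy model of 'erase before write' with one erase counter per block."""

    def __init__(self, num_blocks: int, num_spares: int, erase_limit: int):
        total = num_blocks + num_spares
        self.erase_limit = erase_limit
        self.erase_count = [0] * total
        self.data = [None] * total
        self.bad = [False] * total
        # Pointer table: logical block -> physical block currently holding it.
        self.pointer = list(range(num_blocks))
        self.spares = list(range(num_blocks, total))

    def write(self, logical: int, payload) -> None:
        phys = self.pointer[logical]
        if self.erase_count[phys] >= self.erase_limit:
            # Block has reached its limit: mark it bad and redirect to a spare.
            self.bad[phys] = True
            if not self.spares:
                raise RuntimeError("no spare blocks left: writable capacity reduced")
            phys = self.spares.pop(0)
            self.pointer[logical] = phys
        # Erase first (counted), then write the new data.
        self.erase_count[phys] += 1
        self.data[phys] = payload

    def read(self, logical: int):
        return self.data[self.pointer[logical]]


# Usage: hammer one logical block; the second block is never touched.
flash = CounterWearFlash(num_blocks=2, num_spares=1, erase_limit=3)
for i in range(6):
    flash.write(0, f"rev-{i}")
print(flash.erase_count, flash.bad)
```

The usage example shows the weakness the article describes: one location consumes its own block and then the spare, while the neighbouring block keeps an erase count of zero.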

The "Dynamic Wear Leveling" algorithm ensures that all flash components in the disk are used at the same level of erase cycles. It eliminates situations where the application repeatedly writes to the same physical location until the flash blocks there wear out and the capacity available for write operations is reduced over time. Because all flash blocks are erased a similar number of times, the capacity available for write operations is unchanged. The TrueFFS® Dynamic Wear Leveling algorithm is performed by a dynamic virtual mapping of logical sectors to physical blocks, transparently to the user's application.
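The idea of mapping logical sectors to physical blocks dynamically can be sketched as follows. This is not TrueFFS, whose internals are not described here. It is a minimal illustration that assumes one sector per block, at least one more physical block than logical sectors, and a hypothetical class name `DynamicWearLeveler`.

```python
class DynamicWearLeveler:
    """Toy wear leveler: every write goes to the least-erased free block."""

    def __init__(self, num_physical: int):
        self.erase_count = [0] * num_physical
        self.free = set(range(num_physical))
        self.mapping = {}  # logical sector -> physical block (virtual mapping)
        self.store = {}    # physical block -> stored data

    def write(self, logical: int, payload) -> None:
        # Pick the free block with the fewest erases (ties broken by index).
        target = min(self.free, key=lambda b: (self.erase_count[b], b))
        self.free.remove(target)
        self.store[target] = payload
        old = self.mapping.get(logical)
        self.mapping[logical] = target
        if old is not None:
            # The previous copy is obsolete: erase it and recycle the block.
            del self.store[old]
            self.erase_count[old] += 1
            self.free.add(old)

    def read(self, logical: int):
        return self.store[self.mapping[logical]]


# Usage: 400 writes to the same logical sector are spread over 4 blocks.
disk = DynamicWearLeveler(num_physical=4)
for i in range(400):
    disk.write(0, i)
print(disk.erase_count)
```

Although the application writes to a single logical location, the erase counts of the physical blocks stay within one cycle of each other, which is the behaviour the paragraph above attributes to dynamic wear leveling.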

The "Garbage Collection" process eliminates the need to erase the whole block prior to every write. It accumulates data marked for erasure as "garbage" and performs a whole-block erase only to reclaim space so that the block can be reused. Once a block reaches its limit of erase cycles, or indicates that a problem has been found, the embedded "Bad Block Mapping-Out" (BBM) algorithm marks the block as a "bad block"; TrueFFS® stops using that block and replaces it with a spare block. Larger pools of spare blocks (reaching up to 4% of the F-SSD capacity) are used to replace bad blocks and thereby increase disk endurance. Together, the TrueFFS® "Dynamic Wear Leveling", "Garbage Collection" and "Bad Block Mapping-Out" algorithms optimize flash usage with a minimum of erase cycles, enhancing F-SSD endurance while keeping the media size available for write operations from decreasing over time.
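A minimal sketch of the garbage collection idea follows. It is a simplified model under stated assumptions, not the TrueFFS implementation: each block holds a few sector-sized pages, live data from a reclaimed block is held in RAM and rewritten into the same block after the erase, and bad block mapping is left out. All names are hypothetical.

```python
class GarbageCollectedFlash:
    """Toy model: updates go to free pages; whole blocks are erased only
    when space must be reclaimed."""

    FREE, LIVE, GARBAGE = "free", "live", "garbage"

    def __init__(self, num_blocks: int, pages_per_block: int):
        self.pages_per_block = pages_per_block
        self.state = [[self.FREE] * pages_per_block for _ in range(num_blocks)]
        self.content = [[None] * pages_per_block for _ in range(num_blocks)]
        self.erase_count = [0] * num_blocks
        self.mapping = {}  # logical sector -> (block, page)

    def _find_free_page(self):
        for b, pages in enumerate(self.state):
            for p, s in enumerate(pages):
                if s == self.FREE:
                    return b, p
        return None

    def _collect(self) -> None:
        # Reclaim the block holding the most garbage pages.
        victim = max(range(len(self.state)),
                     key=lambda b: self.state[b].count(self.GARBAGE))
        if self.GARBAGE not in self.state[victim]:
            raise RuntimeError("disk full: nothing to reclaim")
        live = [(lg, self.content[b][p])
                for lg, (b, p) in self.mapping.items() if b == victim]
        # One whole-block erase frees every page in the victim block.
        self.state[victim] = [self.FREE] * self.pages_per_block
        self.content[victim] = [None] * self.pages_per_block
        self.erase_count[victim] += 1
        # Put the still-valid data back and fix up the mapping.
        for page, (lg, payload) in enumerate(live):
            self.state[victim][page] = self.LIVE
            self.content[victim][page] = payload
            self.mapping[lg] = (victim, page)

    def write(self, logical: int, payload) -> None:
        slot = self._find_free_page()
        if slot is None:
            self._collect()
            slot = self._find_free_page()
        block, page = slot
        self.state[block][page] = self.LIVE
        self.content[block][page] = payload
        old = self.mapping.get(logical)
        self.mapping[logical] = (block, page)
        if old is not None:
            # The previous copy is now garbage; no erase is needed yet.
            self.state[old[0]][old[1]] = self.GARBAGE

    def read(self, logical: int):
        block, page = self.mapping[logical]
        return self.content[block][page]
```

In this model, 100 updates of one sector on a disk with two 4-page blocks cost far fewer than 100 block erases, whereas the "erase before write" approach would need one erase per update.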

Improving Solid-State Flash Disk reliability

Some F-SSD manufacturers use DRAM/SRAM data buffers in their designs to increase disk performance. As DRAM/SRAM is volatile memory, a power loss while the disk is being written, or while data resides in the cache, may cause an incomplete write sequence. (Although that risk can be eliminated in a good system design - editor.) A DRAM/SRAM cache buffer also causes disk performance to decline when the cache buffer is full (during write operations) and when the data does not reside in the cache (during read operations). An F-SSD with no volatile data caching is more reliable under unstable power conditions and power failures, and provides sustained read/write rates that are unaffected by cache status.

Power cycling may cause data corruption even if no volatile data caching is used. It is important to verify that the F-SSD mechanism does not tolerate "in-between" states, in which a power failure occurs when only part of the data has been written to the disk but the disk mapping indicates otherwise. However, data reliability can be preserved during unstable power conditions using the following scheme. During a disk write operation, the disk controller should verify that the new data has been stored, by a transparent internal read operation, and only then update the flash mapping information. If the block was not completely transferred to the disk media during the write operation, the F-SSD must not update the mapping as "block successfully transferred". The mapping must reflect the correct status of the disk's write operations at all times.

Error Detection Code (EDC) and Error Correction Code (ECC) are used in mechanical disks and SSDs to detect and correct errors that occur during read/write operations. In general the EDC/ECC algorithms required by mechanical disks are more powerful than the ones required by SSDs, as the probability of error with magnetic media is much greater than with flash media due to their design. The most common EDC/ECC algorithm incorporated in F-SSDs is Reed-Solomon, implemented in hardware and software using 24 bits, 32 bits or 48 bits per sector. For example, the 48-bit Reed-Solomon algorithm provides a read bit error rate of 10^-14.

(Editor's note:- Since this article was written, one manufacturer, AMD, has also announced error correction at the raw flash bit level. In AMD's MirrorBit cell, code or data is stored in two discrete and independent locations. By physically separating each bit and maintaining its individual integrity, AMD's MirrorBit devices are inherently more stable and reliable than competing multi-level cell (MLC) devices. This is transparent to any overlying error detection scheme.)

Some F-SSD manufacturers provide hybrid designs, incorporating several Compact Flash (CF) units or PC Card (PCMCIA ATA card) units to compose a flash solid-state disk. The hybrid design is less expensive, as it enables manufacturers to use the most common CF and PC Card (PCC) units available in the marketplace at very attractive prices. Some hybrid F-SSD designs may cause reliability problems under shock and vibration, as the hybrid F-SSD is a LEGO type product based on several CF/PCC sub-units. As CF and PCC support the ATA/IDE protocol, a SCSI F-SSD based on CF/PCC sub-units needs protocol conversion. This IDE to SCSI conversion may lower disk performance.

As F-SSDs are designed to operate in mission-critical systems for many years, and taking out the disk to check its status is unacceptable, remote monitoring of the internal status is needed. One example of a remote monitoring feature is "SMART" (Self-Monitoring, Analysis and Reporting Technology). When the "SMART" software command is issued, the disk performs internal monitoring tests and reports back the latest results, indicating the status of the disk. The "SMART" command is common in mechanical ATA/IDE disks, where it tests, among other things, the mechanical disk rotation. As the F-SSD has no moving parts, but does accumulate bad blocks over time, some F-SSD manufacturers have used the "SMART" command to analyse the bad block status of the F-SSD. The total number of bad blocks accumulated since the F-SSD was manufactured, relative to the total disk capacity, can be returned to the user as status information. Tracking the accumulation of bad blocks over time gives the user an indication of the F-SSD's reliability and expected life span in that system.
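The kind of status figure described above can be computed as in the short sketch below. It does not reproduce the actual SMART attribute format of any drive. The function name, the returned fields and the block counts in the usage example are hypothetical, and the default spare pool of 4% is taken from the upper figure quoted earlier in this article.

```python
def bad_block_report(bad_blocks: int, total_blocks: int, spare_percent: int = 4) -> dict:
    """Summarise bad block accumulation relative to total capacity and
    to the size of the spare block pool (hypothetical report layout)."""
    spare_blocks = total_blocks * spare_percent // 100
    return {
        "bad_percent": 100.0 * bad_blocks / total_blocks,
        "spares_remaining": max(spare_blocks - bad_blocks, 0),
        "spares_used_percent": 100.0 * min(bad_blocks, spare_blocks) / spare_blocks,
    }


# Hypothetical disk with 10,000 blocks, of which 100 have gone bad.
print(bad_block_report(bad_blocks=100, total_blocks=10_000))
```

Polling such a figure at intervals and watching how quickly the spare pool is consumed gives a rough indication of remaining life without removing the disk from the system.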

Summary

Mechanical disks continue to be the bottleneck in total system reliability under harsh environmental conditions. Solid-state flash disks are designed as a true "drop-in replacement" data storage solution for mission-critical applications. They provide data integrity under harsh conditions of extreme shock, vibration and humidity, and operate over the industrial temperature range. Although ruggedizing mechanical disks can improve their environmental figures, this comes at the expense of larger unit size, extra weight and increased manufacturing cost. Solid-state flash disk technology, on the other hand, provides a smaller casing with minimal power consumption, and its Fast/Quick Security Erase feature enables the entire disk to be erased in 5 seconds. Cost is no longer a major barrier to using F-SSDs in mission-critical applications, as flash prices have declined dramatically in the last two years and experts expect the decline to continue. Higher capacities and faster performance will also come from process enhancements which reduce geometries, as with traditional RAM technologies.

But it's the superior reliability which makes the solid-state flash disk today an ideal solution for ground, sea and air military applications.


STORAGEsearch is published by ACSL