SAN - Applications
Applications that require the
transfer or movement of large amounts of data are prime candidates for SAN.
SAN on STORAGEsearch.com
What Is SAN?
No officially recognized definition for SAN exists.
The Storage Networking Industry Association (SNIA), created in early 1998,
is now defining SAN boundaries and establishing standards to achieve
interoperability. Until SNIA finishes, I suggest a working definition: a SAN contains at least two
servers with access to a storage pool through an interconnection "fabric"
with at least one hub or switch. SANs also require network technologies with
high scalability, performance, and reliability to combine the robustness and
speed of a traditional storage environment with the connectivity of a network.
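The definition suggested above can be written as a simple check. This is only a sketch: the `Topology` class and its field names are hypothetical, invented here for illustration, and a real inventory would carry far more detail.

```python
from dataclasses import dataclass


@dataclass
class Topology:
    """Hypothetical description of a storage network (illustrative fields only)."""
    servers: int            # servers with access to the shared storage pool
    hubs_or_switches: int   # interconnect devices that form the fabric
    has_storage_pool: bool  # True if the servers share a pool of storage


def meets_suggested_definition(t: Topology) -> bool:
    # At least two servers, a storage pool, and a fabric with at least one hub or switch.
    return t.servers >= 2 and t.has_storage_pool and t.hubs_or_switches >= 1


print(meets_suggested_definition(Topology(servers=2, hubs_or_switches=1, has_storage_pool=True)))  # True
print(meets_suggested_definition(Topology(servers=1, hubs_or_switches=1, has_storage_pool=True)))  # False
```

A single server with direct-attached storage fails the check, which is the point of the definition: sharing through a fabric is what separates a SAN from a conventional storage setup.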
A SAN is a specialized
high-speed network that enables fast, reliable access among servers and external
or independent storage resources. In a SAN, all networked servers share storage
devices as peer resources; they are not the exclusive property of any one
server. You can use a SAN to connect servers to storage, servers to each other,
and storage to storage through hubs, switches, and routers. A SAN carries only
I/O traffic between servers and storage devices; it doesn't carry
general-purpose traffic such as email or other end-user applications. Thus, it
avoids the difficult tradeoffs inherent in using a single network for all
applications.
As the SAN concept develops, it is growing beyond identification with
any one technology. Just as LANs use a diverse mix of technologies, so do SANs.
The SAN mix can include Enterprise Systems Connection (ESCON), Fiber Distributed
Data Interface (FDDI), Asynchronous Transfer Mode (ATM), IBM's Serial Storage
Architecture (SSA), and Fibre Channel. SAN architectures also let you use a
number of underlying protocols, including TCP/IP and variants of SCSI. The most
popular implementation of SAN for open systems is based on SCSI over Fibre Channel.
You can deploy SANs in both homogeneous and heterogeneous
environments. In a heterogeneous environment, a SAN allows different kinds of
servers (Windows NT, UNIX, and OS/390) to share different kinds of storage, such as
mainframe disk, tape, and redundant arrays of inexpensive disks (RAID).
With this shared capacity, organizations can acquire, deploy, and use storage
devices more cost-effectively. Ultimately, on a SAN, any data at any network
location is accessible, often via multiple paths, by any nodes, applications, or
users on the network.
SAN Advantages
The first advantage of a SAN is superior connectivity. In a SAN
environment, every server on the network can address all the storage on the
network, at distances of up to 10 km with Fibre Channel support. In fact, any
storage device, if it has enough intelligence, can talk to any other storage
device on the network. This capability enables the utilization of existing
storage resources that may be scattered throughout an enterprise building or
campus, and encourages broader adoption of remote vaulting, mirroring, and
clustering solutions.
SAN allows you to virtually centralize storage. You can use SAN to
isolate physical storage devices from the virtual presentation that clients see.
This separation provides a high level of flexibility in distributing and
reconfiguring resources. Storage on a SAN is shared, resulting in centralized
management, better utilization of disk and tape resources, and enhanced
enterprise-wide data management and protection.
Other architectures can't match the degree of scalability that SAN
brings. Without affecting ongoing operations, storage capacity can grow,
performance can scale along with the topology, and availability can increase
through redundant systems and data paths.
SAN also greatly improves performance. A large, switched Fibre Channel fabric
can sustain an aggregate bandwidth in the gigabyte-per-second (GBps) range with
very low latency.
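A back-of-the-envelope calculation shows how a switched fabric reaches that range. The sketch assumes roughly 100 MB per second per link in one direction (the approximate payload rate of 1-gigabit Fibre Channel) and a switch that lets every server-to-storage pair transfer at full rate at once. The count of 16 concurrent links is a hypothetical figure, not a measurement.

```python
LINK_MB_PER_S = 100  # approximate one-direction payload rate of a 1-gigabit Fibre Channel link


def aggregate_bandwidth_gb_per_s(concurrent_links: int,
                                 link_mb_per_s: float = LINK_MB_PER_S) -> float:
    """Aggregate rate if every link transfers at full speed simultaneously (an idealized assumption)."""
    return concurrent_links * link_mb_per_s / 1000


# Hypothetical fabric with 16 server-to-storage transfers running at the same time.
print(aggregate_bandwidth_gb_per_s(16))  # 1.6
```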
SAN applications may be horizontal applications (e.g., backup,
archiving, data replication, disaster protection, and data warehousing) or
vertical applications (e.g., online transaction processing (OLTP), enterprise
resource planning (ERP) business applications, electronic commerce,
broadcasting, prepress, medical, and geophysics). SAN is also well suited to
making performance and high availability more scalable and more affordable in
applications such as clustering and data sharing. This article discusses two
major horizontal applications, backup and data sharing, and how they interact
with SAN.
Backup in a SAN Environment
One of the first capabilities that users want when implementing a SAN is
the ability to back up and protect their data through it. They want to offload
heavy backup traffic from the LAN, free system bandwidth for production
operations, and gain the speed and security advantages of centralized management
that SAN offers.
Effectively protecting data on a SAN requires a number of elements.
Many of them are currently in the early stages of implementation. These items
include:
- Centralized management
- Support for sharing removable-media libraries
- LAN-less and server-less backup
- Heterogeneous platform support
- Remote vaulting and mirroring
- Realtime backup
Centralized management: Ideally, a central console would manage all
the logical and physical storage resources of an enterprise network. The console
would automatically collect, correlate, and analyze capacity, configuration,
use, and performance information on all storage resources. The logical resources
monitored would include file systems, directories, files, and
application-specific storage repositories. The physical resources tracked would
include disks, RAID systems, tape libraries, optical jukeboxes, Fibre Channel
components, Network Attached Storage (NAS), and SAN switches and hubs. Nearly
every vendor offers some degree of centralized management. The leaders in this
area are Veritas, Legato, Computer Associates (CA), and IBM.
Support for sharing removable-media libraries: Performing backups often
involves backing up many different servers to locally attached tape drives. One
benefit of SAN and NAS connectivity is the ability to share resources (e.g., a
large tape library) among multiple backup servers. Shared resources enable
administrators to consolidate backups into one tape library.
However, the support must extend beyond simple connectivity to a library and
into management. Managing a library means managing access to the media stored
within it and requires dynamic drive allocation among servers, so the server
that needs a drive most at a given time can get it (e.g., when recovering a
large database). Managing a library involves managing not just backup but any
application that might need access to tape or optical storage. In
many cases, the ability to connect a library to multiple backup servers via the
SAN will justify the expense of automation. In this environment, Hierarchical
Storage Management (HSM) becomes economically desirable. Legato, Veritas, CA,
and Seagate Software are the leaders in developing shared tape-library support.
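Dynamic drive allocation can be pictured as a priority queue in front of the library's drives. The following is a minimal sketch, not any vendor's implementation: the `SharedLibrary` class, the server names, and the numeric priorities are all hypothetical.

```python
import heapq


class SharedLibrary:
    """Hypothetical allocator for the drives of one SAN-attached tape library."""

    def __init__(self, drive_count: int) -> None:
        self.free_drives = list(range(drive_count))
        self.waiting = []    # min-heap of (negated priority, arrival order, server)
        self.assigned = {}   # server name -> drive number
        self._arrivals = 0

    def request(self, server: str, priority: int) -> None:
        # Higher priority means more urgent; arrival order breaks ties.
        heapq.heappush(self.waiting, (-priority, self._arrivals, server))
        self._arrivals += 1
        self._dispatch()

    def release(self, server: str) -> None:
        self.free_drives.append(self.assigned.pop(server))
        self._dispatch()

    def _dispatch(self) -> None:
        # Hand free drives to the most urgent waiting servers first.
        while self.free_drives and self.waiting:
            _, _, server = heapq.heappop(self.waiting)
            self.assigned[server] = self.free_drives.pop(0)


library = SharedLibrary(drive_count=1)
library.request("nightly-backup", priority=1)    # gets the only drive
library.request("file-archive", priority=2)      # waits
library.request("database-restore", priority=9)  # waits, but is the most urgent
library.release("nightly-backup")
print(library.assigned)  # {'database-restore': 0}
```

When the drive frees up, the database restore gets it ahead of the archive job that asked first, which matches the case in the text of the server that needs a drive most.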
LAN-less and server-less backup: Backup data movement is evolving in three
phases. Currently, in the first phase, data moves from the disk
to the server it directly connects to, through the LAN, to another server that,
in turn, transfers data to the tape. In the second phase, SAN lets you perform
backup outside the LAN. Data moves from the disk to the server, which
retransmits it through the SAN to a SAN-connected library. This setup is
sometimes called LAN-less backup. In the third phase, the server initiates the
backup command. Data moves directly from disk to tape through the SAN fabric
without further involving the server or the LAN. This configuration is called
server-less backup. Intelliguard, which Legato recently acquired, has led the
development of server-less backup.
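The three phases differ only in which components the data passes through. A small sketch makes the comparison explicit; the hop names are informal labels taken from the description above, and the `describe` helper is hypothetical.

```python
# Data path for each backup phase, listed hop by hop.
PHASES = {
    "conventional": ["disk", "application server", "LAN", "backup server", "tape"],
    "LAN-less": ["disk", "application server", "SAN", "tape library"],
    "server-less": ["disk", "SAN", "tape"],
}


def describe(path):
    """Report whether a data path crosses the LAN and how many servers relay the data."""
    return {
        "uses_lan": "LAN" in path,
        "servers_in_data_path": sum("server" in hop for hop in path),
    }


for name, path in PHASES.items():
    print(name, describe(path))
# conventional {'uses_lan': True, 'servers_in_data_path': 2}
# LAN-less {'uses_lan': False, 'servers_in_data_path': 1}
# server-less {'uses_lan': False, 'servers_in_data_path': 0}
```

In the server-less case a server still issues the backup command, but it no longer sits in the data path.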
Heterogeneous platform support: Early SAN implementations are generally
homogeneous. As SAN environments mature, they will become more heterogeneous.
Effective SAN management software will need to be able to manage any vendor's
server communicating with any vendor's storage, hosting any database,
application, or file system, backing up to any tape drive or library, through
any switch, hub, router, or bridge. EMC and Veritas are examples of vendors
supporting heterogeneous platforms.
Remote vaulting and mirroring: The connectivity distances that
Fibre Channel allows (10 to 20 km, depending on usage) make it easier to deploy
remote sites for business continuance and disaster recovery purposes. Use of
remote backup, remote vaulting, and remote mirroring techniques is likely to
increase due to this capability. SANs can also connect to WANs to achieve
additional levels of connectivity and protection. CommVault is one of the
vendors offering remote vaulting capability. CNT offers a SAN-to-WAN solution for
SCSI and Enterprise Systems Connection (ESCON) connectivity, and is also
developing support for remote Fibre Channel.
Realtime (or window-less) backup: The importance of window-less
backup (also called hot backup) becomes obvious given the large
volume of data that a centralized SAN backup library must handle. Realtime backup
essentially lets you back up a volume or file periodically and automatically without
affecting normal system operations. The technique commonly used is called a
snapshot: you make a copy of the volume needing backup, and then back up
the copy while the original volume continues to be accessed and modified in
normal operations. Network Integrity leads in development, and EMC and HDS have
implemented solutions in currently available products.
Major providers of total backup
solutions include ADIC, ATL, StorageTek, Hewlett-Packard (HP), Exabyte, and
Overland.
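The snapshot technique described above can be illustrated in a few lines. This sketch takes a full point-in-time copy, as the text describes. The `Volume` class, the block contents, and the `back_up` stand-in are hypothetical, and commercial products may use more space-efficient methods than a full copy.

```python
import copy


class Volume:
    """Hypothetical volume modeled as a mapping of block number to contents."""

    def __init__(self, blocks):
        self.blocks = dict(blocks)

    def snapshot(self):
        # Full point-in-time copy; later writes to the volume do not affect it.
        return copy.deepcopy(self.blocks)


def back_up(snapshot_blocks):
    """Stand-in for writing to tape: returns the contents in block order."""
    return [snapshot_blocks[n] for n in sorted(snapshot_blocks)]


volume = Volume({0: "orders-v1", 1: "customers-v1"})
frozen = volume.snapshot()
volume.blocks[0] = "orders-v2"  # production keeps writing while the backup runs
tape = back_up(frozen)

print(tape)              # ['orders-v1', 'customers-v1']
print(volume.blocks[0])  # orders-v2
```

The tape holds a consistent image from the moment of the snapshot, even though the live volume changed before the backup finished.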
Resource and Data Sharing
In a heterogeneous environment, where platforms are by definition different,
you must distinguish among resource sharing, data copy sharing, and true data
sharing.
Resource sharing: A storage subsystem attached to multiple computer
platforms is divided into partitions, each partition being accessible only to
its owning platform or to a certain number of homogeneous platforms. The
administrator can reassign storage capacity to different platforms as needs
change. One of the benefits of SAN connectivity is the ability to share
resources (e.g., a large tape library) among multiple backup servers. Such
sharing enables administrators to consolidate backups that would otherwise go
from many different servers to locally attached tape drives into one tape library.
Dynamic resource sharing: All storage is available to any connected
host; hosts are allocated storage as they need it. If one host needs the
storage, it can use any or all of the available space. If a host deletes a file,
that space is available to any other host. This dynamic storage sharing operates
automatically and transparently. Dynamic resource sharing means that the systems
administrator doesn't have to partition the storage before storing the data.
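A minimal sketch of dynamic resource sharing treats all storage as one pool that hosts draw from and return to. The `StoragePool` class, the host names, and the capacities are hypothetical, and a real SAN would track blocks or volumes rather than a single gigabyte counter.

```python
class StoragePool:
    """Hypothetical unpartitioned pool shared by every connected host."""

    def __init__(self, capacity_gb: int) -> None:
        self.capacity_gb = capacity_gb
        self.used_by_host = {}  # host name -> gigabytes currently held

    @property
    def free_gb(self) -> int:
        return self.capacity_gb - sum(self.used_by_host.values())

    def allocate(self, host: str, size_gb: int) -> None:
        if size_gb > self.free_gb:
            raise ValueError("not enough free space in the pool")
        self.used_by_host[host] = self.used_by_host.get(host, 0) + size_gb

    def release(self, host: str, size_gb: int) -> None:
        held = self.used_by_host.get(host, 0)
        if size_gb > held:
            raise ValueError("host does not hold that much space")
        self.used_by_host[host] = held - size_gb


pool = StoragePool(capacity_gb=100)
pool.allocate("nt-host", 70)
pool.allocate("unix-host", 30)
pool.release("nt-host", 50)     # nt-host deletes files
pool.allocate("unix-host", 50)  # the freed space is usable by another host right away

print(pool.used_by_host)  # {'nt-host': 20, 'unix-host': 80}
print(pool.free_gb)       # 0
```

No partition boundaries appear anywhere in the sketch, so no host is limited to a pre-assigned share of the pool.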
Data copy sharing: This process involves replication of the data. Data
is the same across copies at the time of copy creation, but the copies can
change independently afterward. There is no assurance that they will remain
identical. Data access is usually prevented during replication so the copy
accurately reflects all the data at a particular time. For large amounts of
data, the time needed to copy it may be significant, and the amount of storage
necessary to store the copy could be very large. SAN facilitates data-copy
sharing by allowing high-bandwidth connections to transfer large volumes of
data.
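A rough calculation shows why bandwidth matters for copy sharing. The figures are assumptions for illustration only: 1,000 GB of data, a sustained 100 MB per second (roughly one 1-gigabit Fibre Channel link), and a hypothetical slower path at 10 MB per second for comparison.

```python
def copy_time_hours(data_gb: float, throughput_mb_per_s: float) -> float:
    """Hours needed to copy data_gb at a sustained throughput (decimal units: 1 GB = 1000 MB)."""
    seconds = data_gb * 1000 / throughput_mb_per_s
    return seconds / 3600


print(round(copy_time_hours(1000, 100), 1))  # 2.8
print(round(copy_time_hours(1000, 10), 1))   # 27.8
```

The copy also needs as much storage as the original, so the 1,000 GB in this example becomes 2,000 GB once the replica exists.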
True data sharing: If you are sharing data without making a copy,
multiple computer platforms can access the same physical instance of the
recorded data on a storage subsystem. This type of sharing is called true data
sharing. Different levels of performance and complexity exist in implementing
true data sharing:
- The first level is when heterogeneous platforms can access data, but only the
original data owner can modify it.
- The second level is when multiple heterogeneous platforms can update and
rewrite a data item, but only one at a time. In this case, you must use a
locking mechanism to momentarily prevent other platforms from updating the data.
- The third level is called concurrent data sharing and exists when all
platforms can read or update the data at the same time.
The advantages of true data sharing are numerous. With only one
copy of data, you never need to replicate the data for use elsewhere, you
simplify data maintenance, and you eliminate problems due to out-of-sync
conditions. True data sharing among platforms running heterogeneous operating
systems requires translating to one common operating system (see File management
discussion under SAN Management Software on page XX). Examples of vendors
offering implementations of true data sharing in a SAN architecture are Sequent,
Mercury Computer Systems, DataDirect, Transoft, Retrieve, and Network Disk. In a
NAS architecture, NetApp, EMC, Sun, IBM, and Procom offer true data sharing
solutions.
...Peripheral Concepts profile, SAN
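The second level of true data sharing, where several platforms may update one data item but only one at a time, depends on a lock. The sketch below uses threads inside one Python process as stand-ins for platforms. The `SharedRecord` class is hypothetical, and a real SAN would need a lock manager that works across hosts rather than an in-process lock.

```python
import threading


class SharedRecord:
    """Hypothetical single copy of a data item that several platforms update."""

    def __init__(self, value: int) -> None:
        self.value = value
        self._lock = threading.Lock()

    def read(self) -> int:
        return self.value

    def update(self, change) -> None:
        # Only one updater may hold the lock; the others wait momentarily.
        with self._lock:
            self.value = change(self.value)


record = SharedRecord(0)


def platform_worker() -> None:
    for _ in range(1000):
        record.update(lambda v: v + 1)


workers = [threading.Thread(target=platform_worker) for _ in range(4)]
for w in workers:
    w.start()
for w in workers:
    w.join()

print(record.read())  # 4000
```

Because every read-modify-write happens under the lock, no update is lost and the single copy stays consistent for all four updaters.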