Home
Book Fire Online by http://bookfire.net Prev Page Prev Page
Table of Contents
Back Cover
The Essentials of Computer Organization and Architecture
Preface
To the Instructor
Chapter 1: Introduction
1.2 The Main Components of a Computer
1.3 An Example System - Wading through the Jargon
1.4 Standards Organizations
1.5 Historical Development
1.6 The Computer Level Hierarchy
1.7 The Von Neumann Model
1.8 Non-Von Neumann Models
Chapter Summary
Further Reading
References
Review of Essential Terms and Concepts
Exercises
Chapter 2: Data Representation in Computer Systems
2.2 Positional Numbering Systems
2.3 Decimal To Binary Conversions
2.4 Signed Integer Representation
2.5 Floating-Point Representation
2.6 Character Codes
2.7 Codes For Data Recording And Transmission
2.8 Error Detection And Correction
Chapter Summary
Further Reading
References
Review Of Essential Terms And Concepts
Exercises
Chapter 3: Boolean Algebra and Digital Logic
3.2 Boolean Algebra
3.3 Logic Gates
3.4 Digital Components
3.5 Combinational Circuits
3.6 Sequential Circuits
3.7 Designing Circuits
Chapter Summary
Further Reading
References
Review of Essential Terms and Concepts
Exercises
Focus on Karnaugh Maps
Chapter 4: MARIE : An Introduction to a Simple Computer
4.2 Marie
4.3 Instruction Processing
4.4 A Simple Program
4.5 A Discussion on Assemblers
4.6 Extending Our Instruction Set
4.7 A Discussion on Decoding — Hardwired vs. Microprogrammed Control
4.8 Real World Examples of Computer Architectures
Chapter Summary
Further Reading
References
Review of Essential Terms and Concepts
Exercises
Chapter 5: A Closer Look at Instruction Set Architectures
5.2 Instruction Formats
5.3 Instruction Types
5.4 Addressing
5.5 Instruction-Level Pipelining
5.6 Real-World Examples of ISAs
Chapter Summary
Further Reading
References
Review of Essential Terms and Concepts
Exercises
Chapter 6: Memory
6.2 Types of Memory
6.3 The Memory Hierarchy
6.4 Cache Memory
6.5 Virtual Memory
6.6 A Real-World Example of Memory Management
Chapter Summary
Further Reading
References
Review of Essential Terms and Concepts
Exercises
Chapter 7: Input/Output and Storage Systems
7.2 Amdahl's Law
7.3 I/O Architectures
7.4 Magnetic Disk Technology
7.5 Optical Disks
7.6 Magnetic Tape
7.7 RAID
7.8 Data Compression
Chapter Summary
Further Reading
References
Review of Essential Terms and Concepts
Exercises
Focus on Selected Disk Storage Implementations
Chapter 8: System Software
8.2 Operating Systems
8.3 Protected Environments
8.4 Programming Tools
8.5 Java — All of the Above
8.6 Database Software
8.7 Transaction Managers
Chapter Summary
Further Reading
References
Review of Essential Terms and Concepts
Exercises
Chapter 9: Alternative Architectures
9.2 RISC Machines
9.3 Flynn's Taxonomy
9.4 Parallel and Multiprocessor Architectures
9.5 Alternative Parallel Processing Approaches
Chapter Summary
Further Reading
References
Review of Essential Terms and Concepts
Exercises
Chapter 10: Performance Measurement and Analysis
10.2 The Basic Computer Performance Equation
10.3 Mathematical Preliminaries
10.4 Benchmarking
10.6 Disk Performance
Chapter Summary
Further Reading
References
Review Of Essential Terms And Concepts
Exercises
Chapter 11: Network Organization and Architecture
11.2 Early Business Computer Networks
11.3 Early Academic and Scientific Networks — The Roots and Architecture of the Internet
11.5 Network Protocols II — TCP/IP Network Architecture
11.6 Network Organization
11.7 High-Capacity Digital Links
11.8 A Look at the Internet
Chapter Summary
Further Reading
References
Review of Essential Terms and Concepts
Exercises
Appendix A: Data Structures and the Computer
A.2 Fundamental Structures
A.3 Trees
A.4 Network Graphs
Summary
Further Reading
References
Exercises
Glossary
Glossary Numbers
Glossary A
Glossary B
Glossary C
Glossary D
Glossary E
Glossary F
Glossary G
Glossary H
Glossary I
Glossary J
Glossary K
Glossary L
Glossary M
Glossary N
Glossary O
Glossary P
Glossary Q
Glossary R
Glossary S
Glossary T
Glossary U
Glossary V
Glossary W
Glossary Z
Answers and Hints for Selected Exercises
Chapter 2
Chapter 3
Chapter 4
Chapter 5
Chapter 6
Chapter 7
Chapter 8
Chapter 9
Chapter 10
Chapter 11
Appendix A
Index
Index A
Index B
Index C
Index D
Index E
Index F
Index G
Index H
Index I
Index J
Index K
Index L
Index M
Index N
Index O
Index P
Index Q
Index R
Index S
Index T
Index U
Index V
Index W
Index X
Index Z
List of Figures
List of Tables
List of Code Examples
List of Sidebars
Team LiB
Previous Section Next Section

7.7 RAID

In the 30 years following the introduction of IBM's RAMAC computer, only the largest computers were equipped with disk storage systems. Early disk drives were enormously costly and occupied a large amount of floor space in proportion to their storage capacity. They also required a strictly controlled environment: Too much heat would damage control circuitry, and low humidity caused static buildup that could scramble the magnetic flux polarizations on disk surfaces. Head crashes, or other irrecoverable failures, took an incalculable toll on business, scientific, and academic productivity. A head crash toward the end of the business day meant that all data input had to be redone to the point of the last backup, usually the night before.

Clearly, this situation was unacceptable and promised to grow even worse as everyone became increasingly reliant upon electronic data storage. A permanent remedy was a long time coming. After all, weren't disks as reliable as we could make them? It turns out that making disks more reliable was only part of the solution.

In their 1988 paper "A Case for Redundant Arrays of Inexpensive Disks," David Patterson, Garth Gibson, and Randy Katz of the University of California at Berkeley coined the word RAID. They showed how mainframe disk systems could realize both reliability and performance improvements if they would employ some number of "inexpensive" small disks (such as those used by microcomputers) instead of the single large expensive disks (SLEDs) typical of large systems. Because the term inexpensive is relative and can be misleading, the proper meaning of the acronym is now generally accepted as Redundant Array of Independent Disks.

In their paper, Patterson, Gibson, and Katz defined five types (called levels) of RAID, each having different performance and reliability characteristics. These original levels were numbered 1 through 5. Definitions for RAID levels 0 and 6 were later recognized. Various vendors have invented other levels, which may in the future become standards also. These are usually combinations of the generally accepted RAID levels. In this section, we briefly examine each of the seven RAID levels as well as a couple of hybrid systems that combine different RAID levels to meet particular performance or reliability objectives.

7.7.1 RAID Level 0

RAID Level 0, or RAID-0, places data blocks in stripes across several disk surfaces so that one record occupies sectors on several disk surfaces, as shown in Figure 7.20. This method is also called drive spanning, block interleave data striping, or disk striping. (Striping is simply the segmentation of logically sequential data so that segments are written across multiple physical devices. These segments can be as small as a single bit, as in RAID-0, or blocks of a specific size.)

Click To expand
Figure 7.20: A Record Written Using RAID-0, Block Interleave Data Striping With No Redundancy

Because it offers no redundancy, of all RAID configurations, RAID-0 offers the best performance, particularly if separate controllers and caches are used for each disk. RAID-0 is also very inexpensive. The problem with RAID-0 lies in the fact that the overall reliability of the system is only a fraction of what would be expected with a single disk. Specifically, if the array consists of five disks, each with a design life of 50,000 hours (about six years), the entire system has an expected design life of 50,000 / 5 = 10,000 hours (about 14 months). As the number of disks increases, the probability of failure increases to the point where it approaches certainty. RAID-0 offers no-fault tolerance as there is no redundancy. Therefore, the only advantage offered by RAID-0 is in performance. Its lack of reliability is downright scary. RAID-0 is recommended for non-critical data (or data that changes infrequently and is backed up regularly) that requires high-speed reads and writes, and low cost, and is used in applications such as video or image editing.

7.7.2 RAID Level 1

RAID Level 1, or RAID-1 (also known as disk mirroring), gives the best failure protection of all RAID schemes. Each time data is written it is duplicated onto a second set of drives called a mirror set, or shadow set (as shown in Figure 7.21). This arrangement offers acceptable performance, particularly when the mirror drives are synchronized 180° out of rotation with the primary drives. Although performance on writes is slower than that of RAID-0 (because the data has to be written twice), reads are much faster, because the system can read from the disk arm that happens to be closer to the target sector. This cuts rotational latency in half on reads. RAID-1 is best suited for transaction-oriented, high-availability environments, and other applications requiring high-fault tolerance, such as accounting or payroll.

Click To expand
Figure 7.21: RAID-1, Disk Mirroring Data Drives

7.7.3 RAID Level 2

The main problem with RAID-1 is that it is costly: You need twice as many disks to store a given amount of data. A better way might be to devote one or more disks to storing information about the data on the other disks. RAID-2 defines one of these methods.

RAID-2 takes the idea of data striping to the extreme. Instead of writing data in blocks of arbitrary size, RAID-2 writes one bit per strip (as shown in Figure 7.22). This requires a minimum of eight surfaces just to accommodate the data. Additional drives are used for error-correction information generated using a Hamming code. The number of Hamming code drives needed to correct single-bit errors is proportionate to the log of the number of data drives to be protected. If any one of the drives in the array fails, the Hamming code words can be used to reconstruct the failed drive. (Obviously, the Hamming drive can be reconstructed using the data drives.)

Click To expand
Figure 7.22: RAID-2, Bit Interleave Data Striping with a Hamming Code

Because one bit is written per drive, the entire RAID-2 disk set acts as though it were one large data disk. The total amount of available storage is the sum of the storage capacities of the data drives. All of the drives—including the Hamming drives—must be synchronized exactly, otherwise the data becomes scrambled and the Hamming drives do no good. Hamming code generation is time-consuming; thus RAID-2 is too slow for most commercial implementations. In fact, most hard drives today have built-in CRC error correction. RAID-2, however, forms the theoretical bridge between RAID-1 and RAID-3, both of which are used in the real world.

7.7.4 RAID Level 3

Like RAID-2, RAID-3 stripes (interleaves) data one bit at a time across all of the data drives. Unlike RAID-2, however, RAID-3 uses only one drive to hold a simple parity bit, as shown in Figure 7.23. The parity calculation can be done quickly in hardware using an exclusive OR (XOR) operation on each data bit (shown as bn) as follows (for even parity):

Click To expand
Figure 7.23: RAID-3: Bit Interleave Data Striping with Parity Disk

Parity = b0 XOR b1 XOR b2 XOR b3 XOR b4 XOR b5 XOR b6 XOR b7

Equivalently,

Parity = b0 + b1 + b2 + b3 + b4 + b5 + b6 + b7 (mod 2).

A failed drive can be reconstructed using the same calculation. For example, assume that drive number 6 fails and is replaced. The data on the other seven data drives and the parity drive are used as follows:

b6 = b0 XOR b1 XOR b2 XOR b3 XOR b4 XOR b5 XOR Parity XOR b7

RAID-3 requires the same duplication and synchronization as RAID-2, but is more economical than either RAID-1 or RAID-2 because it uses only one drive for data protection. RAID-3 has been used in some commercial systems over the years, but it is not well suited for transaction-oriented applications. RAID-3 is most useful for environments where large blocks of data would be read or written, such as with image or video processing.

7.7.5 RAID Level 4

RAID-4 is another "theoretical" RAID level (like RAID-2). RAID-4 would offer poor performance if it were implemented as Patterson et al. describe. A RAID-4 array, like RAID-3, consists of a group of data disks and a parity disk. Instead of writing data one bit at a time across all of the drives, RAID-4 writes data in strips of uniform size, creating a stripe across all of the drives, as described in RAID-0. Bits in the data strip are XORed with each other to create the parity strip.

You could think of RAID-4 as being RAID-0 with parity. However, adding parity results in a substantial performance penalty owing to contention with the parity disk. For example, suppose we want to write to Strip 3 of a stripe spanning five drives (four data, one parity), as shown in Figure 7.24. First we must read the data currently occupying Strip 3 as well as the parity strip. The old data is XORed with the new data to give the new parity. The data strip is then written along with the updated parity.

Click To expand
Figure 7.24: RAID-4, Block Interleave Data Striping with One Parity Disk

Imagine what happens if there are write requests waiting while we are twiddling the bits in the parity block, say one write request for Strip 1 and one for Strip 4. If we were using RAID-0 or RAID-1, both of these pending requests could have been serviced concurrently with the write to Strip 3. Thus, the parity drive becomes a bottleneck, robbing the system of all potential performance gains offered by multiple disk systems.

Some writers have suggested that the performance of RAID-4 can be improved if the size of the stripe is optimized with the record size of the data being written. Again, this might be fine for applications (such as voice or video processing) where the data occupy records of uniform size. However, most database applications involve records of widely varying size, making it impossible to find an "optimum" size for any substantial number of records in the database. Because of its expected poor performance, there are no commercial implementations of RAID-4.

7.7.6 RAID Level 5

Most people agree that RAID-4 would offer adequate protection against single-disk failure. The bottleneck caused by the parity drives, however, makes RAID-4 unsuitable for use in environments that require high transaction throughput. Certainly, throughput would be better if we could effect some sort of load balancing, writing parity to several disks instead of just one. This is what RAID-5 is all about. RAID-5 is RAID-4 with the parity disks spread throughout the entire array, as shown in Figure 7.25.

Click To expand
Figure 7.25: RAID-5, Block Interleave Data Striping with Distributed Parity

Because some requests can be serviced concurrently, RAID-5 provides the best read throughput of all of the parity models and gives acceptable throughput on write operations. For example, in Figure 7.25, the array could service a write to drive 4 Strip 6 concurrently with a write to drive 1 Strip 7 because these requests involve different sets of disk arms for both parity and data. However, RAID-5 requires the most complex disk controller of all levels.

Compared with other RAID systems, RAID-5 offers the best protection for the least cost. As such, it has been a commercial success, having the largest installed base of any of the RAID systems. Recommended applications include file and application servers, email and news servers, database servers, and Web servers.

7.7.7 RAID Level 6

Most of the RAID systems just discussed can tolerate at most one disk failure at a time. The trouble is that disk drive failures in large systems tend to come in clusters. There are two reasons for this. First, disk drives manufactured at approximately the same time reach the end of their expected useful life at approximately the same time. So if you are told that your new disk drives have a useful life of about six years, you can expect problems in year six, possibly concurrent failures.

Second, disk drive failures are often caused by a catastrophic event such as a power surge. A power surge hits all the drives at the same instant, the weakest ones failing first, followed closely by the next weakest, and so on. Sequential disk failures like these can extend over days or weeks. If they happen to occur within the Mean Time To Repair (MTTR), including call time and travel time, a second disk could fail before the first one is replaced, thereby rendering the whole array unserviceable and useless.

Systems that require high availability must be able to tolerate more than one concurrent drive failure, particularly if the MTTR is a large number. If an array can be designed to survive the concurrent failure of two drives, we effectively double the MTTR. RAID-1 offers this kind of survivability; in fact, as long as a disk and its mirror aren't both wiped out, a RAID-1 array could survive the loss of half of its disks.

RAID-6 provides an economical answer to the problem of multiple disk failures. It does this by using two sets of error-correction strips for every rank (or horizontal row) of drives. A second level of protection is added with the use of Reed-Soloman error-correcting codes in addition to parity. Having two error-detecting strips per stripe does increase storage costs. If unprotected data could be stored on N drives, adding the protection of RAID-6 requires N + 2 drives. Because of the two-dimensional parity, RAID-6 offers very poor write performance. A RAID-6 configuration is shown in Figure 7.26.

Click To expand
Figure 7.26: RAID-6, Block Interleave Data Striping with Dual Error Protection

Until recently, there were no commercial deployments of RAID-6. There are two reasons for this. First, there is a sizeable overhead penalty involved in generating the Reed-Soloman code. Second, twice as many read/write operations are required to update the error-correcting codes resident on the disk. IBM was first (and only, as of now) to bring RAID-6 to the marketplace with their RAMAC RVA 2 Turbo disk array. The RVA 2 Turbo array eliminates the write penalty of RAID-6 by keeping running "logs" of disk strips within cache memory on the disk controller. The log data permits the array to handle data one stripe at a time, calculating all parity and error codes for the entire stripe before it is written to the disk. Data is never rewritten to the same stripe that it occupied prior to the update. Instead, the formerly occupied stripe is marked as free space, once the updated stripe has been written elsewhere.

7.7.8 Hybrid RAID Systems

Many large systems are not limited to using only one type of RAID. In some cases, it makes sense to balance high-availability with economy. For example, we might want to use RAID-1 to protect the drives that contain our operating system files, whereas RAID-5 is sufficient for data files. RAID-0 would be good enough for "scratch" files used only temporarily during long processing runs and could potentially reduce the execution time of those runs owing to the faster disk access.

Sometimes RAID schemes can be combined to form a "new" kind of RAID. RAID-10 is one such system. It combines the striping of RAID-0 with the mirroring of RAID-1. Although enormously expensive, RAID-10 gives the best possible read performance while providing the best possible availability. Despite their cost, a few RAID-10 systems have been brought to market with some success.

After reading the foregoing sections, it should be clear to you that higher-numbered RAID levels are not necessarily "better" RAID systems. Nevertheless, many people have a natural tendency to think that a higher number of something always indicates something better than a lower number of something. For this reason, an industry association called the RAID Advisory Board (RAB) has recently reorganized and renamed the RAID systems that we have just presented. We have chosen to retain the "Berkeley" nomenclature in this book because it is more widely recognized.


Team LiB
Previous Section Next Section
Linking to Www Google.Com. Host by Book Fire