Reliability of Computer Systems and Networks : Fault Tolerance, Analysis, and Design


Martin L. Shooman
Book English 2003 · Electronic books.
Other title
Published
Hoboken : Wiley, 2003.
Extent
1 online resource (552 p.)
Notes
Description based upon print version of record. - RELIABILITY OF COMPUTER SYSTEMS AND NETWORKS; CONTENTS; Preface; 1 Introduction; 1.1 What is Fault-Tolerant Computing?; 1.2 The Rise of Microelectronics and the Computer; 1.2.1 A Technology Timeline; 1.2.2 Moore's Law of Microprocessor Growth; 1.2.3 Memory Growth; 1.2.4 Digital Electronics in Unexpected Places; 1.3 Reliability and Availability; 1.3.1 Reliability Is Often an Afterthought; 1.3.2 Concepts of Reliability; 1.3.3 Elementary Fault-Tolerant Calculations; 1.3.4 The Meaning of Availability; 1.3.5 Need for High Reliability and Safety in Fault-Tolerant Systems. - 1.4 Organization of the Book; 1.4.1 Introduction; 1.4.2 Coding Techniques; 1.4.3 Redundancy, Spares, and Repairs; 1.4.4 N-Modular Redundancy; 1.4.5 Software Reliability and Recovery Techniques; 1.4.6 Networked Systems Reliability; 1.4.7 Reliability Optimization; 1.4.8 Appendices; General References; References; Problems; 2 Coding Techniques; 2.1 Introduction; 2.2 Basic Principles; 2.2.1 Code Distance; 2.2.2 Check-Bit Generation and Error Detection; 2.3 Parity-Bit Codes; 2.3.1 Applications; 2.3.2 Use of Exclusive OR Gates; 2.3.3 Reduction in Undetected Errors. - 2.3.4 Effect of Coder-Decoder Failures; 2.4 Hamming Codes; 2.4.1 Introduction; 2.4.2 Error-Detection and -Correction Capabilities; 2.4.3 The Hamming SECSED Code; 2.4.4 The Hamming SECDED Code; 2.4.5 Reduction in Undetected Errors; 2.4.6 Effect of Coder-Decoder Failures; 2.4.7 How Coder-Decoder Failures Effect SECSED Codes; 2.5 Error-Detection and Retransmission Codes; 2.5.1 Introduction; 2.5.2 Reliability of a SECSED Code; 2.5.3 Reliability of a Retransmitted Code; 2.6 Burst Error-Correction Codes; 2.6.1 Introduction; 2.6.2 Error Detection; 2.6.3 Error Correction; 2.7 Reed-Solomon Codes. 
- 2.7.1 Introduction; 2.7.2 Block Structure; 2.7.3 Interleaving; 2.7.4 Improvement from the RS Code; 2.7.5 Effect of RS Coder-Decoder Failures; 2.8 Other Codes; References; Problems; 3 Redundancy, Spares, and Repairs; 3.1 Introduction; 3.2 Apportionment; 3.3 System Versus Component Redundancy; 3.4 Approximate Reliability Functions; 3.4.1 Exponential Expansions; 3.4.2 System Hazard Function; 3.4.3 Mean Time to Failure; 3.5 Parallel Redundancy; 3.5.1 Independent Failures; 3.5.2 Dependent and Common Mode Effects; 3.6 An r-out-of-n Structure; 3.7 Standby Systems; 3.7.1 Introduction. - 3.7.2 Success Probabilities for a Standby System; 3.7.3 Comparison of Parallel and Standby Systems; 3.8 Repairable Systems; 3.8.1 Introduction; 3.8.2 Reliability of a Two-Element System with Repair; 3.8.3 MTTF for Various Systems with Repair; 3.8.4 The Effect of Coverage on System Reliability; 3.8.5 Availability Models; 3.9 RAID Systems Reliability; 3.9.1 Introduction; 3.9.2 RAID Level 0; 3.9.3 RAID Level 1; 3.9.4 RAID Level 2; 3.9.5 RAID Levels 3, 4, and 5; 3.9.6 RAID Level 6; 3.10 Typical Commercial Fault-Tolerant Systems: Tandem and Stratus; 3.10.1 Tandem Systems; 3.10.2 Stratus Systems; 3.10.3 Clusters. - With computers becoming embedded as controllers in everything from network servers to the routing of subway schedules to NASA missions, there is a critical need to ensure that systems continue to function even when a component fails. In this book, bestselling author Martin Shooman draws on his expertise in reliability engineering and software engineering to provide a complete and authoritative look at fault tolerant computing. He clearly explains all fundamentals, including how to use redundant elements in system design to ensure the reliability of computer systems and networks. Market: Sys
Subjects
Genre
Dewey
ISBN
0471293423

Libraries that hold this item