April-June 2004 (Vol. 1, No. 2), pp. 109-127
Reflections on Industry Trends and Experimental Research in Dependability

DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/TDSC.2004.20

Abstract
Experimental research in dependability has evolved over the past 30 years, accompanied by dramatic changes in the computing industry. To understand the magnitude and nature of this evolution, this paper analyzes three industrial trends: 1) shifting error sources, 2) explosive complexity, and 3) global volume. Under each of these trends, the paper explores research technologies that apply either to the finished product (the artifact) or to the processes used to produce it. The study provides a framework not only for reflecting on past research but also for projecting future needs.
Additional Information
Index Terms: Experimental research in dependability and security, computing industry trends.

Citation: Daniel P. Siewiorek, Ram Chillarege, Zbigniew T. Kalbarczyk, "Reflections on Industry Trends and Experimental Research in Dependability," IEEE Transactions on Dependable and Secure Computing, vol. 1, no. 2, pp. 109-127, Apr.-June 2004.
