Keynote Speaker


 

Rivalino Matias

 

Title:

Facing the Complexity of Detecting Software Aging Effects

 

Abstract:

A core concept in understanding the software aging phenomenon is the notion of aging effects, given that they are the concrete manifestation of the underlying aging mechanisms in software systems. Hence, detecting aging effects is central to studies in this area. Note that before planning the use of a rejuvenation mechanism, one needs to know a priori the type and the dynamics of the aging effects being targeted. For example, in software systems suffering from non-volatile aging effects, a restart or reboot does not offer concrete benefits. As Bill Hewlett once said, “You cannot manage what you cannot measure”; and I would append to it “correctly”. Despite the central importance of aging detection techniques, there have not been many advances in this field. A possible explanation is the high complexity of detecting software aging effects, given the numerous intrinsic and extrinsic factors involved, which make this detection process a challenging task. As a result, empirical studies in software aging and rejuvenation have adopted aging detection approaches that are susceptible to a considerable number of false-positive and false-negative results, which risks compromising their findings. In this talk, I aim to highlight relevant aspects to be considered when facing the complexity of detecting software aging effects. Practical examples are explored throughout the talk to illustrate the important points.

 

Short Biography:

Rivalino Matias, Jr. is currently a tenured associate professor in the Computing School at the Federal University of Uberlandia, Brazil. For the last twenty years, he has been studying the phenomenon of software aging and rejuvenation. His first experience with software aging was in 2000, when he worked in the telecom industry troubleshooting reliability and performance degradation problems in a carrier-grade WAP gateway platform. Since then, he has dedicated himself to understanding and characterizing the phenomenon of software aging from both experimental and theoretical viewpoints. In 2008, he was with the Department of Electrical and Computer Engineering at Duke University, Durham, NC, working as a research associate under the supervision of Dr. Kishor Trivedi. Dr. Matias has served as a member of the steering and organizing committees of several highly ranked international conferences, as well as a reviewer for many prestigious international journals and conferences. He is a founding member of the WoSAR community. His research interests include applied and theoretical software aging and rejuvenation, and dependability of operating systems.

 


 

Prof. Kishor Trivedi

 

Title:

Software Aging and Rejuvenation: A Genesis

 

Abstract:

Chapter 1 of the Handbook on Software Aging and Rejuvenation presents an overview of the genesis of software aging and rejuvenation research. Empirical software engineering research at AT&T Bell Labs in the early 1990s discovered that software performance degradation and an increase in software failure rate could be attributed to a novel phenomenon called software aging. The first paper on the topic was published by researchers from AT&T Bell Labs at FTCS 1995. This paper was based on extensive prior research carried out at AT&T Bell Labs, and it has received the prestigious Jean-Claude Laprie award. Subsequently, considerable research on the topic was carried out at Duke University and the Federico II University of Naples, among others. In this extended abstract, we summarize the most important contributions to software aging and rejuvenation that were made by researchers at AT&T Bell Labs, Duke University, and the Federico II University of Naples.

 

The seminal paper was brought to Prof. Kishor Trivedi's attention in the summer of 1995. At first, Duke researchers relaxed the homogeneous continuous-time Markov chain (CTMC) assumption used for optimal scheduling in the seminal paper. An example of relaxing this assumption is provided in the paper published at ISSRE 1995, which was selected as one of the highlights of the thirty-year celebration of the ISSRE conference. In another direction to extend the model, Prof. Kishor Trivedi and his students considered software completion times while checkpointing and rejuvenation are active. A Markov Decision Process (MDP) was used to model the performance degradation aspect of software aging. Queueing models with Poisson job arrivals, an age-dependent (degrading) service rate, and an age-dependent (increasing) failure rate were initially used to compute optimal scheduling and were extended in several other papers. The original 4-state CTMC model was extended into a semi-Markov model in which all transition times are allowed to be generally distributed. An online bucket algorithm was introduced to detect software aging based on customer-affecting metrics and to recommend when to activate the software rejuvenation routine. This paper received the best paper award at the ACM WOSP 2005 conference.

 

The DSN 2020 Test of Time Award was presented to the DSN 2010 paper by Grottke, Nikora and Trivedi. This work extracted data from problem reports on operational software in order to classify software bugs into Bohrbugs, Mandelbugs and aging-related bugs, together with their mitigations. The failure data analysis was conducted jointly with JPL and, following previous work, was based on problem reports of on-board software failures in NASA satellites.

 

Yet another direction followed at Duke (and jointly with Lucent Bell Labs researchers) was to monitor and collect resource usage data from operational systems to validate the software aging phenomenon. Several papers were published based on the real data collected. One of these papers, published at ISSRE 1998, was selected as one of the highlights of the thirty-year celebration of the ISSRE conference.

 

Steve Hunter of IBM approached Prof. Kishor Trivedi to consider modeling software aging and rejuvenation phenomena for a cluster, which led to the implementation of rejuvenation in IBM’s X-series.

 

The DESSERT group at the Federico II University of Naples has been working on software aging and rejuvenation research for over 15 years. The research focus has been on empirical analysis of software aging using real-world measurements of the Linux kernel and the Java Virtual Machine.

 

Research has also addressed the static features of software, such as those that can be obtained from its source code.

 

Short Biography:

Kishor S. Trivedi holds the Hudson Chair in the Department of Electrical and Computer Engineering at Duke University, Durham, NC. He has been on the Duke faculty since 1975. He is the author of a well-known text entitled Probability and Statistics with Reliability, Queuing and Computer Science Applications, published by Prentice-Hall; a thoroughly revised second edition (including its Indian edition) of this book has been published by John Wiley. He has also published two other books: Performance and Reliability Analysis of Computer Systems, published by Kluwer Academic Publishers, and Queueing Networks and Markov Chains, published by John Wiley.

 

He is a Fellow of the Institute of Electrical and Electronics Engineers. He is a Golden Core Member of the IEEE Computer Society. He has published over 420 articles and has supervised 42 Ph.D. dissertations. He is on the editorial boards of IEEE Transactions on Dependable and Secure Computing, Journal of Risk and Reliability, International Journal of Performability Engineering, and International Journal of Quality and Safety Engineering.

 

His research interests are in reliability, availability, performance, performability and survivability modeling of computer and communication systems. He works closely with industry in carrying out reliability/availability analysis, in providing short courses on reliability, availability and performability modeling, and in the development and dissemination of software packages such as SHARPE and SPNP.