System reliability is a fundamental requirement of Cyber-Physical System, i.e., a system featuring a tight combination of, and coordination between, the systems computational and physical elements. Cyber-physical system includes systems ranging from the critical infrastructure such as power grid and transportation system to the health and biomedical devices. An unreliable system often leads to disruption of service, financial cost and even loss of human life. In this work, we aim to improve system reliability for cyber-physical systems that meet following criteria: processing large amount of data; employing software as a system component; running online continuously; having operator-in-the-loop because of human judgment and accountability requirement for safety critical systems. The reason that we limit the system scope to this type of cyber-physical system is that this type of cyber-physical systems are important and becoming more prevalent.
To improve system reliability for this type of cyber-physical systems, we employ a novel system evaluation approach named automated online evaluation. It works in parallel with the cyber- physical system to conduct automated evaluation at the multiple stages along the workflow of the system continuously and provide operator-in-the-loop feedback on reliability improvement. It is an approach whereby data from cyber-physical system is evaluated. For example, abnormal input and output data can be detected and flagged through data quality analysis. As a result, alerts can be sent to the operator-in-the-loop. The operator can then take actions and make changes to the system based on the alerts in order to achieve minimal system downtime and higher system reliability. To implement the approach, we design a system architecture named ARIS (Autonomic Reliability Improvement System).
One technique used by the approach is data quality analysis using computational intelligence that applies computational intelligence in evaluating data quality in some automated and efficient way to ensure data quality and make sure the running system to perform as expected reliably. The computational intelligence is enabled by machine learning, data mining, statistical and probabilistic analysis, and other intelligent techniques. In a cyber-physical system, the data collected from the system, e.g., software bug reports, system status logs and error reports, are stored in some databases. In our approach, these data are analyzed via data mining and other intelligent techniques so that useful information on system reliability including erroneous data and abnormal system state can be concluded. These reliability related information are directed to operators so that proper actions can be taken, sometimes proactively based on the predictive results, to ensure the proper and reliable execution of the system.
Another technique used by the approach is self-tuning that automatically self-manages and self-configures the evaluation system to ensure it adapts itself based on the changes in the system and feedback from the operator. The self-tuning adapts the evaluation system to ensure its proper functioning, which leads to a more robust evaluation system and improved system reliability.
Faculty: Prof. Gail Kaiser
PhD Candidate: Leon Wu
- Leon Wu, Gail Kaiser, David Solomon, Rebecca Winter, Albert Boulanger, and Roger Anderson. Improving Efficiency and Reliability of Building Systems Using Machine Learning and Automated Online Evaluation. In the Eighth Annual IEEE Long Island Systems, Applications and Technology Conference (LISAT), May 2012.
- Rebecca Winter, David Solomon, Albert Boulanger, Leon Wu, and Roger Anderson. Using Support Vector Machine to Forecast Energy Usage of a Manhattan Skyscraper. In New York Academy of Science Sixth Annual Machine Learning Symposium, New York, NY, USA, October 2011.
- Leon Wu, Gail Kaiser, Cynthia Rudin, and Roger Anderson. Data Quality Assurance and Performance Measurement of Data Mining for Preventive Maintenance of Power Grid. In Proceedings of the ACM SIGKDD 2011 Workshop on Data Mining for Service and Maintenance, August 2011.
- Leon Wu and Gail Kaiser. Constructing Subtle Concurrency Bugs Using Synchronization-Centric Second-Order Mutation Operators. In Proceedings of the 23th International Conference on Software Engineering and Knowledge Engineering (SEKE), July 2011.
- Leon Wu, Boyi Xie, Gail Kaiser, and Rebecca Passonneau. BugMiner: Software Reliability Analysis Via Data Mining of Bug Reports. In Proceedings of the 23th International Conference on Software Engineering and Knowledge Engineering (SEKE), July 2011.
- Leon Wu, Gail Kaiser, Cynthia Rudin, David Waltz, Roger Anderson, Albert Boulanger, Ansaf Salleb-Aouissi, Haimonti Dutta, and Manoj Pooleery. Evaluating Machine Learning for Improving Power Grid Reliability. In ICML 2011 Workshop on Machine Learning for Global Challenges, July 2011.
- Leon Wu, Timothy Teräväinen, Gail Kaiser, Roger Anderson, Albert Boulanger, and Cynthia Rudin. Estimation of System Reliability Using a Semiparametric Model. In Proceedings of the IEEE EnergyTech 2011 (EnergyTech), May 2011.
- Cynthia Rudin, David Waltz, Roger Anderson, Albert Boulanger, Ansaf Salleb-Aouissi, Maggie Chow, Haimonti Dutta, Phil Gross, Bert Huang, Steve Ierome, Delfina Isaac, Artie Kressner, Rebecca Passonneau, Axinia Radeva, and Leon Wu. Machine Learning for the New York City Power Grid. IEEE Transactions on Pattern Analysis and Machine Intelligence, May 2011.
Available student project positions:
We are not currently actively seeking new students for this project.