Intelligent Systems for proactive Maintenance

Analyzing maintenance log data to predict system failures

Cyber-Physical Systems (CPS) are often very complex and require tight interaction between hardware and software. Like almost any software system, a CPS generates different kinds of logs of the activities performed, including correct operations, warnings, and errors. Frequently, these logs are specific to individual subsystems and are generated independently. Such logs contain a wealth of information that can be extracted and analyzed in different ways to understand how a single subsystem behaves, and even to retrieve information about the behavior of the overall system. In particular, considering the generated logs, it is possible to:

  1. Analyze the behavior of a single subsystem by looking at the data it generates, independently of the others;
  2. Analyze the overall behavior of the system by looking at the correlations among the data generated by the different subsystems.
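
As a minimal sketch of the second kind of analysis, the snippet below computes the Pearson correlation between the per-interval error counts of two subsystems. The subsystem names and the counts are made-up illustration data, not measurements from a real system, and Pearson correlation is only one of many possible measures.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)

# Hypothetical error counts per 10-minute interval for two subsystems.
hydraulics_errors = [0, 1, 0, 3, 5, 4, 0, 1]
controller_errors = [0, 0, 1, 2, 6, 3, 0, 0]

r = pearson(hydraulics_errors, controller_errors)
print(f"correlation between subsystems: {r:.2f}")
```

A value close to 1 would suggest that errors in the two subsystems tend to occur together, which is a hint worth investigating rather than proof of a causal link.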

Such data are very useful for understanding the behavior of a system and are often used for post-mortem analysis after a failure. However, they can also support a more comprehensive understanding of how the system behaves, through real-time analysis that continuously monitors the different subsystems and their interactions. In particular, it is possible to focus on preventing failures through predictive maintenance triggered by specific analyses.

Predicting system failures by analyzing log files is possible, but the quality of such predictions depends strictly on some characteristics of the files. The most important are: data generation frequency, information details, and history.

The data generation frequency needs to match the prediction time and the time required to take proper action. For example, if we need to detect a failure and take proper action within a few minutes, we need data generated at a higher frequency (e.g., on the scale of seconds) and cannot use data generated at a lower one (e.g., on the scale of hours). This requirement affects both the ability to make predictions and their usefulness for implementing proper maintenance actions.
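
The relationship between frequency and reaction time can be made concrete with a small check. This is a sketch under a simplifying assumption introduced here, not a rule from the text: a model needs some minimum number of samples (the hypothetical `min_samples`) within the time that remains after reserving the reaction time.

```python
def frequency_is_sufficient(sample_interval_s, prediction_horizon_s,
                            reaction_time_s, min_samples=10):
    """Return True if enough samples arrive before action must start.

    The time available for observing the system is the prediction horizon
    minus the time needed to react. min_samples is an assumed, tunable
    lower bound on how many readings a model needs in that window.
    """
    if sample_interval_s <= 0:
        raise ValueError("sample_interval_s must be positive")
    observation_window_s = prediction_horizon_s - reaction_time_s
    if observation_window_s <= 0:
        return False
    return observation_window_s / sample_interval_s >= min_samples

# A failure must be predicted 5 minutes ahead; reacting takes 2 minutes.
print(frequency_is_sufficient(1, 300, 120))     # one sample per second
print(frequency_is_sufficient(3600, 300, 120))  # one sample per hour
```

With these illustrative numbers, per-second data passes the check and hourly data does not.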

The information provided needs to have proper granularity and meaningful messages. In particular, it is important to get detailed information about errors, warnings, operations performed, status of the system, etc. The specific details required are tightly connected to the specific predictions that are needed. Moreover, the finer the granularity of the information, the higher the chances of being able to create a proper prediction model.

A high-quality data history is required to build proper prediction models. However, just having a large dataset is not enough. Historical data need to be representative of the operating environment and include all the cases that may occur during operations. In particular, both the log entries and the actual behavior of the system must be recorded in order to create a reliable model of reality.
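
One way to combine log entries with the recorded behavior of the system is to cut the log into time windows and label each window according to whether a failure followed shortly after. The sketch below is only an illustration: the function name, the window and horizon lengths, and the timestamps are all assumptions.

```python
def label_windows(log_times, failure_times, window_s, horizon_s):
    """Build (window_start, entry_count, failed_soon) training rows.

    A window is labeled True when a recorded failure starts within
    horizon_s seconds after the window ends.
    """
    if not log_times:
        return []
    rows = []
    start = min(log_times)
    end = max(log_times)
    while start <= end:
        stop = start + window_s
        count = sum(1 for t in log_times if start <= t < stop)
        failed_soon = any(stop <= f < stop + horizon_s for f in failure_times)
        rows.append((start, count, failed_soon))
        start = stop
    return rows

# Hypothetical timestamps in seconds since the start of a recording.
log_times = [0, 5, 12, 61, 62, 63, 64, 130]
failure_times = [125]

for row in label_windows(log_times, failure_times, window_s=60, horizon_s=60):
    print(row)
```

Rows labeled True are the examples from which a model could learn what the log looks like shortly before a failure.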

The requirements described are just a first step towards the definition of a proper predictive maintenance model, but they are essential. Moreover, the approaches and algorithms need to be selected based on the specific system and its operating conditions.

Reference Architecture of the Portuguese MANTIS Pilot

Introduction

The MANTIS Steel Bending Machine pilot aims to enable the use case owner, ADIRA, to provide a worldwide remote maintenance service to its customers. The main goal is to improve its services by making new maintenance capabilities available at reduced cost, reducing response time, avoiding rework, and allowing better planning of maintenance activities.

To this purpose, existing ADIRA machines (starting with the high-end model, the Greenbender) will be augmented with extra sensors; their data, together with information collected from existing sensors, will be sent to the cloud to be analyzed. The results of the analysis will be presented to machine operators or maintainers through an HMI.

Figure: Adira Greenbender GB-22040

A number of partners are involved in the development and testing of the modules, which cover the communication middleware (ISEP, UNINOVA), data processing and analytics activities (INESC, ISEP), and the HMI applications (ISEP), together with a stakeholder providing a machine to be enhanced with the MANTIS innovations (ADIRA).

System Architecture

The distributed system being built follows a reference architecture composed of a number of modules grouped into four logical blocks: the Machine under analysis, the Data Analysis module, the Visualization module, and the Middleware supporting inter-module communication.

Figure: Architecture of the maintenance system for MANTIS

Machine

Data regarding the machine under analysis are collected by sensors integrated with the machine itself. This logical block thus consists of the data sources that will be used for failure detection, prognosis and diagnosis. These data sources comprise an ERP (Enterprise Resource Planning) system, the machine’s Computer Numerical Controller (CNC), and the safety programmable logic controller (PLC).

Middleware

This logical block operates through two basic modules. The first is the MANTIS Embedded PC, an application that can run on a low-cost computer (such as a Raspberry Pi) or directly on the CNC (if it is powerful enough). Implemented as a communication API, this module is responsible for collecting the data from the CNC I/O and transmitting it to the Data Analysis engine for processing. When running on an external computer, it also connects to the new wireless MANTIS sensors placed on the machine using the Bluetooth Low Energy (BLE) protocol. Communications are supported by the RabbitMQ message-oriented middleware, which takes care of routing messages between peers and handles both the AMQP and MQTT protocols for communication between nodes.

The I/O module extracts raw information from the machine sensors. This information is collected by the existing PLC, made available on the Windows-based numerical controller through shared memory, and then written to files. Our software collects sensor data from these files, thus completely isolating the MANTIS applications from the numerical controller’s application and from the PLC.
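
Reading sensor data from files can be sketched as follows. The actual file layout used in the pilot is not described here, so the line format (`<unix time>;<sensor name>;<value>`) and the sensor names are purely hypothetical.

```python
import io

def parse_sensor_lines(lines):
    """Yield (timestamp, sensor, value) tuples from text lines.

    Assumed line format (hypothetical): "<unix time>;<sensor name>;<value>".
    Blank or malformed lines are skipped instead of stopping the reader.
    """
    for line in lines:
        parts = line.strip().split(";")
        if len(parts) != 3:
            continue
        try:
            yield float(parts[0]), parts[1], float(parts[2])
        except ValueError:
            continue

# A file-like object stands in for a file written by the controller.
sample = io.StringIO(
    "1500000000;oil_temp;41.5\n\nbroken line\n1500000001;pressure;182\n"
)
readings = list(parse_sensor_lines(sample))
print(readings)
```

Because the parser only needs an iterable of lines, the same function would work on an object returned by `open()` for a real file.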

Data Analysis

This logical block takes care of Data Analysis and Prediction, and it relies on three main modules. The first is a set of prediction models used for the detection, prognosis and diagnosis of machine failures. The second is an API that allows clients to request predictions from the models, and that can follow different paradigms, such as REST or message queues. The third is a basic ETL (Extraction, Transformation and Loading) subsystem that is responsible for acquiring, preparing and recording the data that will be used for model generation, selection and testing. This last module also processes the data in analytics requests, because the transformations applied during model generation are required for prediction as well.
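
The point about reusing the same transformations can be illustrated with a small sketch. Standardization is used here only as an example of a transformation; the class name and the data are assumptions, not the pilot's actual ETL code.

```python
class Standardizer:
    """Scale each feature to zero mean and unit variance."""

    def fit(self, rows):
        n = len(rows)
        cols = list(zip(*rows))
        self.means = [sum(c) / n for c in cols]
        # Fall back to 1.0 for constant columns to avoid dividing by zero.
        self.stds = [
            (sum((v - m) ** 2 for v in c) / n) ** 0.5 or 1.0
            for c, m in zip(cols, self.means)
        ]
        return self

    def transform(self, rows):
        return [
            [(v - m) / s for v, m, s in zip(row, self.means, self.stds)]
            for row in rows
        ]

# Model generation: fit the transformation on historical data.
history = [[10.0, 200.0], [12.0, 220.0], [14.0, 240.0]]
scaler = Standardizer().fit(history)
training_features = scaler.transform(history)

# Prediction: the request goes through the same fitted transformation.
request_features = scaler.transform([[12.0, 260.0]])
print(request_features)
```

Keeping the fitted parameters in one object means a prediction request is scaled with the statistics of the training data, not with its own.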

Visualization

This logical block consists of two modules, the human machine interface (HMI) and the Intelligent Maintenance DSS. The HMI is designed to be a web-based mobile application, accessible over the network from any computer or tablet. It works in two different modes, depending on the type of user accessing it: the data analyst or the maintenance manager. Both modes allow the user to analyze the machine’s status and to record failure and diagnostics-related data. In addition, the data analysis HMI lets the data analyst consult and analyze data and results, while the maintenance management HMI allows for consulting predicted events and suggested maintenance actions.

The second module is an Intelligent Maintenance DSS, which relies on a Knowledge Base that draws on diagnosis and prediction models and on the data sent by sensors. On top of this Knowledge Base there is a rule-based Reasoning Engine that includes all the rules necessary to deduce new knowledge that helps the maintenance crew diagnose failures.
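
How a rule-based engine deduces new knowledge can be sketched with a tiny forward-chaining loop. This is a generic illustration, not the pilot's Reasoning Engine; the rules and fact names are invented.

```python
def forward_chain(facts, rules):
    """Apply rules until no new facts can be deduced.

    Each rule is a pair (set_of_required_facts, deduced_fact).
    """
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in rules:
            if conditions <= known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known

# Hypothetical rules; real ones would come from maintenance experts.
rules = [
    ({"oil_temp_high", "pressure_low"}, "suspect_pump_wear"),
    ({"suspect_pump_wear", "predicted_failure_soon"}, "schedule_pump_inspection"),
]
facts = {"oil_temp_high", "pressure_low", "predicted_failure_soon"}

deduced = forward_chain(facts, rules) - facts
print(sorted(deduced))
```

The second rule fires only because the first one added a new fact, which is the sense in which such an engine deduces knowledge that was not directly observed.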

Ongoing work

The work is well advanced, and an integration event will take place in the near future, at which the interconnection between all systems will be tested and validated.

The demonstrator being built will be evaluated according to the following criteria: prediction model performance (live data sets will be compared to model generation test sets) and application usability (the user should be able to access the required information easily, in order to facilitate failure detection and diagnosis).

Fast prototyping of service robot behavior for a cleaning and tidying task in maintenance

The MANTIS project is concerned with predictive maintenance on the basis of big data streams from large (industrial) operations. At the end of the processing pipeline, the result will be planning suggestions for maintenance actions. Usually, maintenance is performed by human operators.

However, with current developments in machine learning, AI and robotics, it becomes interesting to see what type of ‘corrective actions’ in maintenance could be performed by industrial service robots.

In industrial production lines it is common to observe fairly short times between failures, especially in long chains. Whereas individual components are often designed to function extremely well, for instance under a regime of ‘zero-defect manufacturing’, the performance of the line as a whole may be disappointing. What is more, the actions performed by human operators to solve the problems may be very mundane and simple, such as removing dirt due to fouling or lubricating critical components. With the current advances in robot hardware and software technology, it becomes increasingly attractive to automate such maintenance actions. Whereas maintenance in the form of module or part replacement is too difficult for current state-of-the-art robotics, cleaning and tidying are definitely possible.
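
The gap between component and line performance noted above can be illustrated with a small calculation. It rests on two simplifying assumptions made for this sketch: components fail independently, and a single failed component stops the whole line. The 99% figure is an arbitrary example value.

```python
def line_reliability(component_reliability, n_components):
    """Probability that all components of a serial line work at once.

    Assumes independent failures and that one failed component stops
    the whole line (both are simplifying assumptions).
    """
    return component_reliability ** n_components

for n in (1, 10, 50):
    print(n, round(line_reliability(0.99, n), 3))
```

Under these assumptions, a chain of 50 components that each work 99% of the time is fully operational only about 60% of the time.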

With this application domain in mind, a laboratory setup was designed for quickly developing and demonstrating a robotic maintenance task. The work was done by a team of master students (Francesco Bidoia, Rik Timmers, Marc Groefsema) under the guidance of a PhD student (Amir Shantia). We were able to rapidly configure our existing mobile robot platform to perform simple cleaning and tidying actions, similar to what is needed in basic industrial maintenance tasks. The demonstration involves speech control, navigational autonomy, work piece approach and dynamic reactivity to three object types, using tool switching. Objects are considered to be either a) untouchable, b) removable by hand, or c) small fragments (cf. ‘dirt’) that need to be brushed away. The student team developed a full demonstration in three weeks, using a single-armed mobile robot that had been designed earlier for RoboCup@Home tasks.
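
The reaction to the three object types can be sketched as a simple lookup from object type to tool and action. The tool and action names below are hypothetical, since the demonstration's actual ROS interfaces are not described in the text.

```python
from enum import Enum

class ObjectType(Enum):
    UNTOUCHABLE = "untouchable"
    REMOVABLE = "removable"
    FRAGMENTS = "fragments"

# Hypothetical tool and action names chosen for this illustration.
ACTIONS = {
    ObjectType.UNTOUCHABLE: (None, "leave in place"),
    ObjectType.REMOVABLE: ("gripper", "pick up and remove"),
    ObjectType.FRAGMENTS: ("brush", "brush away"),
}

def plan_action(object_type, current_tool):
    """Return (tool_to_mount, action, tool_switch_needed) for an object."""
    tool, action = ACTIONS[object_type]
    # Untouchable objects need no tool, so the mounted tool is kept.
    if tool is None:
        return current_tool, action, False
    return tool, action, tool != current_tool

print(plan_action(ObjectType.FRAGMENTS, current_tool="gripper"))
```

A tool switch is requested only when the object calls for a tool other than the one currently mounted.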

The robot in our demonstration uses the light-weight carbon-fiber arm by Kinova (http://www.kinovarobotics.com/), a self-made transport base, standard Kinect sensors (for generating 3D point clouds) and digital cameras for vision. Programming was done in the ROS environment, with a pre-existing code base in C++ and Python. Clearly, a similar but sturdier industry-level system can be constructed using existing commercial mobile platforms such as KUKA (http://www.kukarobotics.com/en/products/mobility/KMR_iiwa/), MIR (http://mobile-industrial-robots.com/en/multimedia-2/videos/) and Universal Robots (https://www.universal-robots.com/).


Deep learning for predictive maintenance

There are two extreme approaches to predicting failures for predictive maintenance. The white box approach relies on manually constructed physical and mechanical models for predicting the failures. The black box approach, on the other hand, relies on failure prediction models constructed using statistical and machine learning methods based on the data gathered from a running system. The figure below illustrates such data-driven failure prediction for a machine monitored by three sensors.

Figure: Data-driven failure prediction

Machine learning algorithms are used to identify failure patterns in the sensor data that precede a machine failure. When such patterns are observed in operation, an alarm can be triggered so that corrective action is taken to prevent or mitigate the imminent failure. For example, failure predictions can be used to optimize maintenance actions, such as scheduling the service engineers or managing the spare parts storage, to reduce the downtime cost.
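
The idea of raising an alarm when a pattern appears can be sketched as follows. Here a hand-written rule (the rolling mean of a sensor exceeding a threshold) stands in for a learned failure pattern; the readings, the threshold and the window size are invented for illustration.

```python
from collections import deque

def first_alarm(readings, threshold, window=3):
    """Return the index at which the rolling mean first exceeds threshold.

    Returns None if no alarm is raised. The rolling mean over the last
    `window` readings smooths out single noisy values.
    """
    recent = deque(maxlen=window)
    for i, value in enumerate(readings):
        recent.append(value)
        if len(recent) == window and sum(recent) / window > threshold:
            return i
    return None

# Hypothetical vibration readings: one isolated spike, then a real drift.
vibration = [1.0, 1.1, 0.9, 1.0, 3.0, 1.1, 1.2, 1.8, 2.4, 3.0]
print(first_alarm(vibration, threshold=2.0))
```

The isolated spike in the middle of the series does not trigger the alarm, while the sustained rise at the end does.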

Automatic feature extraction

An important part of modeling a failure predictor is selecting or constructing the right features, i.e. selecting existing features from the data set, or constructing derivative features, which are most suitable for solving the learning task.
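
Constructing derivative features can be as simple as summarizing a window of raw readings. The three features below are common choices picked for illustration, not the features used in the project, and the readings are made up.

```python
def window_features(values):
    """Derive summary features from one window of raw sensor readings."""
    n = len(values)
    mean = sum(values) / n
    return {
        "mean": mean,
        "range": max(values) - min(values),
        # Average change per step; a crude stand-in for a fitted slope.
        "trend": (values[-1] - values[0]) / (n - 1) if n > 1 else 0.0,
    }

# Hypothetical temperature readings in one window.
print(window_features([40.0, 42.0, 41.0, 46.0]))
```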

Traditionally, the features are selected manually, relying on the experience of process engineers who understand the physical and mechanical processes in the analyzed system. Unfortunately, manual feature selection suffers from different kinds of bias and is very labor intensive. Moreover, the selected features are specific to a particular learning task, and cannot be easily reused in a different task (e.g. the features which are effective for predicting failures in one production line will not necessarily be effective in a different line).

Deep learning techniques investigated in the MANTIS project offer an alternative to manual feature selection. Deep learning refers to a branch of machine learning based on algorithms that automatically extract, from the raw data, the abstract features most suitable for solving a particular learning task. Predictive maintenance can benefit from such automatic feature extraction to reduce the effort, cost and delay associated with extracting good features.