Limit checking of measured variables in a monitored system is a method frequently used for fault detection. 3E uses it as a first step in its fault detection and diagnosis protocol, to determine at which stage of a photovoltaic plant action needs to be taken before any deeper analysis of the characteristics of the problems. Here, 3E illustrates the methodology applied in its use case.
Photovoltaic (PV) plants are energy conversion systems which convert sunlight into electricity that is fed into the public utility grid. The physical structure and the important process variables measured when monitoring the performance of a PV plant are illustrated in Figure 1. The input variables of the process model are the solar irradiance in the plane of the PV array (GPOA) and the ambient temperature (Tamb). Output variables, from the process model point of view, are the PV module temperature (Tmod); the direct current voltage (VDC) and current (IDC) at the output of the PV array; the alternating current voltage (VAC); the power factor (PF); and the electric AC power to the grid (PAC).
Normalized performance parameters can be derived from the previously mentioned measurements and make it possible to quantify the energy flow and losses through the PV array per loss type. They are:
LA,I = Yr – YA,I
LA,T = YA,I – YA,T
LA,V = YA,T – YA
with LA,I, LA,T, LA,V the conversion losses due to current, temperature and voltage, respectively, and Yr, YA,I, YA,T, YA the normalized energy yields: the reference yield (based on irradiation from the sun), the array yield after current losses, the array yield after temperature losses and the array yield after all array losses, respectively.
The main variables used for limit checking are solar irradiance in the plane of the PV array (GPOA), ambient temperature (Tamb), PV module temperature (Tmod), DC voltage and current at the output of the PV array (VDC, IDC) and electric AC power to the grid (PAC). The AC voltage (VAC) and power factor (PF) are not used for limit checking.
For checking the operational performance over different energy conversion steps, a performance loss ratio per step is defined. This performance loss ratio is computed for a given time span, e.g., a day up to several months. It is the useful energy lost over the energy conversion step divided by the energy available, i.e. the incoming solar energy on the PV array as represented by the solar irradiance in the plane of the PV array (GPOA); all normalized to standard rating conditions of the PV array. Accordingly, the overall performance of a PV plant is described by the performance ratio (PR), i.e., 100% minus the sum of all performance losses.
In practice, we compare the performance loss ratios from measurements to model-based performance loss ratios and thresholds. The model is fed with measured values of GPOA and Tamb. The model parameters can be set from data sheet parameters of the devices in the PV plant or identified from measurements from the plant in a healthy state. Accordingly, adequate limits can be derived either from tolerances on the data sheet parameters or from choosing percentiles from the healthy plant. Both the model-based performance loss ratios and their limit values vary depending on the PV plant and the weather during the evaluation period.
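As an illustration, the comparison logic can be sketched as follows. This is a minimal example; the step names, loss values and tolerance-based limits are assumptions for illustration, not 3E's actual implementation:

```python
# Minimal sketch of limit checking on performance loss ratios.
# Loss ratios are expressed in percent of the reference yield Yr.

def check_loss_ratios(measured: dict, modelled: dict, tolerance: dict) -> dict:
    """Flag every conversion step whose measured loss ratio exceeds
    the model-based expectation plus its tolerance band."""
    alarms = {}
    for step, loss in measured.items():
        limit = modelled[step] + tolerance[step]
        alarms[step] = loss > limit
    return alarms

# Illustrative loss ratios per conversion step, in %.
measured = {"array_current": 9.5, "array_temperature": 2.1, "array_voltage": 1.0}
modelled = {"array_current": 3.0, "array_temperature": 2.0, "array_voltage": 1.2}
tolerance = {"array_current": 1.5, "array_temperature": 1.0, "array_voltage": 1.0}

print(check_loss_ratios(measured, modelled, tolerance))
# {'array_current': True, 'array_temperature': False, 'array_voltage': False}

# Overall performance ratio: 100% minus the sum of all loss ratios.
pr = 100.0 - sum(measured.values())  # 87.4% in this toy example
```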
Figure 2 illustrates this application of limit checking for a PV plant located in Belgium. The current-related array losses ('Array (current)') in Figure 2a far exceed the threshold. During a thorough maintenance action after this problem was detected, several smaller PV module failures were fixed. After the maintenance action, all performance loss ratios were back within their expected ranges, yielding a much higher PR of 82.9% (Figure 2b).
This use case studied the analysis of sensor data from a brake press in order to facilitate its maintenance. Brake forming is the process of deforming a sheet of metal along an axis by pressing it between clamps. A single metal sheet may be subjected to a sequence of bends, resulting in complex metal parts such as electrical lighting posts and metal cabinets.
These machines require very accurate control to ensure the required bending precision, which is in the order of tens of microns. They also have stringent safety requirements that impose certain restrictions on their operation. In addition, production efficiency is an important operational factor.
In order to ensure production quality under these stringent requirements, it is important to make sure that all of the machines' components are in perfect working order. The goal of this use case in the MANTIS project is to use a set of sensors to detect failures and then inform the maintenance staff of these events. In this work we used a top-of-the-line Greenbender model to implement and test a system that could accomplish these goals.
A multi-disciplinary team participated in the research and development of this use case. The use case owner is the machine tool manufacturer ADIRA that sells machines worldwide. ADIRA’s main goal is to improve the maintenance services they provide to their customers.
Research and development in the area of communications was jointly done by ISEP and UNINOVA. This included the IoT architecture, sensors, communications hardware and infrastructure deployment. Data processing and analytics were performed by INESC and ISEP. INESC focused on root cause analysis (RCA), remaining useful life (RUL) forecasting and anomaly detection. ISEP worked on knowledge-based techniques for failure detection by developing and testing a decision support system. In addition, ISEP developed a Human Machine Interface (HMI) application that provides access to the IoT infrastructure and several MANTIS services, including the notification of failures.
JSI and XLAB also provided valuable input and feedback concerning the initial research and design tasks of the communications infrastructure (real time data transmission) and the HMI (usability).
The MANTIS project has provided INESC with the opportunity to research, test and apply machine learning techniques in a real-world setting. Tasks included the detailed study of the machine tools’ processes and components, eliciting requirements and information from the domain experts and evaluating several machine learning algorithms. Due to the many challenges that were faced in identifying, collecting and using sensor data, only anomaly detection is currently being deployed in this use case.
A set of 11 conditions is being continually monitored for anomalies. For each condition, two thresholds are used to identify small and large deviations, respectively, from the expected behavior. Whenever such a deviation is detected, an alert is dispatched to the HMI, where the users are notified. These monitoring conditions should allow ADIRA to detect failures in the hydraulic system, the numeric controller and several electric components. In addition, oil temperature and machine vibrations are also monitored.
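A minimal sketch of such a two-threshold scheme is shown below; the condition name and threshold values are illustrative assumptions, not ADIRA's deployed configuration:

```python
def classify_deviation(value: float, expected: float,
                       warn_threshold: float, alarm_threshold: float) -> str:
    """Map the deviation from the expected behavior to an alert level."""
    deviation = abs(value - expected)
    if deviation > alarm_threshold:
        return "large_deviation"   # dispatched to the HMI as an alarm
    if deviation > warn_threshold:
        return "small_deviation"   # dispatched to the HMI as a warning
    return "normal"

# Example: a monitored oil temperature reading (illustrative values, in degrees C).
print(classify_deviation(value=68.0, expected=55.0,
                         warn_threshold=5.0, alarm_threshold=10.0))
# -> 'large_deviation'
```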
The MANTIS system, which includes INESC's analytics module, has been deployed as a set of services in the cloud. Initial tests show good false-positive rates. We are now in the process of performing on-line evaluations of the detection rates. We are confident that these results will serve as an important first step for ADIRA to enhance its products by using more sophisticated and effective data analytics methods.
Liebherr participates in the MANTIS project as an industrial partner with its hydraulic excavators division. Liebherr's main expertise consists in developing and optimizing excavators, taking different information sources into consideration. After delivery to the customer, however, every excavator automatically generates event and message data, which so far has mainly been used for fault diagnostics and not extensively for further investigation.
The event data logger records, among other things:
the timestamp at which an event occurs
the type of event, e.g. info, warning or error
the unique message identifier of the event class
In combination with anonymized data concerning the service partner and customer, the following questions are relevant:
Is there a relation between the message patterns and the corresponding anonymized service partner?
Is there a relation between the message patterns and the anonymized customer?
Analysis approach for clustering
The related analysis was performed by the University of Groningen (RUG), a research partner within the MANTIS project, by considering each excavator as a stochastic message generator. During preprocessing, the different messages were first counted per excavator and afterwards normalized by the total number of occurrences per unique message identifier.
Based on the computed message probabilities per machine, a k-means clustering was performed. To overcome initialization influences, the clustering was performed 100 times with random initialization. The relationship of the cluster assignment of each excavator with the corresponding service partner or customer was subsequently examined for each 'k' with the chi-square test. The average significance over the 100 model estimations for each 'k' then served as the quality function.
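A minimal reconstruction of this procedure is sketched below, under assumptions about the data layout (probs: one row of message probabilities per machine; labels: integer-coded anonymized partner or customer); RUG's actual implementation may differ:

```python
import numpy as np
from scipy.stats import chi2_contingency
from sklearn.cluster import KMeans

def mean_significance(probs: np.ndarray, labels: np.ndarray,
                      k: int, repeats: int = 100) -> float:
    """Cluster the machines 'repeats' times with random initialization and
    average the chi-square p-values of the cluster-vs-label relationship."""
    p_values = []
    for seed in range(repeats):
        km = KMeans(n_clusters=k, init="random", n_init=1, random_state=seed)
        clusters = km.fit_predict(probs)
        # Contingency table: rows = clusters, columns = anonymized labels.
        table = np.zeros((k, labels.max() + 1))
        for c, l in zip(clusters, labels):
            table[c, l] += 1
        # Drop empty rows/columns before the test.
        table = table[table.sum(axis=1) > 0][:, table.sum(axis=0) > 0]
        p_values.append(chi2_contingency(table)[1])
    return float(np.mean(p_values))
```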
Results of cluster analysis
As can be seen in Figure 1, there is no indication of a relationship between the service partner and the messages per excavator. The average significance level is well above 0.05, and all of the individual levels have nearly the same magnitude.
In contrast, Figure 2 shows a clear minimum at k=7, indicating that for this number of groups the distribution of machines over customers is not likely to be random. Although the p_signif value of 0.0588 is slightly above the significance level of 0.05, the magnitude at k=7 is clearly lower than at other values of k.
In order to explain the minimum at k=7, Liebherr decoded the anonymized customers and manually tried to find a description of the clusters. This cumbersome work did not yield the expected result, namely short cluster descriptions, but instead revealed mismatches in the customer data.
In summary, the analysis showed that, with the skillful use of analysis algorithms, superficially unmanageable data can disclose insights. However, one of the basic requirements for later use of the results is proper data preparation.
When analysing sensor data, you are typically confronted with different challenges relating to data quality. Here, we show you how these challenges can be dealt with and how we derive some initial insights from cleaned data via exploration techniques such as clustering.
Nowadays, especially with the advent of the Internet of Things (IoT), large quantities of sensor data are collected. Small sensors can be easily installed, on multipurpose industrial vehicles for instance, in order to measure a vast range of parameters. The collected data can serve many purposes, e.g. to predict system maintenance. However, when analysing it, you are typically confronted with different challenges relating to data quality, e.g. unrealistic or missing values, outliers, correlations and other typical and atypical obstacles. The aim of this article is to show how these challenges can be dealt with and how we derive some initial insights from cleaned data via exploration techniques such as clustering.
Within the MANTIS project, Sirris is developing a general methodology that can be used to explore sensor data from a fleet of industrial assets. The main goal of the methodology is to profile asset usages, i.e. define separate groups of usages that share common characteristics. This can help experts to identify potential problems, which are not visually observable, when the resulting profiles are compared with the expected behaviour of the assets and when anomalies are detected.
In this article, we will describe the methodology of asset usage profiling for proactive maintenance prediction. The data used in this article is confidential and anonymised; we therefore cannot describe it in detail. It mainly consists of duration and resource consumption as well as a range of parameters measured via different sensors. For our analysis, we used Jupyter Notebook with appropriate libraries such as pandas, scipy and scikit-learn.
Data can be polluted, as it is collected from different sources, and can contain duplicates, wrong values, empties and outliers, all of which should be considered carefully. The first natural step is therefore to conduct an initial exploration of the data and to prepare a single reference dataset for advanced analysis: first clean the data by means of visual and statistical methods, then select the right attributes you wish to work with further.
In our example dataset, we find negative or zero-resource consumption, a situation that is obviously impossible, as shown in Figure 1. In our case, since there are few outliers of this type, we simply remove them from the dataset.
Figure 1 Zero or negative consumption
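A minimal pandas sketch of this cleaning step; since the actual data is confidential, the file and column names are illustrative assumptions:

```python
import pandas as pd

df = pd.read_csv("usages.csv")  # hypothetical dataset, one row per usage

# Remove physically impossible records: zero or negative consumption.
before = len(df)
df = df[df["consumption"] > 0]
print(f"Removed {before - len(df)} impossible records")
```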
Another possible example is that of an erroneous date in the data. For example, dates may be too old compared to the rest of your dataset; future dates can even exist. Your decision to maintain, fix or remove wrong instances can depend on many factors, such as how big your dataset is, whether an erroneous date is very important at the current stage, etc. In our case, we maintain these instances since, at this moment, the date is not important for analysis and the percentage of this subset is very low.
Outliers are extreme values that deviate sufficiently from other observations and also need to be dealt with carefully. They can be detected visually and using statistical means. Sometimes we can simply remove them, sometimes we want to analyse them thoroughly. Visualising the data directly reveals some potential outliers; refer to the point in the upper right-hand corner in Figure 2. In our case, such high values for duration and consumption are impossible, as shown in Figure 3. Since it is the first record for this type of asset, it may have been entered manually for test purposes; we consequently choose to remove it.
Figure 2 Visual check for outliers
Figure 3 Impossible data
In Figure 4, we can see a positive linear correlation between consumption and duration, which is to be expected, although we still may find some outliers using the 3-sigma rule. This rule states that, for the normal distribution, approximately 99.7 percent of observations lie within 3 standard deviations of the mean. Then, based on Chebyshev’s Inequality, even in the case of non-normally distributed data, at least 88.8 percent of cases fall within 3-sigma intervals. Thus, we consider observations beyond 3-sigmas as outliers.
Figure 4 Data after cleaning
In Figure 5, we see that our data is approximately normal, centred around 0, with most values lying between -2 and 2. This means that the 3-sigma rule will give us accurate results. Note that you must normalise your data before applying this rule.
Figure 5 Distribution of normalised consumption/s
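The 3-sigma check itself is short; a sketch using the same illustrative column names as above:

```python
# Flag observations lying beyond 3 standard deviations of the mean.
for col in ["consumption", "duration"]:
    z = (df[col] - df[col].mean()) / df[col].std()
    df[f"{col}_outlier"] = z.abs() > 3

suspects = df[df["consumption_outlier"] | df["duration_outlier"]]
```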
Results are shown in Figure 6. The reason for such a significant deviation from the average in consumption and duration of certain usages is to be discussed with a domain expert. One instance with very low consumption for a long duration raises particular questions (Figure 7).
Figure 6 3-sigma rule applied to normalised data
Figure 7 Very low consumption for its duration
Advanced data exploration
As previously stated, we are looking to profile asset usages in order to identify abnormal behaviour and therefore, along with duration and resource consumption, we also need to investigate the operational sensor data for each asset. This requires us to define groups of usages that share common characteristics; however, before doing so, we need to select a representative subset of data with the right sensors.
From the preliminary analysis, we observed that the number of sensors can differ between assets and even between usages of the same asset. For later modelling, we therefore need to select only usages that contain the same sensors, i.e. training a model requires vectors of the same length. To achieve this, we can use the following approach, as illustrated in Figure 8.
Figure 8 Selecting sensors
Each asset has a number of sensors that can differ from usage to usage, i.e. some modules can be removed from or installed on the asset. Thus, we need to check the presence of these sensors across the whole dataset. Then, we select all usages with sensors that are present above a certain percentage, e.g. 95 percent, in the whole dataset. Let's assume our dataset contains 17 sensors that are present in 95 percent of all usages. We select these sensors and discard those with lower presence percentages. This way, we create a vector of sensors of length 17. Since we decided to include sensors if they are 95 percent present, a limited number of usages may still be selected although they do not contain some of the selected sensors, i.e. we introduce gaps, which are marked in yellow in the figure. To fix these gaps, you can either discard these usages or impute values for the missing sensors. Imputing can be complex, as you need to know what these sensors mean and how they are configured. In our case, these details are anonymised and these usages are consequently discarded. You may need to lower your presence percentage criterion in order to keep a sufficiently representative dataset for further analysis. A sketch of this selection step follows below.
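Assuming a usages-by-sensors DataFrame df_sensors (a hypothetical name) with missing values where a sensor is absent, the selection could look like this:

```python
# Keep only sensors present in at least 95% of usages, then keep only
# usages that contain all of those sensors (so no gaps need imputing).
presence = df_sensors.notna().mean()            # presence ratio per sensor
selected = presence[presence >= 0.95].index     # e.g. the 17 sensors
subset = df_sensors[selected].dropna()          # discard usages with gaps
```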
After the optimal subset is selected, we check the correlation of the remaining sensors. We do this because we want to remove redundant information and to simplify and speed up our calculations. Plotting a heatmap is a good way of visualising correlation. We do this for the remaining sensors as shown in Figure 9.
Figure 9 Sensor correlation heatmap
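Plotting the correlation heatmap takes two lines with seaborn (a sketch, continuing with the hypothetical subset from above):

```python
import matplotlib.pyplot as plt
import seaborn as sns

sns.heatmap(subset.corr(), cmap="coolwarm", center=0)
plt.show()
```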
In our case, we have 17 sensors from which we select only 7 uncorrelated sensors and plot a scatter matrix, a second visualisation technique which allows us to view more details on the data. Refer to Figure 10.
Figure 10 Scatterplot matrix of uncorrelated sensors
Based on the selected sensors, we now try to characterise different usages for each asset, i.e. we can group usages across the assets based on their sensor values and, in this way, derive a profile for each group. To do this, we first apply hierarchical clustering to group the usages and plot the resulting dendrogram. Hierarchical clustering helps to identify the inner structure of the data and the dendrogram is a binary tree representation of the clustering result. Refer to Figure 11.
Figure 11 Dendrogram
On this graph, below distance 2 we see smaller clusters that are grouping ever closer to each other. Hence, we decide to split the data into 5 different clusters. You can also use silhouette analysis for selecting the best number of clusters.
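With scipy, this step can be sketched as follows (subset_uncorrelated stands for the hypothetical table of 7 uncorrelated sensor columns):

```python
from scipy.cluster.hierarchy import dendrogram, fcluster, linkage

Z = linkage(subset_uncorrelated, method="ward")   # hierarchical clustering
dendrogram(Z)                                     # inspect the structure
labels = fcluster(Z, t=5, criterion="maxclust")   # cut into 5 clusters
```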
In order to interpret the clustering, we also want to visualise it. However, 7 sensors mean 7 dimensions, and since we cannot meaningfully plot in multidimensional space, we apply Principal Component Analysis (PCA) to reduce the number of dimensions to 2. This allows us to visualise the clustering results, as shown in Figure 12. Good clustering means that the clusters are more or less well separated, i.e. similar colours lie close to one another and are not mixed too much with other colours, and this is what we see in the figure.
Figure 12 PCA plot
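The projection and plot are again short (a sketch under the same assumptions as above):

```python
from sklearn.decomposition import PCA

coords = PCA(n_components=2).fit_transform(subset_uncorrelated)
plt.scatter(coords[:, 0], coords[:, 1], c=labels, cmap="tab10")
plt.show()
```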
After the clustering is complete, we can characterise usages. This can be done using different strategies. The simplest method consists in taking the mean of the sensor values for each cluster (i.e. calculating a centroid) to define a representative usage.
The last step involves validating the clusters. We can cross-check clustering with the consumption/duration of usages. For instance, we may expect all outliers to fall within one specific cluster, or expect some other more or less obvious patterns, hence rendering our clusters meaningful. In Figure 13 below, we can observe that the 5 clusters, i.e. 5 types of usages, correspond, to an extent but not entirely, to consumption/duration behaviour. We can see purple spots at the bottom and green spots at the top.
Figure 13 Relationship between clusters and consumption/duration
At this stage, some interesting outliers were detected in the consumption/duration relationships, which may be explained by the objectives the assets were used for. We have found clusters that represent typical usages according to the data. Result validation can be improved by integrating additional data, such as maintenance data, into the analysis. Furthermore, results can be validated and conclusions confirmed by the domain experts from Ilias Solutions, the industrial partner we are supporting in their data exploitation.
Goizper S. Coop's products are mechanical components (clutch brakes, gear boxes, indexing units…) installed within different kinds of productive machines. These machines are designed to produce continuously, and unplanned downtimes generate high costs. The components are key parts of some of the mentioned machines, and their health directly influences the machine status.
Breakdown of Components
Furthermore, if one of these components fails, it takes a long time until a new one is sent to the customer's facilities, the old one is removed and the new one is set up. In these cases, production asset maintenance means a lot of expense for customers and suppliers.
MANTIS for predictive maintenance
The MANTIS platform provides an online and forward-looking view of these components' health. Smart sensors installed on the mechanical component are connected to monitoring and alerting, which is performed automatically within the smart-G box located next to the component. This big data is then processed in the cloud and, through different maintenance data analytics, the status and future trend of the component's health are obtained as output.
Obviously, the introduction of this cyber-physical system will not eliminate all machine breakdowns, but it will help to considerably reduce unplanned machine downtime, so that customers and suppliers will be able to plan their maintenance tasks and reduce these kinds of stops.
Within the MANTIS ECSEL project, Goizper has collaborated closely with one of its customers, Fagor Arrasate, trying to address the real inconveniences and reduce the expenses that unplanned downtimes cause in both firms.
Compact excavators are often rented at an hourly or daily rate. No meters are used, which means that only calendar hours or days are used for billing. For maintenance, the system has an "engine hour" meter, but this counts only when the system is running (idle, driving or operating).
A proposal is to introduce other meters for more precise counting of the actual use of the machine. A single sensor is proposed for the solution, providing a very cheap way of getting much more usage data.
For the rental case, a "power by the hour" rate could be more efficient, i.e. the end customer pays for the actual usage or wear of the machinery and not just the number of hours the machine is reserved. This would give a fairer pricing model, since the real cost of running the machinery is mostly due to maintenance, and it would give the user an incentive to take care of the machine while using it. It also gives a better way to estimate the need for maintenance or to balance out the usage of equipment.
For other cases, a simple sensor could bring benefits such as higher fleet availability and lower operating costs by addressing the following:
Machine health and how to predict asset failure (predictive maintenance)
Prevent or detect abuse
Provide data for warranty models
Provide data for fleet management/optimization
All of the above mentioned points can be addressed with a simple and robust IMU.
Proof of concept thesis
For this proof of concept, we formulated a thesis to test the data collection and analytic capability of such a system:
“We believe that we can measure how many hours a hammer and tracks / undercarriage has been used on a compact excavator by measuring the vibration pattern”
As proof of concept, we want to be able to detect the following states:
Engine Off – ID 4001
Idle – low RPM ID 4002
Idle – High RPM ID 4003
Driving – Turtle gear ID 4010
Driving – Rabbit gear ID 4011
Driving – Slalom ID 4012
Hammer – ID 4020
Other states (such as abuse or hard usage) could also be detected.
The Machine Learning approach
A single IMU sensor is installed in the frame of the vehicle. Data is collected at high resolution and high sampling frequency on a small embedded device in the vehicle.
Model creation data
A series of tests covering the aforementioned states was made, and the data was labeled with each state.
After data labeling, a decision tree was created using statistical features of the data.
The decision tree can now be applied to data collected in real time, on the embedded device.
A new series of tests was made. This data was again labeled with each state, then parsed with the decision tree generated from the earlier model data (with fixed data chunk sizes).
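A minimal sketch of this pipeline on synthetic data follows; the feature set (mean, standard deviation, RMS per axis), chunk size and tree depth are illustrative assumptions:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def window_features(chunk: np.ndarray) -> np.ndarray:
    """Statistical features of one fixed-size chunk of IMU samples
    (rows = samples, columns = accelerometer/gyro axes)."""
    rms = np.sqrt((chunk ** 2).mean(axis=0))
    return np.concatenate([chunk.mean(axis=0), chunk.std(axis=0), rms])

# Illustrative stand-in for labeled recordings: 200 chunks of 100 samples
# over 6 IMU axes, each chunk labeled with a state ID (e.g. 4001, 4002...).
rng = np.random.default_rng(0)
chunks = rng.normal(size=(200, 100, 6))
states = rng.choice([4001, 4002, 4003, 4010, 4011, 4012, 4020], size=200)

X = np.array([window_features(c) for c in chunks])
clf = DecisionTreeClassifier(max_depth=8).fit(X, states)

# The trained tree can then classify new chunks in real time on the device.
state = clf.predict(window_features(chunks[0]).reshape(1, -1))
```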
In the figure below, the results from the algorithms can be seen.
On the top row of bars, the data labels (the truth) are seen, colored. In the next row of bars, the detected states are colored. The bottom graph is a visualization of part of the collected raw data.
As seen, the colors match with very high precision. Only at the beginning and end of the states are there small errors. This is most likely due to the data labeling (i.e. as the labels were created manually with a stopwatch, they may not be perfectly aligned in time).
The IMU sensor and embedded device mounted on the compact excavator are able to provide data for machine learning and recognition of at least 6 different usage patterns.
The usage information can now be collected, and a “power by the hour” renting concept can be introduced. For example, the renting company can provide an app where the customer can specify how much hammering they want, and how much driving etc. Then a much lower price can be provided. If data is collected and transmitted through GSM, the app can even update in real time, showing usage data.
This means that the operator of the vehicle can see in real time how much usage has been spent. A warning could be provided when, for example, 80% of the hammering hours have been spent, similar to travelling abroad with a mobile phone and a fixed number of megabytes available.
The whole setup was completed within a few hours. Mounting the system took 30 minutes, collecting model creation data took 1 hour, creating the models took 30 minutes, and testing the system took another hour. We started in the morning, and before lunchtime everything was mounted, calibrated, validated and ready for use.
This sensor and embedded system provides a very easy way of providing actual and valid usage information on mechanical systems.
It can easily detect more states. The meters provided could also be summed, which could be used to inform the operator when it is time to replace the hammer – before it actually breaks. The time savings from this alone are enough to pay for the system.
Cyber-Physical Systems (CPS) are often very complex and require a tight interaction between hardware and software. As in almost any software system, CPS generate different kinds of logs of the activities performed, including correct operations, warnings, errors, etc. Frequently, the generated logs are specific to the different subsystems and are generated independently. Such logs contain a wealth of information that can be extracted and analyzed in different ways to understand how each subsystem behaves, and even to retrieve information about the behavior of the overall system. In particular, considering the generated logs, it is possible to:
Analyze the behavior of a single subsystem looking at the data generated by each one in an independent way;
Analyze the overall behavior of the system looking at the correlations among the data generated by the different subsystems.
Such data are very useful for understanding the behavior of a system and are often used to perform post-mortem analysis when failures happen. However, they could also be used to understand in a more comprehensive way how the system behaves, through a real-time analysis able to monitor the different subsystems and their interactions continuously. In particular, it is possible to focus on preventing failures through predictive maintenance triggered by specific analyses.
Making predictions about system failures by analyzing log files is possible, but such predictions depend strictly on certain characteristics of those files. The most important ones are: data generation frequency, information detail and history.
The data generation frequency needs to be related to the prediction time and the time required to take proper action. For example, if we need to detect a failure and take proper action within a few minutes, we need data generated at a higher frequency (e.g., on the scale of seconds) and cannot use data generated at a lower one (e.g., on the scale of hours). This requirement affects the ability to make predictions and their usefulness for implementing proper maintenance actions.
The information provided needs to have proper granularity and meaningful messages. In particular, it is important to get detailed information about errors, warnings, operations performed, status of the system, etc. The specific details required are tightly connected to the specific predictions that are needed. Moreover, the finer the granularity of the information, the higher the chances of being able to create a proper prediction model.
High quality data history is required to build proper prediction models. However, just having a large dataset is not enough. Historical data need to be representative of the operating environment and include all the possible cases that may happen during operations. In particular, it is required to have information about the log entries and the actual behavior of the system to create a reliable model of the reality.
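As a toy illustration of how such log entries might feed a prediction model, the sketch below counts message identifiers in fixed time windows; the log format and field names are assumptions, not a real CPS log schema:

```python
from collections import Counter
from datetime import datetime, timedelta

# One parsed log entry: (timestamp, event type, message identifier).
log = [
    (datetime(2018, 5, 1, 10, 0, 12), "warning", "W042"),
    (datetime(2018, 5, 1, 10, 0, 58), "error",   "E007"),
    (datetime(2018, 5, 1, 10, 1, 3),  "info",    "I001"),
]

def window_counts(entries, start, width=timedelta(minutes=1)):
    """Count message identifiers inside one time window; such count
    vectors can serve as features for a failure prediction model."""
    return Counter(mid for ts, _, mid in entries if start <= ts < start + width)

print(window_counts(log, datetime(2018, 5, 1, 10, 0)))
# Counter({'W042': 1, 'E007': 1})
```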
The requirements described are just a first step towards the definition of a proper predictive maintenance model but they are essential. Moreover, the proper approaches and algorithms need to be selected based on the specific system and the related operating conditions.
The MANTIS Steel Bending Machine pilot aims at providing the use case owner, ADIRA, with a worldwide remote maintenance service for its customers. The main goal is to improve its services by making new maintenance capabilities available at reduced cost, reducing response times, avoiding rework and allowing for better planning of maintenance activities.
To this purpose, existing ADIRA machines (starting with their high-end machine model, the Greenbender) will be augmented with extra sensors; this information, together with data collected from existing sensors, will be sent to the cloud to be analyzed. Results made available by the analysis process will be presented to machine operators or maintainers through an HMI interface.
A number of partners are involved in the development and testing of the modules, which concern the communication middleware (ISEP, UNINOVA), data processing and analytics activities (INESC, ISEP) and the HMI applications (ISEP), along with a stakeholder providing a machine to be enhanced with the MANTIS innovations (ADIRA).
The distributed system being built follows a reference architecture composed of a number of modules, grouped into 4 logical blocks: the machine under analysis, the data analysis module, the visualization module, and the middleware supporting inter-module communication.
Data regarding the machine under analysis are collected by means of sensors integrated with the machine itself. This logical block thus consists of the data sources that will be used for failure detection, prognosis and diagnosis. This set of data sources comprises an ERP (Enterprise Resource Planning) system, data generated by the machine's Computer Numerical Controller (CNC) and the safety programmable logic controller (PLC).
This logical block operates through two basic modules. The first is the MANTIS Embedded PC, basically an application that can run on a low-cost computer (like a Raspberry Pi) or directly on the CNC (if powerful enough). This module is responsible for collecting the data from the CNC I/O and transmitting it to the data analysis engine for processing, and is implemented as a communication API. When based on an external computer, this module also connects to the new wireless MANTIS sensors placed on the machine using the Bluetooth Low Energy (BLE) protocol. Communications are then supported by the RabbitMQ message-oriented middleware, which takes care of proper routing of messages between peers. This middleware handles both the AMQP and MQTT protocols to communicate between nodes.
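For illustration, publishing a sensor reading to RabbitMQ over AMQP could look like this; a minimal sketch using the pika Python client, where the broker host, exchange and routing-key names are made up for the example:

```python
import json
import pika

# Connect to the RabbitMQ broker (host is an assumption for the example).
connection = pika.BlockingConnection(pika.ConnectionParameters(host="localhost"))
channel = connection.channel()
channel.exchange_declare(exchange="mantis.sensors", exchange_type="topic")

reading = {"machine": "GB-01", "sensor": "oil_temp", "value": 57.3}
channel.basic_publish(exchange="mantis.sensors",
                      routing_key="greenbender.oil_temp",
                      body=json.dumps(reading))
connection.close()
```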
The I/O module is used to extract raw information from the machine sensors. This information is collected by the existing PLC, made available on the Windows-based numerical controller through shared memory and then written to files. Our software collects sensor data from these files, thus completely isolating the MANTIS applications from the numerical controller's application and from the PLC.
This logical block takes care of data analysis and prediction, and it exploits three main modules. The first is a set of prediction models used for the detection, prognosis and diagnosis of machine failures. The second is an API that allows clients to request predictions from the models, and that can respond to different paradigms such as REST or message-queue based. Finally, the third module is a basic ETL (Extraction, Transformation and Loading) subsystem that is responsible for acquiring, preparing and recording the data that will be used for model generation, selection and testing. This last module is also used to process the analytics request data, as the same transformations used for model generation are also required for prediction.
This logical block consists of two modules: the human machine interface (HMI) and the Intelligent Maintenance DSS. The HMI is designed as a web-based mobile application, accessible via the network from any computer or tablet. It works in two different modes, supporting two user types: the data analyst and the maintenance manager. Both can analyze the machine's status and record failure and diagnostics related data. In addition, the data analysis HMI allows the consultation and analysis of data and results, while the maintenance management HMI allows for consulting predicted events and suggested maintenance actions.
The second module is the Intelligent Maintenance DSS, which uses a Knowledge Base built from diagnoses, prediction models and the data sent by sensors. On top of this Knowledge Base there is a rule-based reasoning engine that includes all the rules necessary to deduce new knowledge, helping the maintenance crew to diagnose failures.
The work performed so far is well advanced, and an integration event will take place in the near future, where the interconnection between all systems will be tested and validated.
The demonstrator being built will be evaluated according to the following criteria: prediction model performance (live data sets will be compared to model generation test sets) and application usability (the user should be able to access the required information easily, in order to facilitate failure detection and diagnosis).
The MANTIS project is concerned with predictive maintenance on the basis of big data streams from large (industrial) operations. At the end of the processing pipeline, planning suggestions for maintenance actions will be the result. Usually, maintenance is performed by human operators.
However, with current developments in machine learning, AI and robotics, it becomes interesting to see what type of ‘corrective actions’ in maintenance could be performed by industrial service robots.
In industrial production lines it is common to observe fairly short times between failures, especially in long chains. Whereas individual components are often designed to function extremely well, for instance under a regime of 'zero-defect manufacturing', the performance of the line as a whole may be disappointing. What is more, the actions performed by human operators to solve the problems may be very mundane and simple, such as removing dirt due to fouling or lubricating critical components. With the current advances in robot hardware and software technology, it becomes increasingly attractive to automate such maintenance actions. Whereas maintenance in the form of module or part replacement is too difficult for current state-of-the-art robotics, cleaning and tidying is definitely possible.
With this application domain in mind, a laboratory setup was designed for quickly developing a robotic maintenance task for the purpose of demonstration by a master student team (Francesco Bidoia, Rik Timmers, Marc Groefsema) under the guidance of a PhD student (Amir Shantia). We were able to rapidly configure our existing mobile robot platform to realize simple cleaning and tidying actions, similar to what is needed in basic industrial maintenance tasks. The demonstration involves speech control, navigational autonomy, work piece approach and dynamic reactivity to three object types, using tool switching. Objects are considered to be either a) untouchable, b) removable by hand, or c) consisting of small fragments (cf. 'dirt') that need to be brushed away. In three weeks, a full demonstration could be developed by the student team, using a mobile robot with a single arm that was designed earlier for RoboCup@Home tasks.
There are two extreme approaches to predicting failures for predictive maintenance. The white box approach relies on manually constructed physical and mechanical models for predicting the failures. The black box approach, on the other hand, relies on failure prediction models constructed using statistical and machine learning methods based on the data gathered from a running system. The figure below illustrates such data driven failure prediction for a machine monitored by three sensors.
Machine learning algorithms are used to identify failure patterns in the sensor data that precede a machine failure. When such patterns are observed in operation, an alarm can be triggered to take corrective action to prevent or mitigate the imminent failure. For example, failure predictions can be used to optimize maintenance actions, such as scheduling the service engineers or managing the spare parts storage to reduce the downtime cost.
Automatic feature extraction
An important part of modeling a failure predictor is selecting or constructing the right features, i.e. selecting existing features from the data set, or constructing derivative features, that are most suitable for solving the learning task.
Traditionally, the features are selected manually, relying on the experience of process engineers who understand the physical and mechanical processes in the analyzed system. Unfortunately, manual feature selection suffers from different kinds of bias and is very labor intensive. Moreover, the selected features are specific to a particular learning task, and cannot be easily reused in a different task (e.g. the features which are effective for predicting failures in one production line will not necessarily be effective in a different line).
Deep learning techniques investigated in the MANTIS project offer an alternative to manual feature selection. Deep learning refers to a branch of machine learning based on algorithms which automatically extract, from the raw data, the abstract features most suitable for solving a particular learning task. Predictive maintenance can benefit from such automatic feature extraction to reduce the effort, cost and delay associated with extracting good features.
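As a toy illustration of automatic feature extraction, the sketch below trains a small autoencoder on synthetic sensor windows; the learned bottleneck activations then replace hand-crafted features. Dimensions and architecture are arbitrary assumptions, not the MANTIS models:

```python
import torch
from torch import nn

# Synthetic stand-in for raw sensor windows: 256 windows of 64 samples.
x = torch.randn(256, 64)

# Autoencoder: the 8-dimensional bottleneck is the learned feature vector.
model = nn.Sequential(
    nn.Linear(64, 8), nn.ReLU(),   # encoder
    nn.Linear(8, 64),              # decoder
)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

for _ in range(100):
    optimizer.zero_grad()
    loss = nn.functional.mse_loss(model(x), x)  # reconstruction loss
    loss.backward()
    optimizer.step()

# Encoder output = automatically extracted features that a downstream
# failure predictor can consume (here: the first two layers of the model).
features = model[1](model[0](x))
```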
MANTIS: Cyber Physical System based Proactive Collaborative Maintenance.
This project has received funding from the ECSEL Joint Undertaking under grant agreement No 662189. This Joint Undertaking receives support from the European Union's Horizon 2020 research and innovation programme and Spain, Finland, Denmark, Belgium, the Netherlands, Portugal, Italy, Austria, the United Kingdom, Hungary, Slovenia and Germany.