Maintenance data analytics: How to deal with free text

Problem description

When analyzing maintenance problems, for example to determine a root cause, three sources of information are generally available:

  • sensor data
  • machine-generated logging messages
  • service and/or maintenance engineer reports

The format of sensor data and machine-generated logging messages is fixed at design time. Service and maintenance reports, by contrast, are written in a free format, which makes them hard to analyze automatically.

Problem reporting

Possible solutions

Before turning to possible solutions, the problem needs a more precise description. Free format in this case means that a report can contain:

  • unstructured text
  • ambiguous descriptions, e.g. mixing commercial and technical identification
  • multiple languages, e.g. the engineer's local language
  • spelling errors, ad hoc abbreviations, missing data, etc.
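Some of these issues can be partly repaired after the fact. The sketch below is a minimal illustration, not a complete solution: it expands ad hoc abbreviations and maps misspelled words onto a known vocabulary using fuzzy matching from Python's standard difflib module. The vocabulary, the abbreviation table, and the function names are hypothetical and chosen only for this example; a real list would have to come from the machine documentation.

```python
import difflib
import re

# Hypothetical controlled vocabulary and abbreviation table (illustrative only).
VOCABULARY = ["pump", "valve", "bearing", "motor", "leak", "replaced", "noise"]
ABBREVIATIONS = {"brg": "bearing", "mtr": "motor", "repl": "replaced"}


def normalize_token(token, cutoff=0.75):
    """Map one word onto the vocabulary, or return it unchanged."""
    token = token.lower()
    # Known ad hoc abbreviations are expanded first.
    if token in ABBREVIATIONS:
        return ABBREVIATIONS[token]
    if token in VOCABULARY:
        return token
    # Fuzzy match to catch spelling errors; the cutoff is an assumed threshold.
    matches = difflib.get_close_matches(token, VOCABULARY, n=1, cutoff=cutoff)
    return matches[0] if matches else token


def normalize_report(text):
    """Split a free-text report into words and normalize each of them."""
    words = re.findall(r"[A-Za-z]+", text)
    return [normalize_token(word) for word in words]


if __name__ == "__main__":
    print(normalize_report("Repl brg, mtr makes noize"))
```

Words that cannot be matched, such as "makes" here, are passed through unchanged, so nothing is lost. This approach does not address multiple languages or ambiguous identification, and a cutoff that is too low will map unrelated words onto the vocabulary.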

Analysis becomes much easier if free text can be brought back into the realm of predefined formats. Ways to achieve this are:

  • using predefined forms, e.g. with dropdown lists for selecting options
  • using intelligent text editors that can recognize the vocabulary needed for describing the specific problems
  • combinations of the above
  • less stringent: specifying rules that people have to follow when entering maintenance logs (typos are still possible then)

As it is impossible to predict every problem that will be encountered during the lifetime of a machine, an option to enter free text is still needed. It should be clear, however, that free text is only to be used when the other options do not cover the problem at hand.
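A predefined form with a free-text fallback could be modeled as in the following sketch. It assumes a report consists of a machine identifier, a component, and an action, where the component and action are picked from fixed lists (the equivalent of dropdown lists) that each include an "other" entry. All class names, field names, and list entries are hypothetical. The validation rule enforces that free text is given if, and only if, "other" was selected.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class Component(Enum):
    """Hypothetical dropdown options for the affected component."""
    PUMP = "pump"
    VALVE = "valve"
    MOTOR = "motor"
    OTHER = "other"


class Action(Enum):
    """Hypothetical dropdown options for the action taken."""
    INSPECTED = "inspected"
    REPAIRED = "repaired"
    REPLACED = "replaced"
    OTHER = "other"


@dataclass(frozen=True)
class MaintenanceReport:
    machine_id: str
    component: Component
    action: Action
    free_text: Optional[str] = None


def validate(report: MaintenanceReport) -> List[str]:
    """Return a list of rule violations; an empty list means the report is valid."""
    problems = []
    if not report.machine_id.strip():
        problems.append("machine_id is required")
    needs_text = report.component is Component.OTHER or report.action is Action.OTHER
    has_text = bool(report.free_text and report.free_text.strip())
    if needs_text and not has_text:
        problems.append("free text is required when 'other' is selected")
    if has_text and not needs_text:
        problems.append("free text is only allowed when 'other' is selected")
    return problems


if __name__ == "__main__":
    ok = MaintenanceReport("M-001", Component.PUMP, Action.REPLACED)
    odd = MaintenanceReport("M-001", Component.OTHER, Action.REPAIRED)
    print(validate(ok))
    print(validate(odd))
```

Rejecting free text outright when a predefined option applies is a strict design choice. A milder variant would accept the text but flag the report for review.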


Free text will always be part of reporting, but computer aids or rules for entering reports can make the data analysis of service and maintenance reports a lot easier.