Quality propagation using artificial neural networks

Objectives:

The aim of this research is to propagate data quality through data processing. The quality of the input and output data of a process can be described comprehensively using quality models and appropriate quality parameters. The problem is then to link the quality of the output information to the quality of the input information, e.g. to predict the quality of the results. In general, formulating an exact analytical relation is very complex or impossible, so other methods are needed. Artificial neural networks (ANN) have proven suitable for this task.

Figure 1: Example of a feed-forward ANN with 4 input and 2 output variables. (© iigs)

Procedure:

Artificial neural networks consist of individual neurons that process information in a way modelled on their natural prototype. So-called feed-forward networks are used for quality propagation. In a feed-forward network the neurons are arranged in layers, the direction of information flow is fixed, and neither shortcut connections nor feedback connections are allowed. The main advantages of ANN are the complex, parallel processing of information, robustness against signal noise, and the usability of a trained ANN in real-time applications.

A feed-forward ANN is trained with the help of training examples. All parameters of the ANN are determined in the training phase, and the trained network can then process any input data within the trained input data range in real time.

Using an ANN to propagate data quality requires appropriate data preparation. All relevant (quality) parameters of the training data have to be grouped into input and output vectors and normalized. During the training phase, the best net configuration has to be identified by testing. In general, the more complex the task and the larger the number of quality parameters, the more complex the net has to be. Once the network is sufficiently trained, it can predict the corresponding output vector for every input vector within the trained data range.
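
The procedure described above (grouping into input and output vectors, normalization, training, prediction) can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration and not taken from the project: the synthetic data, the min-max normalization, the single hidden layer with 8 tanh neurons, and plain gradient descent on the mean squared error as the training method. Only the 4 inputs and 2 outputs follow Figure 1.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical training data: 4 input and 2 output quality parameters.
x_raw = rng.uniform([0.0, 0.0, 0.0, 0.0], [10.0, 1.0, 100.0, 5.0], size=(200, 4))
t_raw = np.column_stack([
    np.hypot(x_raw[:, 0], x_raw[:, 1]),     # made-up relation, illustration only
    0.1 * x_raw[:, 2] + x_raw[:, 3] ** 2,   # made-up relation, illustration only
])

def fit_minmax(a):
    """Return per-column minimum and range for min-max normalization."""
    lo = a.min(axis=0)
    span = a.max(axis=0) - lo
    return lo, np.where(span == 0.0, 1.0, span)  # guard constant columns

x_lo, x_span = fit_minmax(x_raw)
t_lo, t_span = fit_minmax(t_raw)
x = (x_raw - x_lo) / x_span   # normalized input vectors
t = (t_raw - t_lo) / t_span   # normalized output vectors

def init_net(n_in, n_hidden, n_out):
    """Random initial weights for one hidden layer (assumed architecture)."""
    return {
        "W1": rng.normal(0.0, 0.5, (n_in, n_hidden)), "b1": np.zeros(n_hidden),
        "W2": rng.normal(0.0, 0.5, (n_hidden, n_out)), "b2": np.zeros(n_out),
    }

def forward(net, x_in):
    """Strictly layer-by-layer flow: input -> hidden (tanh) -> linear output."""
    h = np.tanh(x_in @ net["W1"] + net["b1"])
    return h, h @ net["W2"] + net["b2"]

def train(net, x_in, target, epochs=2000, lr=0.05):
    """Full-batch gradient descent on the mean squared error."""
    losses = []
    for _ in range(epochs):
        h, y = forward(net, x_in)
        err = y - target
        losses.append(float(np.mean(err ** 2)))
        g_y = 2.0 * err / err.size                    # dLoss/dy
        g_h = (g_y @ net["W2"].T) * (1.0 - h ** 2)    # back through tanh
        net["W2"] -= lr * (h.T @ g_y)
        net["b2"] -= lr * g_y.sum(axis=0)
        net["W1"] -= lr * (x_in.T @ g_h)
        net["b1"] -= lr * g_h.sum(axis=0)
    return losses

def predict(net, x_new_raw):
    """Normalize new inputs, run the net, and undo the output normalization."""
    _, y_n = forward(net, (x_new_raw - x_lo) / x_span)
    return y_n * t_span + t_lo

net = init_net(n_in=4, n_hidden=8, n_out=2)
losses = train(net, x, t)
print("MSE at start and end of training:", losses[0], losses[-1])
```

Trying different values of `n_hidden` and comparing the prediction error on examples held back from training would correspond to identifying the best net configuration by testing. `predict` should only be called with inputs inside the range covered by `x_raw`, in line with the restriction to the trained data range.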

Results:

The usability of ANN for propagating data quality was demonstrated with a simple geodetic test example. For polar point determination, the ANN calculated the standard deviations with sufficient accuracy. It was also shown that the availability, consistency and completeness of data can be mapped using ANN.
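
For polar point determination, reference standard deviations of the kind an ANN would have to reproduce can be obtained by first-order variance propagation. The source does not state which formulas were used, so the following is only a sketch under stated assumptions: the new point is computed as x = x0 + s·cos(t), y = y0 + s·sin(t), the station (x0, y0) is treated as error-free, and the distance s and direction angle t are uncorrelated. The function name `polar_point_std` is hypothetical.

```python
import math

def polar_point_std(s, t, sigma_s, sigma_t):
    """Standard deviations of x and y for a polar point.

    s, sigma_s: distance and its standard deviation (same length unit)
    t, sigma_t: direction angle and its standard deviation (radians)
    Assumes an error-free station and uncorrelated s and t.
    """
    var_x = (math.cos(t) * sigma_s) ** 2 + (s * math.sin(t) * sigma_t) ** 2
    var_y = (math.sin(t) * sigma_s) ** 2 + (s * math.cos(t) * sigma_t) ** 2
    return math.sqrt(var_x), math.sqrt(var_y)

# Example: 100 m distance along the x axis, 1 cm distance error, 1 mrad angle error.
sigma_x, sigma_y = polar_point_std(s=100.0, t=0.0, sigma_s=0.01, sigma_t=0.001)
print(sigma_x, sigma_y)
```

In this example the distance error acts only along x (0.01 m) and the angle error only across it (100 m × 0.001 rad = 0.1 m). Pairs of (s, t, sigma_s, sigma_t) and the resulting standard deviations could serve as input and output vectors for training.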

Figure 2: Errors of predicted cross deviation of 50 example trajectories for different net dimensions

A second, real-world example used data from the Do-iT project. The highly complex calculation process for determining the trajectories of mobile phone users from mobile phone data also leads to complex dependencies between the quality of the input data and that of the output data, so an exact analytical relation cannot be formulated. As the analysis showed, however, propagation of data quality works with ANN. The cross deviation of the trajectories can be determined with a standard deviation of approx. 35-45 m, using a number of input parameters such as the density of the radio cell network and the density of the digital road network. The cross deviation was defined for evaluating the calculated trajectories and describes the mean normal (perpendicular) deviation of all positions of each user from the estimated trajectory. Figure 2 shows the errors for 50 randomly chosen example trajectories and different net dimensions. The error of the predicted cross deviations lies mainly within ±50 m.
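
The cross deviation defined above can be sketched in a few lines. This is one possible reading of the definition, not the project's implementation: the trajectory is assumed to be a 2D polyline in metric coordinates, each position is compared with its nearest point on the polyline, and the unsigned distances are averaged. The function names are hypothetical.

```python
import math

def point_segment_distance(p, a, b):
    """Shortest distance from point p to the segment a-b (all 2D tuples)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:                      # degenerate segment
        return math.hypot(px - ax, py - ay)
    # Projection parameter of p onto the segment, clamped to its end points.
    u = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    u = max(0.0, min(1.0, u))
    return math.hypot(px - (ax + u * dx), py - (ay + u * dy))

def cross_deviation(positions, trajectory):
    """Mean distance of all positions from a polyline with at least 2 vertices."""
    if len(trajectory) < 2 or not positions:
        raise ValueError("need at least two trajectory vertices and one position")
    total = 0.0
    for p in positions:
        total += min(
            point_segment_distance(p, trajectory[i], trajectory[i + 1])
            for i in range(len(trajectory) - 1)
        )
    return total / len(positions)

# Straight 10 m trajectory along the x axis; positions 1 m, 3 m and 2 m off it.
example = cross_deviation([(2.0, 1.0), (5.0, -3.0), (8.0, 2.0)],
                          [(0.0, 0.0), (10.0, 0.0)])
print(example)
```

A value computed this way for each trajectory would be the target quantity that the ANN predicts from input parameters such as the radio cell and road network densities.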

The fields of application of ANN for propagating data quality will be researched further at the Institute. In particular, the use of so-called dynamic networks to describe the timeliness of data is part of this research. Further real-world examples are also needed to deepen the knowledge.

Contact:

Prof. Dr.-Ing. habil. Volker Schwieger