Performance evaluation of GANs in a semi-supervised OCR use case

Even in the age of big data, labelled data is a scarce resource in many machine learning use cases. We evaluate generative adversarial networks (GANs) on the task of extracting information from vehicle registration documents under varying amounts of labelled data and compare their performance with supervised learning techniques. Additionally using unlabelled data yields a significant improvement in accuracy.

Tags: Artificial Intelligence, Computer Vision, Deep Learning & Artificial Intelligence, Data Science, Machine Learning

Scheduled on Wednesday 14:00 in room media

Speaker

Florian Wilhelm (@FlorianWilhelm)

I am a Data Scientist with a mathematical background, living in Cologne, Germany. After my postdoctoral position I started as a Data Scientist at Blue Yonder, the leading platform provider for Predictive Applications and Big Data in the European market. Right now I enjoy working on innovative Data Science projects with experts every day at inovex. With more than five years of project experience in the field of Predictive & Prescriptive Analytics and Big Data, I have acquired profound knowledge in the domains of mathematical modelling, statistics, machine learning, high-performance computing and data mining.

For the past several years I have programmed mostly with the Python Data Science stack (NumPy, SciPy, Scikit-Learn, Pandas, Matplotlib, Jupyter, etc.), to which I have also contributed several extensions.

Description

Online vehicle marketplaces are embracing artificial intelligence to ease the process of selling a vehicle on their platform. The tedious work of copying information from the vehicle registration document into some web form can be automated with the help of smart text spotting systems. The seller takes a picture of the document and the necessary information is extracted automatically.

We introduce the components of a text spotting system, including the subtasks of object detection and optical character recognition (OCR). In view of our use case, we elaborate on the challenges of OCR in documents with various distortions and artefacts, which rule out off-the-shelf products for this task.

After an introduction to semi-supervised learning based on generative adversarial networks (GANs), we evaluate the performance gains of this method compared to supervised learning. More specifically, for varying amounts of labelled data, the accuracy of a convolutional neural network (CNN) is compared to that of a GAN which additionally uses unlabelled data during the training phase.
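
To make the idea of training on unlabelled data more concrete, here is a minimal NumPy sketch of one common formulation of the semi-supervised GAN discriminator loss (the K+1-class formulation from "Improved Techniques for Training GANs" by Salimans et al.). Whether the talk uses exactly this formulation is an assumption; the function names are hypothetical and chosen for illustration only. The discriminator outputs K class logits; labelled samples contribute an ordinary cross-entropy term, while unlabelled and generated samples contribute a real-versus-fake term derived from the same logits.

```python
import numpy as np


def logsumexp(logits):
    """Numerically stable log(sum(exp(logits))) over the class axis."""
    m = logits.max(axis=1, keepdims=True)
    return (m + np.log(np.exp(logits - m).sum(axis=1, keepdims=True))).squeeze(1)


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return np.logaddexp(0.0, x)


def supervised_loss(logits, labels):
    """Cross-entropy over the K real classes, for labelled samples only."""
    true_logit = logits[np.arange(len(labels)), labels]
    return float(np.mean(logsumexp(logits) - true_logit))


def unsupervised_loss(logits_unlabelled, logits_generated):
    """Real-vs-fake loss with the (K+1)-th 'fake' logit fixed to 0.

    With Z(x) = sum_k exp(logit_k), the probability of being real is
    D(x) = Z(x) / (Z(x) + 1), hence
      -log D(x)       = softplus(lse) - lse
      -log(1 - D(x))  = softplus(lse)
    where lse = log Z(x).
    """
    lse_real = logsumexp(logits_unlabelled)
    lse_fake = logsumexp(logits_generated)
    loss_real = np.mean(softplus(lse_real) - lse_real)  # unlabelled data should look real
    loss_fake = np.mean(softplus(lse_fake))             # generated data should look fake
    return float(loss_real + loss_fake)


def discriminator_loss(logits_labelled, labels, logits_unlabelled, logits_generated):
    """Total discriminator loss: supervised plus unsupervised part."""
    return supervised_loss(logits_labelled, labels) + unsupervised_loss(
        logits_unlabelled, logits_generated
    )
```

In this sketch a purely supervised CNN baseline corresponds to minimising `supervised_loss` alone, so the comparison described above amounts to asking how much the extra `unsupervised_loss` term helps as the number of labelled samples shrinks.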

We conclude that GANs significantly outperform classical CNNs in use cases where labelled data is scarce. Regarding our use case of extracting information from vehicle registration documents, we show that our text spotting system easily exceeds an accuracy of 99.5%, making it applicable in a real-world setting.