Oliver Lemp

Data Scientist, ENGEL Austria GmbH

With a background in Arts and Bioinformatics, Oliver deepened his knowledge of web development and data science at various startups and research institutions. His previous internships and positions focused on bioinformatics algorithms and NLP on social media data.

He then made a sharp change of industry and has been working at ENGEL, Austria’s largest machine manufacturer, for two years as the lead engineer for data science. He faces new challenges every day (such as understanding injection moulding machines) and works to organise and make sense of machine data, while bringing the value of data science and creativity to the traditional machine manufacturing sector.

Past sessions

Summit Europe 2020: So erschließen Sie das Potenzial älterer Maschinendaten (How to Unlock the Potential of Legacy Machine Data)

November 18, 2020 04:00 PM PT

ENGEL, founded in 1945, is now the world’s leading manufacturer of injection moulding machines. Since then, and especially in recent years, the amount of data has grown immensely and has become increasingly heterogeneous with newer generations of machine controls. A closer look at each machine’s accumulated log files reveals 13 different types of timestamps, different archive formats and further peculiarities of each control generation. This makes automatically processing and analysing the data difficult.
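To illustrate what such heterogeneity means in practice, here is a minimal PySpark sketch of normalising timestamp strings that arrive in several formats; the sample values, column name and format list are illustrative assumptions, not ENGEL’s actual schema.

    from pyspark.sql import SparkSession
    import pyspark.sql.functions as F

    spark = SparkSession.builder.appName("legacy-timestamp-normalisation").getOrCreate()

    # Sample raw timestamp strings in a few hypothetical legacy formats.
    df = spark.createDataFrame(
        [("2020-11-18 16:00:00",), ("18.11.2020 16:00:00",), ("20201118160000",)],
        ["raw_ts"],
    )

    # Candidate patterns; a real pipeline would enumerate every variant found in the logs.
    CANDIDATE_FORMATS = [
        "yyyy-MM-dd HH:mm:ss",
        "dd.MM.yyyy HH:mm:ss",
        "yyyyMMddHHmmss",
    ]

    # With Spark's default (non-ANSI) parsing, to_timestamp yields NULL when a
    # pattern does not match, so coalesce keeps the first format that parses.
    parsed = F.coalesce(*[F.to_timestamp(F.col("raw_ts"), fmt) for fmt in CANDIDATE_FORMATS])
    df.withColumn("event_time", parsed).show(truncate=False)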

In this talk, you will explore how ENGEL centralised this data in one place, set up a data pipeline that ingests batch-oriented data in a streaming fashion, and migrated that pipeline from an on-premises Hadoop setup to the cloud using Databricks.
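As a rough idea of how batch-dropped files can be picked up in a streaming fashion on Databricks, the following is a minimal sketch using Auto Loader (the cloudFiles source) with Structured Streaming; the paths, options and trigger choice are assumptions for illustration, not ENGEL’s actual pipeline configuration.

    # `spark` is the session provided by the Databricks runtime.
    (
        spark.readStream
        .format("cloudFiles")                       # Auto Loader: incremental file discovery
        .option("cloudFiles.format", "binaryFile")  # keep raw archives; unwrap them downstream
        .load("/mnt/raw/machine-logs/")             # hypothetical landing zone for field uploads
        .writeStream
        .format("delta")
        .option("checkpointLocation", "/mnt/checkpoints/machine_logs_bronze")
        .trigger(once=True)                         # drain newly arrived files, then stop
        .start("/mnt/delta/bronze/machine_logs")    # append into a bronze Delta location
    )

Run on a schedule, a job like this processes whatever files have landed since the last run, keeping the ingestion logic streaming-shaped even though the data arrives in batches.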

Together with Oliver Lemp, Data Scientist at ENGEL, dive into the journey of integrating legacy data, where you will learn how to manage the following aspects:

  • Ingesting classified and heterogeneous data from field engineers
  • Unwrapping legacy data with native libraries in Spark
  • Moving a batch-oriented architecture to a streaming architecture
  • Partitioning and maintaining non-time-series data for millions of variables (see the sketch after this list)
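For the last point, here is a minimal sketch, assuming a Delta Lake table layout and hypothetical table and column names (control_generation, machine_id, variable_name) that the session does not disclose, of partitioning non-time-series variable data by a stable, low-cardinality key instead of by time.

    # Hypothetical decoded output of the unwrapping step; not ENGEL's actual table.
    decoded_df = spark.table("bronze.machine_logs_decoded")

    (
        decoded_df.write
        .format("delta")
        .partitionBy("control_generation")   # stable, low-cardinality key rather than time
        .mode("append")
        .saveAsTable("silver.machine_variables")
    )

    # Periodic compaction and clustering keep lookups across millions of variables fast
    # (Databricks SQL): OPTIMIZE silver.machine_variables ZORDER BY (machine_id, variable_name)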

Speaker: Oliver Lemp