Patryk Oleniuk

Data Engineer, Virgin Hyperloop One

Patryk is a Data Engineer at Virgin Hyperloop One, a company building the fifth mode of transportation. He graduated from EPFL (the Swiss Federal Institute of Technology in Lausanne) with a major in Information Technologies. Previously, he worked at CERN, where he wrote test software for the world's biggest particle accelerator, as well as at National Instruments and Samsung R&D. When he isn't glued to a computer screen, he spends his time road-tripping around California with his friends.

UPCOMING SESSIONS

Generative Hyperloop Design: Managing Massively Scaled Simulations Focused on Quick-Insight Analytics and Demand Modelling (Summit 2020)

Virgin Hyperloop One is the leader in realizing a Hyperloop mass transportation system (VHOMTS), which will bring cities and people closer together than ever before while reducing pollution, greenhouse-gas emissions, and transit times. To build a safe and user-friendly Hyperloop, we need to answer key technical and business questions, including:

  • What is the maximum speed at which the hyperloop can safely travel?
  • How many pods (the vehicles that carry people) do we need to fulfill a given demand?

These questions must be answered accurately to convince regulators, operators, and governments, so that we can realize our ambitious goals within years instead of decades. To answer them, we have built a large-scale, configurable simulation framework that takes a diverse set of inputs, such as route information, demand and population data, and pod performance parameters. How do we reduce time-to-insight so we can iterate on Hyperloop models faster? We have developed a generic execution and analytics framework around our core system simulation to achieve key objectives of scale, concurrency, and speed. In this presentation, we will discuss the design of this framework, the challenges we encountered, and how we addressed them.

We will showcase the following points in detail:

  1. Utilizing the power of the cloud to execute multiple simulations in parallel and at scale.
  2. Data Pipelines for:
    • Gathering demand and socio-economic data
    • Training and comparing demand prediction models (ARIMA, LSTM, XGBoost) with Keras & MLflow (see the model-comparison sketch after this list).
    • Analyzing massive simulation output data with Spark and Koalas (see the Koalas sketch after this list).
  3. Managing and executing pipelines, including data provenance and element traceability, with NiFi and Postgres.
  4. How we compare the reports from large batch simulations using MLflow (see the run-comparison sketch after this list).
  5. A video of our simulation and test-result comparisons, including the impact of different demand prediction models for a prospective Hyperloop.
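
To make point 2 more concrete, here is a minimal sketch of how competing demand models could be logged and compared with MLflow tracking. This is an illustration under assumptions, not the code from the talk: the experiment name, the error metric, and the placeholder forecast values are all hypothetical.

  # Hypothetical sketch: log each candidate demand model as an MLflow run
  # so their errors can be compared side by side in the tracking UI.
  import mlflow
  import numpy as np
  from sklearn.metrics import mean_absolute_error

  mlflow.set_experiment("hyperloop-demand")  # hypothetical experiment name

  def log_model_run(model_name, y_true, y_pred):
      # One MLflow run per candidate model; the logged params and metrics
      # show up as comparable columns in the MLflow UI.
      with mlflow.start_run(run_name=model_name):
          mlflow.log_param("model", model_name)
          mlflow.log_metric("mae", mean_absolute_error(y_true, y_pred))

  # In the real pipeline these forecasts would come from the trained
  # ARIMA / LSTM / XGBoost models; here they are placeholder numbers.
  y_true = np.array([120.0, 135.0, 150.0])
  forecasts = {
      "ARIMA":   np.array([118.0, 140.0, 149.0]),
      "LSTM":    np.array([125.0, 133.0, 155.0]),
      "XGBoost": np.array([121.0, 136.0, 148.0]),
  }
  for name, y_pred in forecasts.items():
      log_model_run(name, y_true, y_pred)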
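Similarly, for the Spark and Koalas analytics step, a hedged sketch of the general pattern: Koalas exposes the pandas API on top of Spark DataFrames, so large simulation outputs can be summarized with familiar code. The path and column names below are invented for illustration.

  # Hypothetical sketch: pandas-style aggregation over massive simulation
  # output, executed in a distributed fashion by Spark via Koalas.
  import databricks.koalas as ks

  # Each row might represent one timestep of one simulation run.
  kdf = ks.read_parquet("/mnt/simulations/output")  # illustrative path

  # Distributed groupby/aggregate, written exactly like pandas.
  summary = kdf.groupby("run_id").agg({"pod_speed": "max",
                                       "energy_kwh": "sum"})
  print(summary.head())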
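Finally, for point 4, one possible way to compare reports across a batch of runs is to query the tracking server back into a DataFrame. This assumes a recent MLflow version where search_runs accepts experiment_names, and reuses the hypothetical experiment and metric from the first sketch.

  # Hypothetical sketch: fetch all logged runs as a pandas DataFrame and
  # rank the demand models by their logged error metric.
  import mlflow

  runs = mlflow.search_runs(experiment_names=["hyperloop-demand"])
  print(runs[["tags.mlflow.runName", "metrics.mae"]]
        .sort_values("metrics.mae"))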

PAST SESSIONS