Databricks, Diamond sponsor Microsoft and Azure Databricks customers to present keynotes and breakout sessions at Data + AI Summit Europe.
Data + AI Summit Europe is the free virtual event for data teams — data scientists, engineers and analysts — who will tune in from all over the world to share best practices, discover new technologies, connect and learn. We are excited to have Microsoft as a Diamond sponsor, bringing Microsoft and Azure Databricks customers together for a lineup of great keynotes and sessions.
Rohan Kumar, Corporate Vice President of Azure Data, returns as a keynote speaker for the third year in a row, along with presenters from a number of Azure Databricks customers including Unilever, Daimler, Henkel, SNCF, Fluvius, Kaizen Gaming and DataSentics. Below are some of the top sessions to add to your agenda:
KEYNOTE
Keynote from Phinean Woodward
Unilever: During the WEDNESDAY MORNING KEYNOTE, 8:30 AM - 10:30 AM (GMT)
Phinean Woodward, Head of Architecture, Information and Analytics, Unilever
KEYNOTE
Keynote from Stephan Schwarz
Daimler: During the THURSDAY MORNING KEYNOTE, 8:30 AM - 10:30 AM (GMT)
Stephan Schwarz, Production Planning: Manager Smart Data Processing (Mercedes Operations), Daimler
KEYNOTE
Keynote from Rohan Kumar
Microsoft: During the THURSDAY MORNING KEYNOTE, 8:30 AM - 10:30 AM (GMT)
Rohan Kumar, Corporate Vice President, Azure Data, Microsoft
Sarah Bird, AI Research and Products, Microsoft
Responsible ML is one of the most talked-about fields in AI at the moment. With the growing importance of ML, it is ever more important for us to exercise ethical AI practices and ensure that the models we create live up to the highest standards of inclusiveness and transparency. Join Rohan Kumar as he talks about how Microsoft brings cutting-edge research into the hands of customers to help them be more accountable for their models and more responsible in their use of AI. For the AI community, this is an open invitation to collaborate and help shape the future of Responsible ML.
Building the Next-gen Digital Meter Platform for Fluvius
Fluvius WEDNESDAY, 3:35 PM - 4:05 PM (GMT)
Fluvius is the network operator for electricity and gas in Flanders, Belgium. Their goal is to modernize the way people look at energy consumption using a digital meter that captures consumption and injection data from any electrical installation in Flanders, from households to large companies. After the full roll-out, roughly 7 million digital meters will be active in Flanders, collecting terabytes of data per day. Combine this with regulation requiring Fluvius to keep a record of these readings for at least 3 years, and we are talking petabyte scale. delaware BeLux was assigned by Fluvius to set up a modern data platform, and did so on Azure using Databricks as the core component to collect, store, process and serve these volumes of data to every single consumer in Flanders and beyond. This enables the Belgian energy market to innovate and move forward. Maarten served as project manager and solution architect.
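As a rough sanity check on the petabyte figure (the exact daily volume below is our assumption; the abstract only says "terabytes of data per day"):

```python
# Back-of-the-envelope check of the petabyte claim. The ~2 TB/day figure is
# an assumption; the abstract only says "terabytes of data per day".
TB_PER_DAY = 2
RETENTION_DAYS = 3 * 365  # regulation: keep readings for at least 3 years
total_tb = TB_PER_DAY * RETENTION_DAYS
print(f"{total_tb} TB ~= {total_tb / 1024:.1f} PB retained")  # 2190 TB ~= 2.1 PB
```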
Building an MLOps Platform Around MLflow to Enable Model Productionalization in Just a Few Minutes
DataSentics WEDNESDAY, 3:35 PM - 4:05 PM (GMT)
Getting machine learning models to production is notoriously difficult: it involves multiple teams (data scientists, data and machine learning engineers, operations, …) who often do not communicate with each other very well; the model may be trained in one environment but productionalized in a completely different one; and it is not just about the code, but also about the data (features) and the model itself. At DataSentics, a machine learning and cloud engineering studio, we see this struggle firsthand, on both our internal projects and our clients' projects.
To address the issue, we decided to build a dedicated MLOps platform that provides the necessary tooling, automation and standards to speed up the model productionalization process and make it more robust. The central piece of the puzzle is MLflow, the leading open-source model lifecycle management tool, around which we develop additional functionality and integrations with other systems, in our case primarily the Azure ecosystem (e.g. Azure Databricks, Azure DevOps and Azure Container Instances). Our key design goal is to reduce the time spent by everyone involved in the process of model productionalization to just a few minutes.
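To make this concrete, here is a minimal sketch of the kind of MLflow workflow such a platform automates: logging a trained model, registering it in the Model Registry and promoting it to production. The model, names and the assumption of a registry-backed tracking server (e.g. Azure Databricks) are illustrative, not details from the talk.

```python
# Minimal sketch of the model lifecycle a platform like this automates.
# Assumes an MLflow tracking server with the Model Registry enabled
# (e.g. Azure Databricks); the model and names are illustrative only.
import mlflow
import mlflow.sklearn
from mlflow.tracking import MlflowClient
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=500, random_state=42)
model = RandomForestClassifier(max_depth=5, random_state=42).fit(X, y)

with mlflow.start_run():
    mlflow.log_param("max_depth", 5)
    mlflow.log_metric("train_accuracy", model.score(X, y))
    # Logging with registered_model_name creates a new version in the registry.
    mlflow.sklearn.log_model(model, "model", registered_model_name="demo-model")

# Promote the newest version once validation passes -- the step a platform
# can wire into CI/CD (e.g. Azure DevOps) so it takes minutes, not weeks.
client = MlflowClient()
version = client.get_latest_versions("demo-model", stages=["None"])[0].version
client.transition_model_version_stage("demo-model", version, stage="Production")
```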
The Pill for Your Migration Hell
Microsoft WEDNESDAY, 4:45 PM - 5:15 PM (GMT)
This is the story of a great software war. Migrating big data legacy systems always involves great pain and sleepless nights. Migrating big data systems with multiple pipelines and machine learning models only adds to the existing complexity. What about migrating legacy systems that protect the Microsoft Azure cloud backbone from network cyber attacks? That adds pressure and immense responsibility. In this session, we will share our migration story: migrating a machine learning-based product with thousands of paying customers that processes petabytes of network events a day. We will talk about our migration strategy, how we broke the system down into migratable parts, tested every piece of every pipeline, validated results and overcame challenges. Lastly, we will share why we picked Azure Databricks as our new modern environment for both data engineers' and data scientists' workloads.
End to End Supply Chain Control Tower
Henkel THURSDAY, 11:00 AM - 11:30 AM (GMT)
Traditional ERP and management systems are usually used to manage the supply chain from either the point of origin or the point of destination, both of which are primarily physical locations. Around these, you have several processes such as order to cash, source to pay, physical distribution and production.
Our supply chain control tower is not tied to a single location, nor confined to a single part of the supply network hierarchy. It focuses on gathering and storing real-time data and offers a single point of information for all data points. We are able to aggregate data from inventory, warehouse, production, planning and other systems to guide improvements and mitigate exceptions, supporting efficient supply network operations across our end-to-end value chain.
This allows us to build cross-functional, data-based applications; one example is digital sales and operations planning, a very powerful tool to align operations execution with our financial goals.
All this is made possible by a future-proof big data architecture and strong partnerships with suppliers such as Microsoft and Databricks.
Bank Struggles Along the Way for the Holy Grail of Personalization: Customer 360
DataSentics THURSDAY, 1:35 PM - 2:05 PM (GMT)
Ceska sporitelna is one of the largest banks in Central Europe, and one of its main goals is to improve the customer experience by weaving together the digital and traditional banking approaches. The talk will focus on the real-world challenges (both technical and organizational) of moving the vision from PowerPoint slides into production:
- Implementing a Spark- and Databricks-centric analytics platform in the Azure cloud, combined with an on-premises data lake, in the EU-regulated financial environment
- Forming a new team focused on solving use cases on top of Customer 360 in a 10,000+ employee enterprise
- Demonstrating this effort on real use cases, such as client risk scoring using both offline and online data
- Using Spark and its MLlib as an enabler for turning hundreds of millions of client interactions into personalized omni-channel CRM campaigns
Personalization Journey: From Single Node to Cloud Streaming
Kaizen THURSDAY, 1:35 PM - 2:05 PM (GMT)
In the online gaming industry, we receive a vast number of transactions that need to be handled in real time. Our customers get to choose from hundreds or even thousands of options, and providing a seamless experience is crucial in our industry. Recommendation systems can be the answer in such cases, but they require handling loads of data and utilizing large amounts of processing power. Toward this goal, over the last two years we have gone down the road of machine learning and AI in order to transform our customers' daily experience and upgrade our internal services.
In this long journey, we have used Databricks on the Azure cloud to distribute our workloads and get the processing-power flexibility we needed, along with the stack that empowered us to move forward. By using MLflow we are able to track experiments and model deployment; by using Spark Streaming and Kafka we moved from batch processing to streaming; and by using Delta Lake we were able to bring reliability to our data lake and assure data quality. In our talk, we will share our transformation steps, the significant challenges we faced and the insights gained from this process.
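As a flavor of what that batch-to-streaming move looks like in practice, here is a minimal Spark Structured Streaming sketch that reads events from Kafka and lands them in a Delta Lake table. The broker, topic, schema and paths are illustrative assumptions, not details from the talk.

```python
# Illustrative sketch: Kafka -> Spark Structured Streaming -> Delta Lake.
# Requires the spark-sql-kafka and Delta Lake packages (both available on
# Databricks); broker, topic, schema and paths are assumptions.
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, from_json
from pyspark.sql.types import (DoubleType, StringType, StructField,
                               StructType, TimestampType)

spark = SparkSession.builder.appName("transactions-stream").getOrCreate()

schema = StructType([
    StructField("customer_id", StringType()),
    StructField("game_id", StringType()),
    StructField("amount", DoubleType()),
    StructField("event_time", TimestampType()),
])

events = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")  # hypothetical broker
    .option("subscribe", "transactions")               # hypothetical topic
    .load()
    .select(from_json(col("value").cast("string"), schema).alias("e"))
    .select("e.*")
)

# Delta Lake adds ACID transactions and schema enforcement to the data lake,
# which is what "bringing reliability" refers to above.
(events.writeStream
    .format("delta")
    .option("checkpointLocation", "/delta/transactions/_checkpoints")
    .start("/delta/transactions"))
```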
Building a Streaming Data Pipeline for Train Delays Processing
SNCF THURSDAY, 2:10 PM - 2:40 PM (GMT)
SNCF (the French National Railway Company) has distributed a network of beacons over its 32,000 km of train tracks, triggering a flow of events at each train passage. In this talk, we will present how we built real-time processing on top of this data to monitor traffic and map the propagation of train delays. During the presentation, we will demonstrate how to build an end-to-end solution, from ingestion to exposure.
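To illustrate the kind of processing involved, here is a minimal sketch of computing rolling per-beacon delays with Spark Structured Streaming. The source table, schema and windowing choices are our assumptions, not SNCF's actual pipeline.

```python
# Illustrative sketch of delay monitoring over a stream of beacon passages.
# The source table, schema and window sizes are assumptions, not SNCF's design.
from pyspark.sql import SparkSession
from pyspark.sql.functions import avg, col, window

spark = SparkSession.builder.appName("train-delays").getOrCreate()

# Assume ingestion has already parsed beacon events into columns:
# train_id, beacon_id, expected_time, actual_time (event-time timestamps).
passages = spark.readStream.format("delta").load("/delta/beacon_passages")

delays = (
    passages
    # Delay in seconds observed at each beacon passage.
    .withColumn("delay_s",
                col("actual_time").cast("long") - col("expected_time").cast("long"))
    # Tolerate events arriving up to 10 minutes late, then aggregate
    # per beacon over 5-minute event-time windows.
    .withWatermark("actual_time", "10 minutes")
    .groupBy(window("actual_time", "5 minutes"), "beacon_id")
    .agg(avg("delay_s").alias("avg_delay_s"))
)

# Expose the rolling delay picture for monitoring and delay-propagation maps.
(delays.writeStream
    .outputMode("append")
    .format("delta")
    .option("checkpointLocation", "/delta/train_delays/_checkpoints")
    .start("/delta/train_delays"))
```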