Diamond sponsor Microsoft and Azure Databricks customers to present keynotes and breakout sessions at Data + AI Summit 2021. Register for free.
Data + AI Summit 2021 is the global data community event, where practitioners, leaders and visionaries come together to shape the future of data and AI. Data teams will participate from all over the world to level up their knowledge on highly technical topics presented by leading experts from industry, research and academia. We are excited to have Microsoft as a Diamond sponsor, bringing Microsoft and Azure Databricks customers together for a lineup of great keynotes and sessions.
Rohan Kumar, Corporate Vice President of Azure Data, returns for the fourth consecutive year as a keynote speaker alongside Azure Databricks customers, including Humana, T-Mobile, Anheuser-Busch InBev, Estée Lauder and EFSA. Below are some of the top sessions to add to your agenda.
Keynote with Rohan Kumar
Microsoft During the THURSDAY MORNING KEYNOTE, 8:30 AM – 10:30 AM (PDT)
Rohan Kumar, Corporate Vice President of Azure Data, will join Databricks CEO Ali Ghodsi for a fireside chat to highlight how Azure customers are leveraging open source and open standards using Azure Databricks and other Azure Data services to accelerate data and AI innovation.
Keynote with Sol Rashidi
Estée Lauder During the WEDNESDAY AFTERNOON KEYNOTE, 1:00 PM – 2:30 PM (PDT)
Sol Rashidi, Chief Analytics Officer at Estée Lauder, will be joining us to share insights on how practitioners in the Data + AI community should adopt a product-centric mindset. Prior to Estée Lauder, Sol held executive roles on data strategy at Merck, Sony, Royal Caribbean, EY and IBM.
DevOps for Databricks
Advancing Analytics WEDNESDAY, 12:05 PM – 12:35 PM (PDT)
Applying DevOps to Databricks can be a daunting task. This session will break down common DevOps topics including CI/CD, Infrastructure as Code and Build Agents. Explore how to apply DevOps to Databricks (in Azure), primarily using Azure DevOps tooling.
CI/CD in MLOps – Implementing a Framework for Self-Service Everything
J.B. Hunt and Artis Consulting WEDNESDAY, 3:15 PM – 3:45 PM (PDT)
How can companies create predictable, repeatable, secure self-service workflows for their data science teams? Discover how J. B. Hunt, in collaboration with Artis Consulting, created an MLOps framework using automated conventions and well-defined environment segmentation. Attendees will learn how to achieve predictable testing, repeatable deployment and secure self-service Databricks resource management throughout the local/dev/test/prod promotion lifecycle.
Predicting Optimal Parallelism for Data Analytics
Microsoft WEDNESDAY, 3:50 PM – 4:20 PM (PDT)
A key benefit of serverless computing is that resources can be allocated on demand, but the number of resources requested and allocated for a job can profoundly impact its running time and cost. For a job that has not yet run, how can we provide users with an estimate of how the job’s performance changes with provisioned resources so they can make an informed choice upfront about cost-performance tradeoffs?
Accelerate Analytics On Databricks
Microsoft and WANdisco WEDNESDAY, 4:25 PM – 4:55 PM (PDT)
Enterprises are investing in data modernization initiatives to reduce cost, improve performance, and enable faster time to insight and innovation. These initiatives are driving the need to move petabytes of data to the cloud without interruption to existing business operations. This session will share some “been there done that” stories of successful Hadoop migrations/replications using a LiveData strategy in partnership with Microsoft Azure and Databricks.
ML/AI in the Cloud: Reinventing Data Science at Humana
Humana WEDNESDAY, 4:25 PM – 4:55 PM (PDT)
Humana strives to help the communities it serves achieve the best health – no small task in the past year! The data team at Humana had the opportunity to rethink existing operations and reimagine what a collaborative ML platform for hundreds of data scientists might look like. The primary goal of the team's ML platform is to automate and accelerate the delivery lifecycle of data science solutions at scale. In this presentation, walk through an end-to-end example of how to build a model at scale on FlorenceAI and deploy it to production. Tools highlighted include Azure Databricks, MLflow, AppInsights, and Azure Data Factory.
Advanced Model Comparison and Automated Deployment Using ML
T-Mobile WEDNESDAY, 5:00 PM – 5:30 PM (PDT)
At T-Mobile, when a new account is opened, there are fraud checks that occur both pre- and post-activation. Fraud that is missed tends to fall into first payment default, looking like a delinquent new account. In this session, walk through how the team at T-Mobile leveraged ML in an initiative to investigate newly created accounts headed towards delinquency and find additional fraud.
Wizard Driven AI Anomaly Detection with Databricks in Azure
Kavi Global WEDNESDAY, 5:00 PM – 5:30 PM (PDT)
Fraud is prevalent in every industry and growing at an increasing rate, as the volume of transactions increases with automation. The National Healthcare Anti-Fraud Association estimates $350B of fraudulent spending. Forbes estimates $25B spending by US banks on anti-money laundering compliance. At the same time, as fraud and anomaly detection use cases are booming, the skills gap of expert data scientists available to perform fraud detection is widening. The Kavi Global team will present a cloud native, wizard-driven AI anomaly detection solution and two client success stories across the pharmaceutical and transportation industries.
Accelerating Data Ingestion with Databricks Autoloader
Advancing Analytics THURSDAY, 11:35 AM – 12:05 PM (PDT)
Tracking which incoming files have been processed has always required thought and design when implementing an ETL framework. The Autoloader feature of Databricks looks to simplify this process, removing the pain of file watching and queue management. However, there can also be a lot of nuance and complexity in setting up Autoloader and managing the process of ingesting data using it. After implementing an automated data loading process in a major US CPMG, Simon Whiteley has some lessons to share from the experience.
Building A Product Assortment Recommendation Engine
Anheuser-Busch InBev THURSDAY, 11:35 AM – 12:05 PM (PDT)
Amid the increasingly competitive brewing industry, the ability of retailers and brewers to provide optimal product assortments for their consumers has become a key goal for business stakeholders. Consumer trends, regional heterogeneities and massive product portfolios combine to scale the complexity of assortment selection. At AB InBev, the data team approaches this selection problem through a two-step method rooted in statistical learning techniques.
With the ultimate goal of scaling this approach to over 100k brick-and-mortar retailers and online platforms, the team implemented its algorithms in custom-built Python libraries using Apache Spark. Learn more in this expert-led session.
Video Analytics At Scale: DL, CV, ML On Databricks Platform
Blueprint Technologies THURSDAY, 3:15 PM – 3:45 PM (PDT)
Don’t miss this live demo and reflection on lessons learned from building and publishing an advanced video analytics solution in the Azure Marketplace. This is a deep technical dive into the engineering and data science employed throughout, covering the challenges encountered in combining Deep Learning and Computer Vision for object detection and tracking; the operational management and tool-building efforts for scaling video processing and insights extraction to large GPU/CPU Databricks clusters; and the machine learning required to detect behavioral patterns, anomalies and scene similarities across processed video tracks.
The entire solution was built using open source Scala, Python, Spark 3.0, MXNet, PyTorch and scikit-learn, as well as Databricks Connect.
Raven: End-to-end Optimization of ML Prediction Queries
Microsoft FRIDAY, 10:30 AM – 11:00 AM (PDT)
ML models are typically part of prediction queries that consist of a data processing part (e.g., for joining, filtering, cleaning, featurization) and an ML part invoking one or more trained models. Team members from Microsoft have identified significant and unexplored opportunities for optimizing such queries. In this presentation, they will introduce Raven, an end-to-end optimizer for prediction queries. Raven relies on a unified intermediate representation that captures both data processing and ML operators in a single graph structure.
Building the Foundations of an Intelligent, Event-Driven Data Platform at EFSA
EFSA FRIDAY, 10:30 AM – 11:00 AM (PDT)
EFSA is the European agency providing independent scientific advice on existing and emerging risks across the entire food chain. Earlier this year, a new EU regulation (EU 2019/1381) was enacted, requiring EFSA to significantly increase the transparency of its risk assessment processes towards all citizens. To comply with this new regulation, delaware BeLux is helping EFSA in its digital transformation. The team at delaware has been designing and rolling out a modern data platform running on Azure and powered by Databricks that acts as a central control tower brokering data between a variety of applications. It is built around modularity principles, making it adaptable and versatile while keeping the overall ecosystem aligned with changing processes and data models. Watch this session to learn how they did it.
Building a Data Science as a Service platform in Azure
Advancing Analytics FRIDAY, 11:05 AM – 11:35 AM (PDT)
ML in the enterprise is rarely delivered by a single team. To enable ML across an organization, you need to target a variety of different skills, processes, technologies and maturities. Doing this is incredibly hard and requires a composite of different techniques to deliver a single platform that empowers all users to build and deploy ML models. This session is delivered in collaboration with Ageas Insurance UK and Advancing Analytics. In this session, explore how Databricks enabled a Data Science-as-a-Service platform for Ageas Insurance UK that empowers users of all skill levels to build and deploy models and realize ROI earlier.
Build Real-Time Applications with Databricks Streaming
Insight Digital Innovation FRIDAY, 11:40 AM – 12:10 PM (PDT)
In this presentation, study a use case the team at Insight Digital Innovation recently implemented for a large metropolitan fire department. The company has already created a complete analytics architecture for the department based upon Azure Data Factory, Databricks, Delta Lake, Azure SQL and SQL Server Analysis Services (SSAS). While this architecture works very well for the department, they would like to add a real-time channel to their reporting infrastructure. In this presentation, see how they leverage Databricks, Spark Structured Streaming, Delta Lake and the Azure platform to create this real-time delivery channel.
Sign up today!
Register today for Data + AI Summit 2021! Discover new best practices, learn new technologies and connect with your peers. If you have questions about Azure Databricks or Azure service integrations, meet us in the Microsoft Azure portal at Data + AI Summit.
For more information about Azure Databricks, visit databricks.com/azure