Events - Databricks

Events

AWS + Databricks Dev Day Workshop | San Francisco

Regional Event

San Francisco, CA

In this workshop, we’ll cover best practices for enterprises that want to use powerful open source technologies to simplify and scale their ML efforts. We’ll discuss how to leverage Apache Spark™, the de facto data processing and analytics engine in enterprises today, for data preparation, as it unifies data at massive scale across various sources. You’ll also learn how to use ML frameworks (e.g. TensorFlow, XGBoost, and scikit-learn) to train models based on different requirements. Finally, you’ll learn how to use MLflow to track experiment runs between multiple users within a reproducible environment, and to manage the deployment of models to production on Amazon SageMaker.

Building Reliable Data Lakes with Delta Lake | Hands-on Lab – Dallas

Regional Event

Dallas, TX

Delta Lake is an open source storage layer that brings reliability to data lakes. Its reliability features include ACID transactions, scalable metadata handling, and unified streaming and batch data processing. Delta Lake runs on top of your existing data lake, whether on Azure Data Lake Storage, AWS S3, Hadoop HDFS, or on-premises storage, and is fully compatible with Apache Spark APIs.

Join this hands-on lab to learn how Delta Lake can help you build robust production data pipelines at scale.

AWS + Databricks Dev Day Workshop | Munich

Partner Event

Europe

In this workshop, we’ll cover best practices for enterprises that want to use powerful open source technologies to simplify and scale their ML efforts. We’ll discuss how to leverage Apache Spark™, the de facto data processing and analytics engine in enterprises today, for data preparation, as it unifies data at massive scale across various sources. You’ll also learn how to use ML frameworks (e.g. TensorFlow, XGBoost, and scikit-learn) to train models based on different requirements. Finally, you’ll learn how to use MLflow to track experiment runs between multiple users within a reproducible environment, and to manage the deployment of models to production on Amazon SageMaker.

Building Reliable Data Lakes with Delta Lake | Hands-on Lab – Menlo Park

Regional Event

Menlo Park, CA

Delta Lake is an open source storage layer that brings reliability to data lakes. Its reliability features include ACID transactions, scalable metadata handling, and unified streaming and batch data processing. Delta Lake runs on top of your existing data lake, whether on Azure Data Lake Storage, AWS S3, Hadoop HDFS, or on-premises storage, and is fully compatible with Apache Spark APIs.

Join this hands-on lab to learn how Delta Lake can help you build robust production data pipelines at scale.

Top Five Delta Lake Tips

Webinar

Online

Join Quentin Ambard, Solution Architect at Databricks, for this webinar on best practices and tips for key Delta Lake features:

  • Create a clean data lake with Delta: use schema enforcement and expectations to ensure data quality
  • Support concurrent queries with ACID transactions: run consistent selects while data is added to the table
  • Be GDPR ready: safely delete data and merge tables
  • Go back in time: trace your modifications and restore previous data

Paris Data Engineering Meetup #13 ~ Machine Learning Engineering

Meetup

Paris, France

Join the Databricks team at the next Paris Data Engineers Meetup on 9 July. Arduino Cascella, Solution Architect, will give a 45-minute presentation on MLflow, the solution for the machine learning lifecycle, from deployment to execution, alongside Xavier Dupre from Microsoft and Romain Sagean from Xebia. Seats are limited, so save your spot now!

AWS + Databricks Dev Day Workshop | London

Partner Event

Europe

In this workshop, we’ll cover best practices for enterprises that want to use powerful open source technologies to simplify and scale their ML efforts. We’ll discuss how to leverage Apache Spark™, the de facto data processing and analytics engine in enterprises today, for data preparation, as it unifies data at massive scale across various sources. You’ll also learn how to use ML frameworks (e.g. TensorFlow, XGBoost, and scikit-learn) to train models based on different requirements. Finally, you’ll learn how to use MLflow to track experiment runs between multiple users within a reproducible environment, and to manage the deployment of models to production on Amazon SageMaker.

ADS Drinks & Data: ADS Meets SIGMOD

Meetup

Amsterdam, Netherlands

Program

  • 17:30 Welcome drinks
  • 17:55 Introduction
    Peter Boncz, Professor at VU Amsterdam and Senior Researcher at CWI.
  • 18:00 Invited Talk #1: Infrastructure for Machine Learning: Ideas from Industry and Research
    Matei Zaharia, the original creator of Apache Spark, professor at Stanford University and Chief Technologist and Co-Founder of Databricks.
  • 18:30 Invited Talk #2: Ceres: Harvesting knowledge from the semi-structured web
    Xin Luna Dong, the Principal Scientist at Amazon, leading the Amazon Product Knowledge Graph and previously involved in Google’s Knowledge Vault.
  • 19:00 Invited Talk #3: Topic Pages: From Articles to Answers
    Deep Kayal, Senior Data Scientist at Elsevier, specializing in Natural Language Processing and Machine Learning.

Driving Your Success With Data and AI

Partner Event

Europe

Join Databricks, Microsoft, and GoDataDriven for this seminar on Friday, June 28, to gain insight into ways to realize business value from AI. Experienced speakers from Databricks, GoDataDriven, and Microsoft will share their experiences implementing AI solutions that add value to enterprise organizations. This seminar is supported by Xebia, Xpirit, and Binx.io.

Delta Lake Meetup: Open Source Reliability for Data Lake with Apache Spark by Michael Armbrust

Meetup

Los Angeles, CA

Delta Lake is an open source storage layer that brings reliability to data lakes. Delta Lake offers ACID transactions, scalable metadata handling, and unified streaming and batch data processing. It runs on top of your existing data lake and is fully compatible with Apache Spark APIs.