Events - Databricks



Automated Hyperparameter Tuning, Scaling and Tracking on Databricks


In this talk, we'll start with a brief survey of the most popular techniques for hyperparameter tuning (e.g., grid search, random search, and Bayesian optimization). We will then discuss open source tools that implement each of these techniques, helping to automate the search over hyperparameters. Finally, we will discuss and demo improvements we built for these tools in Databricks, including integration with MLflow:

  • Apache Spark MLlib (PySpark) integration with MLflow for automatically tracking tuning
  • Hyperopt integration with Apache Spark to distribute tuning and with MLflow for automatic tracking
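
To make the difference between two of the surveyed techniques concrete, here is a minimal sketch of grid search and random search in plain Python. It is not the Databricks, MLlib, or Hyperopt implementation: the `objective` function is a hypothetical stand-in for a validation loss, and the parameter names `lr` and `depth`, their ranges, and the trial count are assumptions chosen only for illustration.

```python
import itertools
import random


def objective(params):
    # Hypothetical stand-in for a validation loss (assumption, not a real model).
    # By construction it is minimized at lr=0.1, depth=5.
    return (params["lr"] - 0.1) ** 2 + 0.01 * (params["depth"] - 5) ** 2


def grid_search(grid):
    # Evaluate every combination of the listed values and keep the best one.
    names = list(grid)
    candidates = [
        dict(zip(names, values)) for values in itertools.product(*grid.values())
    ]
    return min(candidates, key=objective)


def random_search(n_trials, seed=0):
    # Sample each hyperparameter independently from its range:
    # lr log-uniformly in [0.001, 1], depth uniformly from the integers 2..10.
    rng = random.Random(seed)
    candidates = [
        {"lr": 10 ** rng.uniform(-3, 0), "depth": rng.randint(2, 10)}
        for _ in range(n_trials)
    ]
    return min(candidates, key=objective)


grid_best = grid_search({"lr": [0.001, 0.01, 0.1, 1.0], "depth": [2, 5, 8]})
random_best = random_search(n_trials=12)
print("grid search best:  ", grid_best, objective(grid_best))
print("random search best:", random_best, objective(random_best))
```

Both searches here spend the same budget of 12 evaluations. Grid search can only return values that were listed up front, while random search explores the continuous range. Bayesian optimization, the third technique mentioned, would additionally use the results of earlier trials to choose where to sample next.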

Getting Data Ready for Data Science


Successful data science relies on solid data engineering to furnish reliable data. Delta Lake is an open source storage layer that brings reliability to data lakes, so you can supply dependable data for data science and analytics. This webinar covers modern data engineering in the context of the data science lifecycle and shows how Delta Lake can help enable your data science initiatives.