Modern data lakes leverage cloud elasticity to store virtually unlimited amounts of data "as is", without the need to impose a schema or structure. Structured Query Language (SQL) is a powerful tool to explore your data and discover valuable insights. Delta Lake is an open-source storage layer that brings reliability to data lakes with ACID transactions, scalable metadata handling, and unified streaming and batch data processing. Delta Lake is fully compatible with your existing data lake. Join Databricks and Microsoft as we share how you can easily query your data lake using SQL and Delta Lake on Azure.
Successful data science relies on solid data engineering to furnish reliable data. Delta Lake is an open-source storage layer that brings reliability to data lakes, letting you deliver trustworthy data for data science and analytics. This webinar will cover modern data engineering in the context of the data science lifecycle and how Delta Lake can help enable your data science initiatives.
With the New Year comes new data privacy regulations. Among them is the California Consumer Privacy Act (CCPA) of 2018, which went into effect January 1, 2020. But CCPA is just the beginning.
Companies are collecting more data than ever before across all divisions and groups, and putting that data in the cloud. Data scientists and engineers are at the forefront of the search for key business insights in that data to improve decision making. Databricks is their platform of choice for massive-scale data engineering and collaborative data science. MATLAB users can now apply their trusted troves of domain-specific tools and algorithms to such big data with Databricks. There is no need to invest significant time learning, coding, and re-implementing algorithms; simply leverage the new MATLAB interface for Databricks.
March 5, 2020. 3:00pm Sydney / 12:00pm Singapore / 9:30am Mumbai
A common data engineering pipeline architecture uses tables that correspond to different quality levels, progressively adding structure to the data: data ingestion (“Bronze” tables), transformation/feature engineering (“Silver” tables), and machine learning training or prediction (“Gold” tables). Combined, we refer to these tables as a “multi-hop” architecture. It allows data engineers to build a pipeline that begins with raw data as a “single source of truth” from which everything flows. In this session, we will show how to build a scalable data engineering pipeline using Delta Lake.
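To make the multi-hop idea concrete, here is a minimal, language-agnostic sketch of the three hops. In a real Delta Lake pipeline each hop would read and write Delta tables via Spark; plain Python dicts stand in here, and the record fields (`user_id`, `amount`) are hypothetical:

```python
# Minimal sketch of a "multi-hop" (Bronze/Silver/Gold) pipeline.
# Assumption: in production each hop would be a Delta table written via
# Spark; plain Python collections stand in, and field names are invented.

def ingest_bronze(raw_events):
    """Bronze: land raw records as-is -- the single source of truth."""
    return list(raw_events)

def refine_silver(bronze):
    """Silver: clean and add structure (drop malformed rows, cast types)."""
    silver = []
    for row in bronze:
        if row.get("user_id") is None:
            continue  # discard malformed records
        silver.append({"user_id": row["user_id"],
                       "amount": float(row.get("amount", 0))})
    return silver

def aggregate_gold(silver):
    """Gold: business-level aggregate, ready for ML training or BI."""
    totals = {}
    for row in silver:
        totals[row["user_id"]] = totals.get(row["user_id"], 0.0) + row["amount"]
    return totals

raw = [{"user_id": "a", "amount": "3.5"},
       {"user_id": None, "amount": "1.0"},   # malformed: dropped at Silver
       {"user_id": "a", "amount": "1.5"}]
gold = aggregate_gold(refine_silver(ingest_bronze(raw)))
print(gold)  # {'a': 5.0}
```

Because each hop reads only from the previous table, any downstream table can be rebuilt from the Bronze "source of truth" if a transformation changes.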