We are excited to announce the release of Databricks Runtime 5.2, which introduces several new features, including:
- Delta Time Travel
- Fast Parquet Import
- Databricks Advisor
Let’s unpack each of these features in more detail:
Delta Time Travel
Time Travel, released as an Experimental feature, lets you query a snapshot of a table by timestamp string or by version, through SQL syntax or through DataFrameReader options for timestamp expressions.
SELECT count(*) FROM events TIMESTAMP AS OF timestamp_expression
SELECT count(*) FROM events VERSION AS OF version
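The DataFrameReader route can be sketched in Python roughly as follows. This is a minimal sketch, assuming an existing SparkSession is passed in as `spark` and the Delta table lives at a file path; the helper names and the example path are hypothetical, while `timestampAsOf` and `versionAsOf` are the Delta reader option names.

```python
def read_at_timestamp(spark, path, timestamp):
    """Load the snapshot of the Delta table at `path` as of `timestamp`."""
    return (
        spark.read.format("delta")
        .option("timestampAsOf", timestamp)  # a timestamp string, e.g. "2019-01-01"
        .load(path)
    )


def read_at_version(spark, path, version):
    """Load the snapshot of the Delta table at `path` at a given version."""
    return (
        spark.read.format("delta")
        .option("versionAsOf", version)
        .load(path)
    )
```

In a notebook, something like `read_at_version(spark, "/path/to/events", 3).count()` would then mirror the `VERSION AS OF` query above (the path here is a placeholder).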
Delta Time Travel is useful for:
- Re-creating analyses, reports, or outputs (for example, the output of a machine learning model) for debugging or auditing, especially in regulated fields.
- Writing complex temporal queries.
- Fixing mistakes in your data.
- Providing snapshot isolation for a set of queries against fast-changing tables.
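As one small illustration of a temporal query, the sketch below compares the row count of a table at two versions. It assumes a SparkSession passed in as `spark` and reuses the `VERSION AS OF` syntax shown earlier; the function names are hypothetical, not part of any Databricks API.

```python
def count_at_version(spark, table, version):
    """Count the rows of `table` as they were at `version`."""
    query = f"SELECT count(*) FROM {table} VERSION AS OF {int(version)}"
    # collect() returns a list of rows; the single count is the first field of the first row.
    return spark.sql(query).collect()[0][0]


def net_row_change(spark, table, old_version, new_version):
    """Net change in row count between two versions of `table`."""
    newer = count_at_version(spark, table, new_version)
    older = count_at_version(spark, table, old_version)
    return newer - older
```

Because both counts name an explicit version, each one reads a fixed snapshot and is unaffected by writes that land while the comparison runs.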
Fast Parquet Import
Fast Parquet Import lets users import Parquet files into a Delta table without copying data, which makes it easier to convert existing Parquet tables and migrate pipelines to Delta. For more details, please see the documentation (Azure | AWS).
CONVERT TO DELTA parquet.`path/to/table` [NO STATISTICS]
[PARTITIONED BY (col_name1 col_type1, col_name2 col_type2, …)]
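To show how the optional clauses fit together, here is a minimal Python sketch that assembles the statement from its parts. The helper `convert_to_delta_sql` and its parameters are hypothetical; only the statement text it produces follows the syntax above.

```python
def convert_to_delta_sql(path, partition_cols=None, no_statistics=False):
    """Build a CONVERT TO DELTA statement for the Parquet table at `path`.

    `partition_cols` is an optional list of (column_name, column_type) pairs.
    """
    statement = f"CONVERT TO DELTA parquet.`{path}`"
    if no_statistics:
        statement += " NO STATISTICS"
    if partition_cols:
        cols = ", ".join(f"{name} {col_type}" for name, col_type in partition_cols)
        statement += f" PARTITIONED BY ({cols})"
    return statement
```

Given a SparkSession named `spark`, the resulting string could be passed to `spark.sql(...)` to run the conversion.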
Databricks Advisor
Databricks Advisor is a new feature within Notebooks. It automatically analyzes commands and displays advice notifications to help you improve the performance of your queries. We are launching Databricks Advisor with two hint types in this release: one for the DBIO cache and one for range joins. More advice types will follow in future releases. For more details, please see the documentation (Azure | AWS).