Navigating the Databricks Lakehouse Like a Pro

September 28, 2022 by Justin Kim
At Databricks, we love helping you be as efficient as possible—whether through simplifying the modern data stack with the Lakehouse or saving costs...

Mitigating Bias in Machine Learning With SHAP and Fairlearn

September 16, 2022 by Sean Owen
Try this notebook in Databricks. With good reason, data science teams increasingly grapple with questions of ethics, bias and unfairness in machine learning...

Faster Insights With Databricks Photon Using AWS i4i Instances With the Latest Intel Ice Lake Scalable Processors

This is a collaborative post from Databricks and Intel. We thank the authors from Intel for their contributions. Customers can now leverage Databricks...

Announcing Public Preview of Data Lineage in Unity Catalog

September 12, 2022 by Paul Roome, Sachin Thakur and Tao Feng
Today, we are excited to announce the public preview of data lineage in Unity Catalog, available on AWS and Azure. In the...

Previewing Updates to the Databricks Notebook

September 1, 2022 by Austin Ford
At Databricks, we are committed to delivering a world-class, data-driven development experience in the Notebook, and we are very excited to preview the...

Cybersecurity in the Era of Multiple Clouds and Regions

In 2021, more than three quarters of all enterprises have infrastructure in multiple clouds. This trend shows no signs of slowdown with...

Databricks Workspace Administration – Best Practices for Account, Workspace and Metastore Admins

This blog is part of our Admin Essentials series, where we discuss topics relevant to Databricks administrators. Other blogs include our Workspace Management...

Feature Deep Dive: Watermarking in Apache Spark Structured Streaming

August 22, 2022 by Max Fisher
Key Takeaways: Watermarks help Spark understand the processing progress based on event time, when to produce windowed aggregates and when to trim the...

Orchestrating Data and ML Workloads at Scale: Create and Manage Up to 10k Jobs Per Workspace

Databricks Workflows is the fully-managed orchestrator for data, analytics, and AI. Today, we are happy to announce several enhancements that make it easier...

Low-latency Streaming Data Pipelines with Delta Live Tables and Apache Kafka

August 9, 2022 by Frank Munz
Delta Live Tables (DLT) is the first ETL framework that uses a simple declarative approach for creating reliable data pipelines and fully manages...