Introducing Apache Spark™ 3.4 for Databricks Runtime 13.0

Today, we are happy to announce the availability of Apache Spark™ 3.4 on Databricks as part of Databricks Runtime 13.0. We extend...

How Collective Health uses Delta Live Tables and Structured Streaming for Data Integration

April 13, 2023 by Mragesh Khandelwal and Mahmoud Saleh
Collective Health is not an insurance company. We're a technology company that's fundamentally making health insurance work better for everyone, starting with the...

Synthetic Data for Better Machine Learning

April 11, 2023 by Sean Owen
You've likely tried the buzziest advances in generative AI in the past year, tools like ChatGPT and DALL-E. They consume complex data...

Visual data modeling using erwin Data Modeler by Quest on the Databricks Lakehouse Platform

This is a collaborative post between Databricks and Quest Software. We thank Vani Mishra, Director of Product Management at Quest Software for her...

Saving Mothers with ML: How CareSource uses MLOps to Improve Healthcare in High-Risk Obstetrics

This blog post is in collaboration with Russ Scoville (Vice President of Enterprise Data Services), Arpit Gupta (Director of Predictive Analytics and Data...

Pandas-Profiling Now Supports Apache Spark

Data profiling is the process of collecting statistics and summaries of data to assess its quality and other characteristics. It is an essential...

Run SQL Queries on Databricks From Visual Studio Code

Today, we are excited to announce that users can now run SQL queries on Databricks from within Visual Studio Code via a preview...

Fine-Tuning Large Language Models with Hugging Face and DeepSpeed

March 20, 2023 by Sean Owen
Large language models (LLMs) are currently in the spotlight following the sensational release of ChatGPT. Many are wondering how to take advantage of...

Building the Lakehouse for Healthcare and Life Sciences - Processing DICOM images at scale with ease

One of the biggest challenges in understanding patient health status and disease progression is unlocking insights from the vast amounts of semi-structured and...

Unsupervised Outlier Detection on Databricks

Kakapo (KAH-kə-poh) implements a standard set of APIs for outlier detection at scale on Databricks. It provides an integration of the...