Customer Lifetime Value Part 1: Estimating Customer Lifetimes

Download the Customer Lifetimes Part 1 notebook to demo the solution covered below, and watch the on-demand virtual workshop to learn more. You...

Vectorized R I/O in Upcoming Apache Spark 3.0

June 1, 2020 by Hyukjin Kwon
R is one of the most popular computer languages in data science, specifically dedicated to statistical analysis with a number of extensions, such...

Adaptive Query Execution: Speeding Up Spark SQL at Runtime

Read Rise of the Data Lakehouse to explore why lakehouses are the data architecture of the future with the father of the data...

Schema Evolution in Merge Operations and Operational Metrics in Delta Lake

May 19, 2020 by Tathagata Das and Denny Lee
Get an early preview of O'Reilly's new ebook for the step-by-step guidance you need to start using Delta Lake. Try this notebook to...

Shrink Training Time and Cost Using NVIDIA GPU-Accelerated XGBoost and Apache Spark™ on Databricks

Guest blog by Niranjan Nataraja and Karthikeyan Rajendran of NVIDIA. Niranjan Nataraja is a lead data scientist at NVIDIA and specializes in building...

Now on Databricks: A Technical Preview of Databricks Runtime 7 Including a Preview of Apache Spark 3.0

May 13, 2020 by Yin Huai, Wenchen Fan and Xiao Li
Introducing Databricks Runtime 7.0 Beta: We’re excited to announce that the Apache Spark™ 3.0.0-preview2 release is available on Databricks as part of...

Glow 0.3.0 Introduces New Large-Scale Genomic Analysis Features

April 23, 2020 by Kiavash Kianfar
In October of last year, Databricks and the Regeneron Genetics Center® partnered to introduce Project Glow, an open-source analysis tool...

COVID-19 Datasets Now Available on Databricks: How the Data Community Can Help

April 14, 2020 by Denny Lee
Initially published April 14th, 2020; updated April 21st, 2020. With the massive disruption of the current COVID-19 pandemic, many data engineers and data...

10 Minutes from pandas to Koalas on Apache Spark

This is a guest community post from Haejoon Lee, a software engineer at Mobigen in South Korea and a Koalas contributor. pandas is...

Trust but Verify with Databricks

As enterprises modernize their data infrastructure to make data-driven decisions, teams across the organization become consumers of that platform. The data workloads grow...