Automating away engineering on-call workflows at Databricks

May 28, 2020 by Andrew Nitu
A Summer of Self-healing: This summer I interned with the Cloud Infrastructure team. The team is responsible for building scalable infrastructure to support...

Modernizing Risk Management Part 1: Streaming data-ingestion, rapid model development and Monte-Carlo Simulations at Scale

May 27, 2020 by Antoine Amend
Part 2 of this accelerator here. Managing risk within financial services, especially within the banking sector, has increased in complexity...

New Pandas UDFs and Python Type Hints in the Upcoming Release of Apache Spark 3.0

May 19, 2020 by Hyukjin Kwon
Pandas user-defined functions (UDFs) are one of the most significant enhancements in Apache Spark™ for data science. They bring many benefits, such...

Manage and Scale Machine Learning Models for IoT Devices

May 19, 2020 by Conor Murphy
A common data science internet of things (IoT) use case involves training machine learning models on real-time data coming from an army of...

Schema Evolution in Merge Operations and Operational Metrics in Delta Lake

May 19, 2020 by Tathagata Das and Denny Lee
Get an early preview of O'Reilly's new ebook for the step-by-step guidance you need to start using Delta Lake. Try this notebook to...

Shrink Training Time and Cost Using NVIDIA GPU-Accelerated XGBoost and Apache Spark™ on Databricks

Guest blog by Niranjan Nataraja and Karthikeyan Rajendran of NVIDIA. Niranjan Nataraja is a lead data scientist at NVIDIA and specializes in building...

A Convolutional Neural Network Implementation For Car Classification

Convolutional neural networks (CNNs) are state-of-the-art neural network architectures primarily used for computer vision tasks. CNNs can be applied to a...

Now on Databricks: A Technical Preview of Databricks Runtime 7 Including a Preview of Apache Spark 3.0

May 13, 2020 by Yin Huai, Wenchen Fan and Xiao Li
Introducing Databricks Runtime 7.0 Beta: We’re excited to announce that the Apache Spark™ 3.0.0-preview2 release is available on Databricks as part of...

How to build a Quality of Service (QoS) analytics solution for streaming video services

Click on the following link to view and download the QoS notebooks discussed below in this article. Contents: The Importance of Quality to...

Faster SQL Queries on Delta Lake with Dynamic File Pruning

There are two time-honored optimization techniques for making queries run faster in data systems: process data at a faster rate or simply process...