Modernizing Risk Management Part 2: Aggregations, Backtesting at Scale and Introducing Alternative Data

June 5, 2020 by Antoine Amend
Understanding and mitigating risk is at the forefront of any financial services institution. However, as previously discussed in the first blog of this...

Customer Lifetime Value Part 1: Estimating Customer Lifetimes

Download the Customer Lifetimes Part 1 notebook to demo the solution covered below, and watch the on-demand virtual workshop to learn more. You...

Monitor Your Databricks Workspace with Audit Logs

June 2, 2020 by Craig Ng and Miklos Christine
Cloud computing has fundamentally changed how companies operate - users are no longer subject to the restrictions of on-premises hardware deployments such as...

Vectorized R I/O in Upcoming Apache Spark 3.0

June 1, 2020 by Hyukjin Kwon
R is one of the most popular computer languages in data science, specifically dedicated to statistical analysis with a number of extensions, such...

Adaptive Query Execution: Speeding Up Spark SQL at Runtime

Read Rise of the Data Lakehouse to explore why lakehouses are the data architecture of the future with the father of the data...

Modernizing Risk Management Part 1: Streaming data-ingestion, rapid model development and Monte-Carlo Simulations at Scale

May 27, 2020 by Antoine Amend
Part 2 of this accelerator here. Managing risk within financial services, especially within the banking sector, has increased in complexity...

New Pandas UDFs and Python Type Hints in the Upcoming Release of Apache Spark 3.0

May 19, 2020 by Hyukjin Kwon
Pandas user-defined functions (UDFs) are one of the most significant enhancements in Apache Spark™ for data science. They bring many benefits, such...

Manage and Scale Machine Learning Models for IoT Devices

May 19, 2020 by Conor Murphy
A common data science internet of things (IoT) use case involves training machine learning models on real-time data coming from an army of...

Shrink Training Time and Cost Using NVIDIA GPU-Accelerated XGBoost and Apache Spark™ on Databricks

Guest blog by Niranjan Nataraja and Karthikeyan Rajendran of NVIDIA. Niranjan Nataraja is a lead data scientist at NVIDIA and specializes in building...

Now on Databricks: A Technical Preview of Databricks Runtime 7 Including a Preview of Apache Spark 3.0

May 13, 2020 by Yin Huai, Wenchen Fan and Xiao Li
Introducing Databricks Runtime 7.0 Beta: We’re excited to announce that the Apache Spark™ 3.0.0-preview2 release is available on Databricks as part of...