Better Machine Learning through Active Learning

January 15, 2020 by Sean Owen
Try this notebook to reproduce the steps outlined below. Machine learning models can seem like magical savants. They can distinguish hot dogs from...

Processing Geospatial Data at Scale With Databricks

December 4, 2019 by Nima Razavi and Michael Johns
This blog was written 3 years ago. Please refer to these articles for up-to-date approaches to geospatial processing and analytics with your Databricks...

Streamlining Variant Normalization on Large Genomic Datasets with Glow

December 4, 2019 by Kiavash Kianfar
Cross-posted from the Glow blog. Many research and drug development projects in the genomics world involve large genomic variant data sets...

New Databricks Integration for Jupyter Bridges Local and Remote Workflows

December 2, 2019 by Bernhard Walter
For many years now, data scientists have developed specific workflows on premises using local filesystem hierarchies, source code revision systems and CI/CD...

Migration from Hadoop to Modern Cloud Platforms: The Case for Hadoop Alternatives

November 27, 2019 by Anand Venugopal and James Nguyen
Companies rely on their big data and analytics platforms to support innovation and digital transformation strategies. However, many Hadoop users struggle with complexity...

Deep Learning Tutorial Demonstrates How to Simplify Distributed Deep Learning Model Inference Using Delta Lake and Apache Spark™

November 20, 2019 by Cyrielle Simeone
On October 10th, our team hosted a live webinar, Simple Distributed Deep Learning Model Inference, with Xiangrui Meng, Software Engineer at Databricks. Model...

Using AutoML Toolkit's FamilyRunner Pipeline APIs to Simplify and Automate Loan Default Predictions

November 5, 2019 by Jas Bali and Denny Lee
Try this Loan Risk with AutoML Pipeline API Notebook in Databricks. In the post Using AutoML Toolkit to Automate Loan Default Predictions...

Scalable Near Real-Time S3 Access Logging Analytics with Apache Spark™ and Delta Lake

Get an early preview of O'Reilly's new ebook for the step-by-step guidance you need to start using Delta Lake. The original blog is...

Scaling Hyperopt to Tune Machine Learning Models in Python

October 28, 2019 by Joseph Bradley and Max Pumperla
Try the Hyperopt notebook to reproduce the steps outlined below and watch our on-demand webinar to learn more. Hyperopt is one of the...

Scaling Financial Time Series Analysis Beyond PCs and Pandas: On-Demand Webinar, Slides and FAQ Now Available!

On Oct 9th, 2019, we hosted a live webinar, Scaling Financial Time Series Analysis Beyond PCs and Pandas, with Junta Nakai...