Earlier this year, Databricks launched Dolly 2.0: the world's first truly open instruction-tuned Large Language Model (LLM). To build off this excitement...
Introduction: We are thrilled to unveil the English SDK for Apache Spark, a transformative tool designed to enrich your Spark experience. Apache Spark™...
Delta Lake 1.1 improves performance for merge operations, adds support for generated columns, and improves nested field resolution. With the tremendous contributions...
We are excited for the release of Delta Sharing 0.3.0, which introduces several key improvements and bug fixes, including the following features: Delta...
We recently announced the release of Delta Lake 0.8.0, which introduces schema evolution and performance improvements in merge and operational metrics in...
Initially published April 14th, 2020; updated April 21st, 2020. With the massive disruption of the current COVID-19 pandemic, many data engineers and data...
Try this Loan Risk with AutoML Pipeline API Notebook in Databricks. Introduction: In the post Using AutoML Toolkit to Automate Loan Default Predictions...
Download the following notebooks and try the AutoML Toolkit today: Evaluating Risk for Loan Approvals using XGBoost (0.90) | Using AutoML Toolkit to...
Try this notebook in Databricks. Detecting fraudulent patterns at scale using artificial intelligence is a challenge, no matter the use case. The massive...
Try this notebook in Databricks. On October 25th, we hosted a live webinar, Applying your Convolutional Neural Network, with Denny Lee, Technical Product...
Try this notebook in Databricks. On September 27th, we hosted a live webinar, Introduction to Neural Networks, with Denny Lee, Technical Product Marketing...
When providing recommendations to shoppers on what to purchase, you are often looking for items that are frequently purchased together (e.g. peanut butter...
With the exponential growth of cameras and visual recordings, it is becoming increasingly important to operationalize and automate the process of video identification...
On August 30th, our team hosted a live webinar, Introducing MLflow: Infrastructure for a complete Machine Learning lifecycle, with Matei Zaharia, Co-Founder and...
Traditionally, real-time analysis of stock data was a complicated endeavor due to the complexities of maintaining a streaming system and ensuring transactional consistency...
How to build an end-to-end predictive data pipeline with Databricks Delta and Spark Streaming. Maintaining assets such as compressors is an extremely complex...
Try this notebook series in Databricks. Introduction: The global sports market is huge, comprising players, teams, leagues, fan clubs, sponsors, etc., and...
Introduction: Graph structures are a more intuitive approach to many classes of data problems. Whether traversing social networks, restaurant recommendations, or flight paths...