Customer Case Study: Devon Energy - Databricks

Devon Energy provides oil and natural gas exploration and production services.


Oil and Gas

Vertical Use Case

  • Leveraging data and AI to find oil and gas more efficiently and cost-effectively

Technical Use Case

  • Ingest, ETL
  • Machine Learning
  • Deep learning

The Challenges

  • Massive Volumes of Disparate Data: Harnessing different data sources, including geophysical data, geochemical data, drill rig/machinery data, and seismic data, was difficult. They used spreadsheets to manipulate data, which was slow and did not scale.
  • Infrastructure Complexity: Having to manage their own Spark clusters resulted in substantial DevOps investments, developer friction, and wasted compute resources.
  • Long-Running Queries: Very large data sets (multiple billions of records) made query performance slow; querying a sample of the data on SQL Server took up to 2 days.
  • Machine Learning at Scale: Building, training, and deploying ML models in a repeatable and reproducible manner was impossible due to disjointed systems, different programming languages, etc.

The Solution

Databricks has provided Devon Energy with a fully managed analytics platform on Azure that accelerates AI innovation.

  • Fully Managed Platform: A fully managed cloud platform on Azure simplifies operations and delivers superior performance of data pipelines at scale.
  • Automated Infrastructure Management: Simplified cluster management with auto-scaling significantly reduced time spent on data engineering and development.
  • Faster and More Reliable Data Pipelines at Scale: Removed the complexities of building data pipelines that could scale to meet their data needs.
  • Robust Machine Learning Infrastructure: MLflow greatly streamlined their machine learning lifecycle, simplifying model reproducibility and process repeatability.
  • Interactive Workspace: Data scientists can collaborate, share, and track data and insights across various programming languages, fostering an environment of transparency and improving productivity.

The Results

  • Faster Time to Market: Able to build reliable data pipelines that perform much faster: from 2+ days for a long-running query to less than 30 minutes.
  • Improved Data Science Productivity: Collaborative notebooks on a centralized platform, where teams can share and reuse code, accelerated data science innovation.
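
As a rough back-of-the-envelope check on the query-time figures above, the sketch below converts them into a speedup factor. It assumes the lower bound of "2+ days" and the upper bound of "less than 30 minutes"; the `speedup` helper and the variable names are illustrative only and do not come from Devon Energy or Databricks.

```python
# Back-of-the-envelope speedup from the figures quoted in this case study.
# Assumption: "2+ days" is taken as exactly 2 days and "less than 30 minutes"
# as exactly 30 minutes, so the result is a conservative lower bound.
MINUTES_PER_DAY = 24 * 60


def speedup(before_minutes: float, after_minutes: float) -> float:
    """Return how many times shorter the new runtime is than the old one."""
    if before_minutes <= 0 or after_minutes <= 0:
        raise ValueError("runtimes must be positive")
    return before_minutes / after_minutes


before = 2 * MINUTES_PER_DAY  # long-running query on SQL Server, in minutes
after = 30                    # same kind of query on Azure Databricks, in minutes

print(f"At least {speedup(before, after):.0f}x faster")
```

Under these assumptions the query time improves by a factor of at least 96. The comparison is not like-for-like: the two-day figure was for a sample of the data, while the 30-minute figure is reported for the entire data set, so the real gap is likely larger.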

Prior to Azure Databricks, it was taking two days or more to query a subset of data on a SQL Server instance. Now, it takes about 30 minutes for the entire data set.

Paul Bruffett, Data and Analytics Architect at Devon Energy.