Customer Case Study


Autotrader is an online marketplace for car shoppers and sellers. It aggregates millions of new, used, and certified second-hand cars from thousands of dealers and private sellers.


Automotive / Consumer Technology

Vertical Use Case

Recommendation Engine – Use machine learning to predict car valuations, helping sellers understand what their car is worth given its age, mileage, and condition.

Technical Use Case

  • Data Ingest and ETL
  • Streaming
  • Machine Learning

The Challenges

  • The legacy technology stack could not scale to process and analyze the massive volume of data that needed to be prepared for downstream analytics.
  • Collaboration between data scientists and engineers was difficult, especially when moving a model from development, where data scientists work on it, to production, where a data engineer is expected to pick it up and implement it.
  • The combination of EMR and Jupyter notebooks created too much DevOps complexity, which slowed productivity.
  • The team lacked experience with Apache Spark, productionizing models, and using them in real-time scoring systems.

The Solution

Databricks provides Autotrader with a unified analytics platform that has fostered a collaborative environment across data science and engineering, allowing them to innovate faster.

  • Automated cluster management simplifies the provisioning of clusters at any scale.
  • Support for multiple languages (SQL, Scala, Python, R) keeps all team members productive within the collaborative notebook environment.
  • Machine learning models can be built, trained, and deployed more easily and quickly.

The Results

Databricks has provided Autotrader with a Unified Analytics Platform that serves as an entry point to their data and analytics, making it accessible to people across different disciplines, including data scientists and developers. It has not only improved collaboration across the data science and engineering teams at Autotrader, but has also helped streamline and improve the process of training and deploying machine learning models, speeding time-to-market for new models by 3x.

“With Databricks and Apache Spark, we’ve been able to move a particular model from development straight into production with a very limited amount of time and operational complexity.”

Edward Kent - Developer at Autotrader