We are happy to announce that Edmunds.com has deployed Databricks to simplify the management of their Apache Spark clusters and to perform ad hoc analysis that improves both vehicle data integrity and the overall customer experience of their website.

Edmunds.com, a leading car information and shopping network that serves nearly 20 million visitors each month, allows shoppers to browse dealer inventory, vehicle reviews, shopping tips, photos, videos, and feature stories.

To ensure shopper satisfaction, accurate vehicle data is of utmost importance. Edmunds.com addresses data quality issues on vehicle listing pages by matching a car’s VIN (vehicle identification number) against OEM (original equipment manufacturer) and Edmunds codes to identify critical information about the vehicle, such as the country where it was built, the vehicle year, and more. When the matching is accurate, this kind of detailed vehicle information makes Edmunds.com extremely valuable in a shopper’s vehicle buying process.
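To make the idea of VIN decoding concrete, here is a minimal sketch in Python. It is not Edmunds.com’s matching logic, which relies on OEM and Edmunds codes not shown here. It only uses the standard 17-character VIN layout: the first three characters are the world manufacturer identifier, the first character indicates the region or country of manufacture, and the tenth character encodes the model year on a 30-year cycle. The function name `decode_vin`, the small `COUNTRY_BY_FIRST_CHAR` table (deliberately partial), and the sample VIN are assumptions made for illustration.

```python
# Characters allowed in a VIN: digits and capital letters except I, O and Q.
VIN_CHARS = set("ABCDEFGHJKLMNPRSTUVWXYZ0123456789")

# Model year codes for position 10, in order: A=1980 ... Y=2000, 1=2001 ... 9=2009.
# The cycle repeats every 30 years, so A is also 2010, B is 2011, and so on.
YEAR_CODES = "ABCDEFGHJKLMNPRSTVWXY123456789"

# Illustrative, partial lookup only; a real decoder needs the full WMI tables.
COUNTRY_BY_FIRST_CHAR = {
    "1": "United States",
    "4": "United States",
    "5": "United States",
    "2": "Canada",
    "J": "Japan",
    "W": "Germany",
}


def decode_vin(vin):
    """Return a few fields that can be read directly from a VIN's layout."""
    vin = vin.strip().upper()
    if len(vin) != 17 or not set(vin) <= VIN_CHARS:
        raise ValueError("not a well-formed 17-character VIN: %r" % vin)

    # The year code alone is ambiguous, so report every year it could mean.
    index = YEAR_CODES.find(vin[9])
    candidates = [1980 + index, 2010 + index] if index >= 0 else []

    return {
        "wmi": vin[:3],
        "country": COUNTRY_BY_FIRST_CHAR.get(vin[0]),  # None if not in the table
        "model_year_candidates": candidates,
    }


if __name__ == "__main__":
    print(decode_vin("1HGCM82633A004352"))
```

Because the year code repeats every 30 years, the sketch returns both candidate years instead of guessing; a production decoder would resolve the ambiguity with additional information about the vehicle.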

Over the past couple of years, Edmunds.com’s data volumes have grown tenfold, from tens to hundreds of terabytes, making it increasingly difficult to accurately decode each VIN and match it to the right vehicle feature codes. The result was missing or inaccurate details, which hurt the customer experience. For example, the Edmunds.com engineering team needed to determine what percentage of Subarus were missing options details and how many Hondas had no exterior color described.
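A data completeness question of this kind boils down to a grouped count. The sketch below shows the shape of that calculation in plain Python on a handful of made-up records. The field names (`make`, `options`, `exterior_color`), the sample listings, and the helper `missing_rate_by_make` are all hypothetical and do not reflect Edmunds.com’s schema; at hundreds of terabytes the same aggregation would run as a distributed group-by on Spark, not as an in-memory loop.

```python
from collections import defaultdict


def missing_rate_by_make(listings, field):
    """Percentage of listings per make where `field` is absent or empty."""
    totals = defaultdict(int)
    missing = defaultdict(int)
    for row in listings:
        make = row["make"]
        totals[make] += 1
        # Treat None, an empty string, or an empty list as "missing".
        if not row.get(field):
            missing[make] += 1
    return {make: 100.0 * missing[make] / totals[make] for make in totals}


# Hypothetical sample data, for illustration only.
listings = [
    {"make": "Subaru", "options": ["Moonroof"], "exterior_color": "Blue"},
    {"make": "Subaru", "options": [], "exterior_color": "Red"},
    {"make": "Honda", "options": ["Navigation"], "exterior_color": None},
    {"make": "Honda", "options": ["Heated seats"], "exterior_color": "Silver"},
]

if __name__ == "__main__":
    print(missing_rate_by_make(listings, "options"))
    print(missing_rate_by_make(listings, "exterior_color"))
```

Keeping the field name as a parameter means one helper can answer both example questions, missing options and missing exterior color, without duplicating the counting logic.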

To solve this data integrity problem, Edmunds.com looked to Apache Spark for processing speed at scale. However, they realized that in order for their analysts and data professionals to focus on the data and the business simultaneously, they needed a comprehensive data platform that provided managed services to simplify their Spark deployment and increase their productivity.

With the implementation of Databricks, Edmunds.com was able to democratize data access across their organization, allowing their data engineering, data science, and business analyst teams to work collaboratively on the data at scale. Edmunds.com also achieved the following quantitative results:

  • Accelerated ad hoc data exploration and analysis sixfold, allowing them to answer data integrity questions faster;
  • Improved reporting speed by reducing processing time by 60 percent, saving the engineering team an average of 3 to 5 hours per week;
  • Improved vehicle data quality metrics across their website by 35 percent.

Download this case study to learn more about how Edmunds.com is using Databricks.

To try out Databricks for yourself, sign up today!