We are proud to announce that HomeAway, a subsidiary of Expedia and one of the world’s leading online marketplaces for the vacation rental industry, has selected Databricks to simplify its big data needs and improve the company’s ability to match a traveler with the right property.
Travelers from around the world use HomeAway’s online marketplace to search for vacation rentals. To facilitate a match between the traveler and vacation rental, HomeAway must show search results that are relevant to the traveler’s specific interests.
HomeAway analyzes a high volume of unstructured data, including log events, text (in multiple languages), and images, to provide search results in order of relevance to the traveler. They also leverage contextual image classification to interpret the contextual information within an image so they can map the most relevant image to the search criteria. For example, when a traveler selects “beachfront” as a filter, HomeAway can curate and prioritize photos that show the beach.
Solving the challenge of search and content relevancy requires the ability to extract, transform, and load (ETL) a large variety of unstructured data quickly. In HomeAway’s case, they needed to move data from their on-premises HDFS to AWS S3 for exploration and analysis. They also had a growing need to merge their data with Expedia’s to create predictive models across all of Expedia’s websites.
Initially, they tried using open source Apache Spark, bundled in JARs and executed via a spark-submit script and Zeppelin notebooks. But they quickly found this approach too time consuming and resource intensive: they lacked Spark expertise, upgrading Hadoop was painful, and they were using R on a single machine to run prediction-based document similarity tasks on samples of data.
With Expedia’s strategic initiative to move completely into the cloud, HomeAway needed a platform that could access large volumes of data on S3 while providing an interactive and highly scalable environment for rapid prototyping and ad hoc questions that uncover future machine learning and streaming use cases.
With Databricks simplifying their Spark infrastructure, HomeAway’s data science team can now focus on delivering innovative new features that enhance the overall user experience. Looking into the future, HomeAway is also exploring more joint analytic opportunities with the rest of Expedia to deliver a unified user experience that takes advantage of all the data under their collective umbrella.
Download this case study to learn more about how HomeAway is using Databricks.