by Bryan Saftler and Steve Sobel
Customer data is the lifeblood of modern organizations in every industry. As organizations level up their data teams and practices with the Data Lakehouse, they’re increasingly using the Lakehouse not just as the source of truth for analytics, but also as an engine powering marketing, operations, personalization, and more.
Databricks Ventures invested in Hightouch to power the Data Lakehouse-native Customer Data Platform (CDP). Hightouch provides all the features that Databricks users need to collect, store, model, and activate customer data directly from the Lakehouse. This Lakehouse-centric architecture creates a complete Composable CDP centered on your own data infrastructure. Read this blog to learn more about what this Lakehouse-native Composable CDP really means, why it’s the best approach to customer data, and, most importantly, how you can get started building one yourself.
CDPs offer companies a way to gather, store, model, and activate customer data. Ultimately, they power use cases for this customer data by sending it to downstream tools that marketers, advertisers, and other business users rely on daily. CDPs promise to be the source of truth for customer data and help companies build a 360-degree view of each customer.
Generally, a CDP has several key components:
Historically, CDP solutions were all-in-one bundled platforms. If you purchase a traditional CDP, it will collect and model customer data within its own dedicated data storage and provide tooling to build audiences and activate them from this separate storage layer.
The Composable CDP is a new approach to customer data that puts your existing data infrastructure, like the Data Lakehouse, at the center of your operations. While traditional CDPs are bundled platforms with their own data storage, Composable CDPs are unbundled, giving you more flexibility in your tech stack, and allowing you to use the Data Lakehouse for data storage and modeling.
The Composable CDP on the Data Lakehouse has become a powerful and popular solution for customer marketing. This popularity has led many CDPs to market with the word “Composable,” sometimes erroneously, so it’s essential to clearly define what Composability actually means.
The Composable CDP is different from a traditional CDP in 4 key ways:
The Lakehouse Composable CDP benefits from the data investments and modeling your team is already making in the Data Lakehouse. This single source of truth, together with the machine learning built on it, can power all business use cases. This creates a virtuous feedback cycle between business and data teams: business teams can easily use existing data and then communicate with data teams about additional models or attributes that would help further innovation. For example, Mews, which runs products used by over 3,500 hospitality brands, uses the Lakehouse to unify its disparate data into a single source of truth before powering use cases directly from it.
The Lakehouse Composable CDP’s data comprehensiveness is matched by its data flexibility. The Lakehouse can match data to whatever schema your business needs. Traditional CDPs are limited to web events and other narrow user attributes that fit into their predefined schema. The Lakehouse is better equipped to give complex companies the right data for their CDP use cases. For example, PetSmart runs marketing campaigns from the Lakehouse based on the pets each person owns, and that “pet” entity couldn’t be supported by a traditional CDP. A traditional CDP only has data models for events and users (people), so it’s not feasible to also track each user’s multiple “pets” and their associated traits like birthdays, medications, food brands, and more.
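To make the custom-entity idea concrete, here is a minimal sketch in Python. It is purely illustrative: the `Pet` and `Customer` classes, their field names, the sample records, and the `pet_birthday_audience` helper are all hypothetical and are not taken from PetSmart, Hightouch, or Databricks. It only shows the shape of a one-to-many "customer owns pets" model and how an audience might be derived from a trait on the related entity rather than on the user.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Tuple


@dataclass
class Pet:
    # Hypothetical "pet" entity; field names are illustrative only.
    name: str
    species: str
    birthday: date
    food_brand: str


@dataclass
class Customer:
    # Hypothetical customer record with a one-to-many link to pets.
    customer_id: str
    email: str
    pets: List[Pet] = field(default_factory=list)


def pet_birthday_audience(customers: List[Customer], month: int) -> List[Tuple[str, str]]:
    """Return (email, pet name) pairs for every pet whose birthday falls in `month`."""
    return [
        (customer.email, pet.name)
        for customer in customers
        for pet in customer.pets
        if pet.birthday.month == month
    ]


# Made-up sample data for the sketch.
customers = [
    Customer("c1", "ana@example.com", [
        Pet("Rex", "dog", date(2019, 5, 14), "BrandA"),
        Pet("Mia", "cat", date(2021, 11, 2), "BrandB"),
    ]),
    Customer("c2", "lee@example.com", [
        Pet("Kiwi", "bird", date(2020, 5, 30), "BrandC"),
    ]),
]

# Audience of owners to contact about pets with a May birthday.
audience = pet_birthday_audience(customers, month=5)
print(audience)
```

In a real Lakehouse, the same relationship would more likely live in two tables joined on a customer key. The point of the sketch is only that the audience is defined on the related "pet" entity, which a fixed events-and-users schema has no place for.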
The Data Lakehouse also excels at data governance, providing full transparency, assurance, and auditability at each step of your customer data architecture. Using a CDP powered by the Lakehouse, your data team fully controls and owns your customer data rather than delegating that ownership and power to a black-box third-party system.
Building a Composable CDP architecture around the Lakehouse also ensures that you remain modular and future-proofed. If you want to change out parts of your CDP tech stack, like event collection, you can freely do so at your discretion, as your core data assets remain safe in the Lakehouse regardless of the rest of your tech stack. You don’t get locked into a monolithic CDP vendor but instead can choose the right tech provider for each CDP use case as your business evolves.
In addition, the Lakehouse Composable CDP offers a stronger return on investment than a traditional CDP. You get a far faster time to value because you work with your existing infrastructure rather than starting from scratch with a new system. A Composable CDP is also more cost-effective. Some of this cost efficiency comes from vendor selection: you purchase only the CDP components you need rather than buying an all-in-one platform with redundant features. You also don’t have to pay for data storage and compute in an additional, redundant platform, and you benefit from the economies of scale you have with your main source-of-truth Data Lakehouse.
Hightouch and Databricks work better together and provide businesses with the best way to activate their customer data, which is why Databricks invested in Hightouch.
Hightouch provides all of the components an organization needs to compose its Lakehouse-native CDP, including:
Hightouch also offers Lakehouse users features that traditional CDPs do not. For example, Match Booster enriches Databricks customers’ first-party data with third-party identifiers in flight to ad platforms to increase match rates, performing a role similar to that of a data onboarding platform like LiveRamp. The Personalization API also allows websites and apps to call out to predictive models in the Data Lakehouse to power real-time personalization.
Importantly, Hightouch fully embraces the idea of a composable CDP: you can build on the Lakehouse with as many or as few of these offerings as you need. If you’d rather perform identity resolution with dbt directly in the Lakehouse, you don’t need to buy a redundant service from Hightouch. Composability means choosing your own adventure: you focus on your organization’s needs and add only the components that serve them.
Building a Composable CDP on the Data Lakehouse has never been easier. You can get started with Databricks for free and speak with Hightouch’s solution engineers to determine an implementation plan for the Composable CDP features you need.