Delta Sharing and the Emergence of the Lakehouse Customer Data Platform (CDP)
Special thanks to Caleb Benningfield and Sam Malissa at Amperity for their valuable insights and contributions to this blog.
Today, businesses face a significant challenge in handling a greater volume and complexity of customer data to power personalization at scale while also staying compliant with privacy regulations. This means prioritizing data quality and implementing an effective governance layer, but existing tools and methods that businesses used to rely on aren't up to the challenge.
To address this challenge, many businesses have transitioned from cloud data warehouses and data lakes to a data lakehouse architecture. The data lakehouse combines the best of what its predecessors had to offer, streamlining the way businesses store and manage their data and making it easier to access valuable insights.
So, what's next? The next frontier is built on Databricks with Delta Sharing, which enables secure, cross-platform data sharing in real time with no replication. The result is a more open, flexible, and secure data ecosystem that supports data, analytics, and AI, and that also drives activation through interoperability across your data and MarTech stack. This flexible and open platform benefits both data engineers and business teams: IT doesn't need to spend time and resources maintaining connections between tools and moving data around the stack, and marketers can build sophisticated customer experiences that just work.
In this blog, we'll explore how data platforms are evolving into data intelligence platforms, how Delta Sharing is helping support this evolution for all your data + AI, and how Amperity is using Delta Sharing and Databricks to build a new Lakehouse Customer Data Platform (CDP). Read on to learn how Databricks and Amperity have helped customers build and enrich their customer experiences, and how we can help your organization simplify your workflows, enhance data quality and governance, and cut down on costs.
Databricks Data Intelligence Platform
Databricks pioneered the concept of the lakehouse to combine and unify the best of data warehouses and data lakes. Today, 74% of global CIOs report having a lakehouse in their estate, and almost all of the remainder intend to have one within the next three years.
However, while lakehouse adoption has become a force in the market, our industry is innovating at an unprecedented rate, and another technology has risen rapidly alongside it: generative AI. In November 2023, we introduced the Databricks Data Intelligence Platform (DI Platform). It's not just an incremental improvement over current data platforms, but a fundamental shift in product strategy and roadmap for our company. Thanks to the unification of the lakehouse and generative AI, our customers can democratize insights with natural language and build AI into their own data.
The Databricks Data Intelligence Platform is built for data sharing and collaboration. With Databricks, businesses can share all of their data and AI assets across regions, clouds, and platforms under centralized governance, driving interoperability across their data and MarTech ecosystem. Whether you are a data consumer or provider, you can share data sets, AI models, notebooks, dashboards, and solutions, all powered by Delta Sharing, which enables secure, real-time data sharing across diverse environments.
Delta Sharing, an open protocol for cross-platform data sharing
Delta Sharing is an open protocol for secure data sharing built by Databricks and the Linux Foundation that allows businesses to securely share live data across data platforms, clouds, or regions without the need for replication. This means no more data silos. IT teams can access data wherever it's stored without having to maintain ETL pipelines and integrations. This is more efficient for organizations and helps avoid unnecessary storage costs.
Governance is also centralized. Delta Sharing has built-in integration with Databricks Unity Catalog to centrally manage, audit, and track shared data across business units, or with other organizations such as partners or customers. This helps ensure data is only accessible to authorized individuals, supporting compliance with data protection regulations.
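As a rough illustration of what centrally managed sharing can look like on the provider side, the sketch below builds the Databricks SQL statements that create a share, add tables to it, and grant a recipient access. The share, table, and recipient names are hypothetical, and the helper itself is an assumption for illustration, not part of any Databricks or Amperity API.

```python
def share_statements(share: str, tables: list[str], recipient: str) -> list[str]:
    """Return Databricks SQL that publishes tables through a Unity Catalog share.

    Names are interpolated as-is, so they should come from trusted configuration.
    """
    statements = [f"CREATE SHARE IF NOT EXISTS {share}"]
    # Each table is added to the share in place; no copy of the data is made.
    statements += [f"ALTER SHARE {share} ADD TABLE {table}" for table in tables]
    statements.append(f"CREATE RECIPIENT IF NOT EXISTS {recipient}")
    statements.append(f"GRANT SELECT ON SHARE {share} TO RECIPIENT {recipient}")
    return statements


# Hypothetical names, used only for illustration.
stmts = share_statements(
    share="customer_share",
    tables=["main.crm.unified_profiles"],
    recipient="marketing_partner",
)
for stmt in stmts:
    print(stmt)  # In a Databricks notebook, each statement could be run with spark.sql(stmt).
```

Because access is granted on the share rather than on copies of the data, revoking the grant is enough to cut off a recipient.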
All of these capabilities are unlocking new opportunities for businesses, such as:
- Seamless sharing: Securely and quickly share data within business units and subsidiaries across clouds or regions without replicating data. Internal infrastructure can vary, so technology-agnostic data sharing lowers business costs.
- Easier data monetization: Distribute and monetize data products, including datasets, machine learning models, and dashboards, all without requiring customers to be on the same platform as you.
- Greater data access, fewer integrations: Sharing data between platforms and tools will no longer require building and maintaining data pipelines. The ease of data sharing enables companies to leverage best-in-class tools rather than being locked into a specific cloud vendor platform.
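On the recipient side, reading a shared table without a pipeline can be sketched with the open source `delta-sharing` Python connector. This is a minimal sketch, assuming the provider has sent a profile file (here called `config.share`) and that the share, schema, and table names below exist; all of those names are hypothetical.

```python
def table_url(profile_path: str, share: str, schema: str, table: str) -> str:
    """Build the '<profile>#<share>.<schema>.<table>' address the connector expects."""
    return f"{profile_path}#{share}.{schema}.{table}"


def load_shared_table(profile_path: str, share: str, schema: str, table: str, limit: int = 100):
    """Read up to `limit` rows of a shared table into a pandas DataFrame."""
    # Imported lazily so the URL helper above works without the package installed.
    import delta_sharing  # pip install delta-sharing

    return delta_sharing.load_as_pandas(
        table_url(profile_path, share, schema, table), limit=limit
    )


# Hypothetical profile file and table names, used only for illustration.
url = table_url("config.share", "customer_share", "crm", "unified_profiles")
print(url)
# With a real profile file in place, the data could then be read with:
# df = load_shared_table("config.share", "customer_share", "crm", "unified_profiles")
```

The recipient does not need to be on Databricks for this to work; the profile file carries the endpoint and credentials, and the data is read from the provider's storage rather than from a copy.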
Easier access to and movement of data is especially important. As big data environments evolve with specialized tools available in different platforms and apps, the ease of sharing data opens up a new frontier for data management. In this context, the specific complexities of managing customer data call for a new kind of customer data platform.
The Amperity Lakehouse CDP: flexibility, data quality, and governance
The Databricks Data Intelligence Platform and Delta Sharing are driving new innovations to help fuel collaboration. Building on this platform, Amperity, a leader in the CDP space, has introduced the first Lakehouse CDP.
Just as data storage evolved from the warehouse to the lakehouse, CDPs have had their own development curve. They started as all-in-one packaged offerings with end-to-end customer data workflows, which had powerful capabilities but ended up becoming one more silo. Then came the concept of the composable or unbundled CDP, which involved taking only the components and tools that a company needed and running them directly off the data warehouse. This offered flexibility and customizability, but too often didn't account for the challenges of maintaining data quality or governance.
The Lakehouse CDP offers the best of both approaches. It is a composable solution that can access and share live lakehouse data across an ecosystem without replication. Instead of relying on complex business logic, a Lakehouse CDP unifies and enriches customer data in a lakehouse for activation, analytics, and AI use cases, without code. Delta Sharing is the key capability that enables these core benefits of the Lakehouse CDP.
Amperity uses Delta Sharing through a new feature called Amperity Bridge. Bridge lets users share data to and from Databricks without copying or moving it across platforms. With live data available across the tech stack through a shared catalog, businesses gain greater visibility into and control over their data, along with better compliance, without unnecessary network calls and processing.
Here's a closer look at what the Amperity Lakehouse CDP makes possible:
- Better data quality: Automated first-party identity resolution uses AI to unify raw customer data and produce a stable, universal identifier. This maintains high data quality in the lakehouse and across tools.
- Faster data modeling: Teams can quickly shape data for activation. Pre-built data assets by industry and use case can be easily shared and enriched in a lakehouse.
- Easier activation: Accessing and activating high-quality lakehouse data with marketer-friendly tools is a lighter lift and more secure than typical reverse ETL.
- Stronger data governance: Governance improves through securely sharing data without replication, tracking every transformation, and automating consent management workflows.
Delta Sharing is an open approach for maximal reach and impact
Delta Sharing's secure, cross-platform approach to data sharing has transformed the application development landscape and changed the way businesses approach their tech stack. Data engineers no longer need to spend time and resources building custom, secure connections between tools and manually moving data around the stack. They can instead focus on which tools to use to enhance their data to help fulfill business needs, unlock new AI use cases, and further shape innovation.
Platforms like Amperity have already adopted Delta Sharing to more easily provide best-in-class identity resolution, enrichment, and activation tools to any data lakehouse. As we move further into the data + AI sharing revolution, we're excited to see how this new freedom inspires creativity and innovation among more companies and brands, driving open collaboration with any data type across clouds and platforms.
Learn how General Motors amplifies its Customer 360, powered by Amperity and Databricks, to drive meaningful impact for GM's business and customers. "Elevating Customer Loyalty: How GM amplifies their C360 with Amperity" is a customer-led session at the upcoming Data + AI Summit 2024 in San Francisco this June.
Amperity hosted its annual summit, Amplify 2024, in New York City on May 16, 2024. Learn more about the event and how to watch the on-demand sessions.
To learn more about how Delta Sharing can help your organization, check out the latest resources including new eBooks and related blogs below.