"Building vehicles that are more like smartphones is the future. We're about to change the ride just like Apple and all the smartphone companies changed the call."— Jim Farley, CEO, Ford Motor Company
Jim Farley's analogy of cars as smartphones is the reality for every automotive company. Modern cars generate over 1,000 times more data every day through multiple sensor modalities, as many as 150 electronic control units (ECUs), and over 100 million lines of code. With the growth in connected vehicles (95% of new vehicles sold globally by 2030), it is a strategic imperative for every automotive company to monetize connected vehicle data and drive differentiation with more personalized services, value-added digital offerings, and ecosystem monetization.
The size of the pie for monetization of connected vehicle data is enormous. By 2030, on average, new subscription-driven services can generate incremental recurring revenue of $310 per vehicle per year. These services are also vastly more profitable (average operating margins are 150% higher than for new unit sales) and, more importantly, increase stickiness with drivers by offering them better safety, comfort, convenience, and entertainment outcomes.
To take more than their fair share of this huge opportunity in the autonomous, connected, and electric mobility revolution, automotive companies need a more comprehensive data strategy that can address the volume, complexity, interoperability, democratization, and monetization of valuable information received from connected vehicles.
Vehicle telemetry: navigating multiple modalities of value creation
There's no shortage of data from today's vehicles; this is where the industry's data gravity lies. What separates winners from losers in this space comes down to one simple distinction: some automotive OEMs and mobility companies can effectively take the complexity out of vehicle telemetry data and enable a wide range of use cases, and others cannot.
While the origins of vehicle telemetry data lie in safety, it is now a critical component of delivering vehicle occupants more comfortable, more convenient, and more entertaining experiences. With the explosion of data, the breadth of use cases will grow exponentially and be limited only by human imagination. Companies that understand this well are able to design data platforms that enable many downstream use cases in different departments, making the enterprise far more effective.
What that means for the future is that while vehicle telemetry data is created on the vehicle, its value is realized across multiple modalities, spanning different departments, functions, and even external parties. A few important examples of the use of vehicle telemetry data across the organization and ecosystem:
- Marketing: harness continuous information on vehicle usage to design personalized service packages and position more compelling offers and complementary solutions, such as insurance, warranties, and digital service subscriptions.
- Digital Experiences: leverage vehicle insights to drive hyper-personalized and delightful web and mobile experiences for customers.
- Customer Support: leverage vehicle diagnostics and sensor information to gain faster insight into field issues and warranty claims, identify potential corrective actions, and bring resolutions to customers faster.
- Design & Engineering: understand software feature usage and improve driving experience with over-the-air updates to safety, autonomy, connectivity, battery, infotainment, and control systems.
- Dealers/Service Networks: predict maintenance and aftermarket needs and drive seamless fulfillment to improve vehicle performance and ownership experience.
- Product Quality: improve traceability between customer complaints and field issues to manufacturing processes and suppliers and avoid future recalls.
- Ecosystem Monetization: increase value capture through an ecosystem of infotainment services, electrification, insurance, and shared mobility services.
The common theme across all the use cases is that telemetry data enriches every insight by making it more relevant and actionable. It not only enables predictive capabilities and quicker data-driven decisions, but it also makes it easy to put insights in the hands of the right people, who can orchestrate the right solution at the right place and the right time. This requires a thoughtful approach to the democratization of information that ensures that everyone, regardless of technical skill, can access data and drive the most effective and value-accretive actions.
The Roadblocks
As automotive players strive to harness the power of connected vehicles, they are confronted with several challenges including complex data integration and standardization, security and governance, and data and organizational silos.
Complex Data Integration and Standardization
Connected vehicles generate an immense volume of data, often in diverse, complex, and even proprietary formats. Harmonizing this complex web of information across vehicle components poses a formidable challenge, and modeling it in a way that is approachable across various business units and vendors can be daunting. With hundreds of millions of connected vehicles on the road today, standardization is the key to unlocking the full potential of this data, enabling seamless collaboration among different stakeholders, interoperability among varying use cases, and contextualization with other data sets (such as digital interactions, dealer networks, and manufacturing and engineering data).
Security and Governance
With great data comes great responsibility. The sensitive nature of telemetry data (including vehicle location, vehicle identification, and PII) demands robust security measures and governance frameworks to ensure privacy and compliance. Safeguarding this wealth of information requires encryption, masking, row- and column-level access controls, and geographic data residency, among other measures, and each of these is a challenge that manufacturers are likely to have to overcome with telemetry data.
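To make masking and row- and column-level controls concrete, here is a minimal pure-Python sketch. Everything in it is an illustrative assumption rather than any platform's API: the role names, the `COLUMN_POLICY` mapping, the column names, the hard-coded salt, and the helper functions are all hypothetical. In practice these rules would typically be enforced by the governance layer of the data platform rather than in application code.

```python
import hashlib

# Hypothetical policy: which roles may see each column in clear text.
COLUMN_POLICY = {
    "vin": {"engineering"},
    "latitude": {"engineering"},
    "longitude": {"engineering"},
    "region": {"engineering", "marketing"},
    "battery_temp_c": {"engineering", "marketing"},
}


def pseudonymize(value: str, salt: str = "demo-salt") -> str:
    """Replace an identifier with a stable, non-reversible token.

    The salt is hard-coded only for illustration; a real system would
    keep it in a secrets manager.
    """
    digest = hashlib.sha256((salt + value).encode("utf-8")).hexdigest()
    return digest[:12]


def apply_column_policy(row: dict, role: str) -> dict:
    """Column-level control: pass, pseudonymize, or null out each column."""
    out = {}
    for column, value in row.items():
        if role in COLUMN_POLICY.get(column, set()):
            out[column] = value                     # role may see the value
        elif column == "vin":
            out[column] = pseudonymize(str(value))  # keep joinable, hide identity
        else:
            out[column] = None                      # hide the value entirely
    return out


def visible_rows(rows: list, role: str, allowed_regions: set) -> list:
    """Row-level control (filter by region), then column-level control."""
    return [
        apply_column_policy(row, role)
        for row in rows
        if row["region"] in allowed_regions
    ]
```

With this sketch, a marketing analyst restricted to one region would see battery temperatures and a pseudonymized vehicle token, but no raw VIN or location, while an engineer with broader access would see the full rows.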
Data and Organizational Silos
Adopting a data-driven culture is not just a technological shift; it's a holistic transformation that demands the democratization of data for non-technical users and fosters seamless data collaboration, both internally and externally. Unfortunately, data silos and organizational challenges present significant hurdles to this transformation, hindering the ability to move and innovate swiftly and deliver data and insights to the right place and people at the right time. In many cases, valuable data remains trapped within departmental silos, inaccessible to those who could leverage it for strategic decision-making and innovation. This lack of cross-functional collaboration stifles innovation and hinders the agility required in today's fast-paced automotive landscape. By democratizing data, empowering non-technical users with intuitive tools and access, and fostering a collaborative culture that encourages data sharing internally and externally, organizations can break down these silos and unlock the true potential of their data to drive informed decision-making, innovative solutions, and ultimately, success.
Building a Comprehensive Data Strategy
There are some foundational elements that a data and AI platform for connected vehicle data should include to overcome this tough terrain. A lakehouse architecture addresses the intricacies of democratizing data and AI for vehicle telemetry with three critical characteristics:
Consistent Ingestion and Processing
A modern data and AI platform provides consistent ingestion and processing for data of any format, speed, and size. Whether it's real-time telemetry streams or historical data, the platform provides automatic incremental ingestion and processing capabilities.
This makes it easier to go from raw, less structured data to progressively more curated data sets (the medallion pattern: bronze, then silver, then gold) that serve different teams and data products. With vehicle telemetry data, this often means going from highly nested sensor readings across various components in the bronze table to long, skinny key-value silver tables (vehicle, timestamp, sensor name, sensor value), and finally to gold tables that are aligned to different data teams, business processes, or data products. These gold tables often include pivoted and/or aggregated values from the silver key-value tables.
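The bronze-to-silver-to-gold progression described above can be sketched in a few lines of plain Python. This is only an illustration of the data shapes, not a Spark or Delta Lake pipeline; the record layout (`vehicle_id`, `ts`, `readings`), the dotted sensor names, and the function names are assumptions made for the example.

```python
from collections import defaultdict


def flatten(prefix, node):
    """Yield (sensor_name, value) pairs from a nested dict of readings."""
    for key, value in node.items():
        name = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            yield from flatten(name, value)   # descend into nested components
        else:
            yield name, value


def bronze_to_silver(bronze_records):
    """Nested bronze records -> long key-value rows.

    Each silver row is (vehicle_id, ts, sensor_name, sensor_value).
    """
    rows = []
    for rec in bronze_records:
        for name, value in flatten("", rec["readings"]):
            rows.append((rec["vehicle_id"], rec["ts"], name, value))
    return rows


def silver_to_gold(silver_rows, sensors):
    """Pivot silver rows: one row per (vehicle, ts), one column per sensor.

    Sensors missing at a given timestamp are filled with None.
    """
    wide = defaultdict(dict)
    for vehicle_id, ts, name, value in silver_rows:
        if name in sensors:
            wide[(vehicle_id, ts)][name] = value
    return [
        {"vehicle_id": vehicle_id, "ts": ts, **{s: cols.get(s) for s in sensors}}
        for (vehicle_id, ts), cols in sorted(wide.items())
    ]
```

A battery team might call `silver_to_gold(silver, ["battery.temp_c", "battery.soc"])`, while a driving-behavior team pivots a different set of sensors from the same silver rows. That is the main appeal of keeping the silver layer in a generic key-value shape.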
Open, Efficient Storage
To handle the sheer volume and velocity of telemetry data, the platform stores data in an efficient, open table format (Delta Lake) on cheap, resilient cloud object storage. Delta Lake combines efficient ACID transactions (insert, delete, update, merge, etc.) with change data capture (CDC), data versioning, and time travel, providing a full audit trail. Because the format is open source, it is accessible from most modern compute engines, which reduces lock-in and drives optionality across tools and vendors. This provides a single source of truth for all data to be used in downstream data and AI products, enabling data engineers, data scientists, and analysts to be 40-65% more productive.
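The idea behind data versioning and time travel can be illustrated with a toy in-memory table. This is a conceptual sketch only and not how Delta Lake is implemented (Delta Lake keeps a transaction log over data files rather than full copies of each snapshot); the `VersionedTable` class and its methods are hypothetical names invented for this example.

```python
import copy


class VersionedTable:
    """Toy table in which every commit produces a new, immutable version."""

    def __init__(self):
        self._log = []  # version n is stored at index n

    def commit(self, rows):
        """Publish a full snapshot as the next version; return its number."""
        self._log.append(copy.deepcopy(rows))
        return len(self._log) - 1

    def read(self, version=None):
        """Read the latest snapshot, or an older one ("time travel")."""
        if not self._log:
            return []
        if version is None:
            version = len(self._log) - 1
        return copy.deepcopy(self._log[version])

    def upsert(self, key, new_rows):
        """Merge rows by key into the latest snapshot and commit the result."""
        current = {row[key]: row for row in self.read()}
        for row in new_rows:
            current[row[key]] = row   # update existing key or insert new one
        return self.commit(list(current.values()))


# Usage: version 0 stays readable after a later merge changes the data.
table = VersionedTable()
v0 = table.commit([{"vehicle_id": "v1", "odometer_km": 100}])
v1 = table.upsert("vehicle_id", [
    {"vehicle_id": "v1", "odometer_km": 150},
    {"vehicle_id": "v2", "odometer_km": 10},
])
```

Because older versions are never modified, an auditor can compare `table.read(v0)` with `table.read(v1)` to see exactly what a merge changed.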
Unified Governance, Security, and Integration
The linchpin of this solution lies in its ability to offer unified governance, security, sharing, and integration. By centralizing these critical aspects, the platform not only ensures the protection and compliance of data but also drives optionality in how data products are built. Telemetry data owners can control how data is modeled, secured, served, etc. to federated data teams that want to consume telemetry data and build their own data and AI products with it. This flexibility empowers manufacturers to tailor data solutions to their specific needs, fostering a culture of innovation.
The Data Intelligence Platform infused with Generative AI
The Databricks Lakehouse brings together these pillars of consistent ingestion and curation for all data into an open, efficient, and governed lakehouse. It also provides a place for distributed teams to develop and share data and AI products on top of the governed telemetry data securely and compliantly.
When Generative AI is brought to the Lakehouse, you get a new level of data intelligence. The Databricks Data Intelligence Platform includes an intelligence engine that uses Generative AI to understand the characteristics and semantics of your data. This is used to optimize the performance, cost, and experience throughout the platform. Governed data is further democratized from front-line workers to the C-suite with native natural language interfaces and assistants. Finally, the Data Intelligence Platform provides the tools, patterns, and models to build your own Generative AI applications directly on your data.
If done right, this strategy will help automotive OEMs and mobility companies find more users, especially non-technical users who can interact with data with natural language interfaces and make better decisions. Examples of this could include software engineers who want to understand how connected features are performing with end-users, mechanical engineers who want to understand the reliability of electro-mechanical systems, electrical engineers who want to understand trends of battery performance and EV charging experience, and marketing professionals who want to personalize their communications to customers.
To learn more about governance, generative AI, and the Databricks Data Intelligence Platform, see the following resources: