Databricks Ventures
Investing in the future of data, analytics and AI
Databricks Ventures invests in innovative companies that share our view of the future for data, analytics and AI. Our focus is on supporting early- and growth-stage companies that are empowering AI in innovative ways on top of or alongside the Databricks Data Intelligence Platform. These companies share our vision for an open ecosystem and our commitment to harnessing the power of data intelligence to create the next generation of data and AI-powered companies.
Portfolio company benefits
Exclusive insight into the product roadmap and support for building deep technical integrations
Guidance and best practices from the network of Databricks mentors
Broader reach by partnering with Databricks go-to-market programs
Our investment focus
A shared vision to build the future of data, analytics and AI leveraging the power of data intelligence
Investments in early- to growth-stage companies in partnership with a lead investor
Projects and entrepreneurs that share Databricks’ commitment to building on open platforms
Our portfolio
Alation is the leader in enterprise data intelligence solutions. It helps organizations transform raw data into actionable insights with its data intelligence platform that empowers users to find, understand, govern and use data collaboratively.
Anomalo uses automated AI to detect data quality issues and identify their root causes before anyone else notices them. Customers can get ahead of data issues by automatically detecting them as soon as they appear. Because Anomalo can detect, diagnose and resolve issues quickly, businesses can feel confident in their data.
Arcion is a cloud-native, distributed CDC-based data replication platform that makes building real-time data pipelines simple. Arcion helps enterprises eliminate brittle pipelines and data silos. (Acquired by Databricks)
Catalyst is the leading platform to drive growth through your customers. Trusted by top revenue leaders at global B2B brands, Catalyst guides sales and success teams to turn customers into your number one engine for growth. (Acquired by Totango)
Celebal Technologies is a Databricks Global Elite consulting partner delivering data and AI solutions across industries with deep expertise in industry co-pilots, SAP data modernization and generative AI.
Pioneered at MIT and proven at over 10% of Fortune 500 companies, Cleanlab is the fastest and easiest way to automatically fix issues (label errors, ambiguous data, outliers, duplicates, etc.) and improve the reliability of enterprise AI and analytics solutions built on error-prone, real-world data. Data is the currency of enterprise AI: Cleanlab increases the value of data.
Cube's universal semantic layer, Cube Cloud, helps companies bring consistency, context and trust to business data. Any data source can be unified, governed, optimized and integrated into any data application: internal, external, human or bot-facing.
dbt is a data transformation framework. Users can work directly within their data lakehouse or warehouse to quickly produce trusted data sets for reporting, ML modeling and operational workflows.
Entrada empowers Databricks customers to unlock their data’s full potential by offering expert services in modernizing data platforms to drive business objectives and monetize data-centric solutions.
Galileo is the leading platform for GenAI evaluation and observability. Powered by Evaluation Foundation Models, Galileo supports AI teams across the AI lifecycle — from offline development to online monitoring and protection.
Glean is the AI-powered work assistant that connects and understands all your enterprise knowledge to bring you the answers you need.
Hex is a platform for collaborative analytics and data science. Users can connect to data, analyze in a collaborative SQL and Python-powered notebook, and share work as interactive data apps that anyone can use.
Hightouch is a data activation platform that helps organizations unify customer profiles in the lakehouse, and activate audiences across marketing channels. Together, Hightouch and Databricks form a Composable CDP.
The Hunters SOC platform empowers security teams to automatically identify and respond to incidents that matter, helping teams mitigate real threats faster and more reliably than SIEMs.
The Immuta Data Security Platform enables organizations to unlock value from their cloud data by protecting it and providing secure access. The platform provides sensitive data discovery, security and access control, and data activity monitoring.
Labelbox is building a collaborative training data platform that makes it easy to create and manage labeled data, enabling rapid deployment of AI applications.
Lovelytics, a leading Databricks consulting partner, provides enterprise data platform design, data science, cloud, data visualization, and AI services that give organizations better, deeper, and faster insights.
Matillion is the data productivity cloud that gets data business-ready, faster — with enterprise-scale load, transform, sync and orchestration — for insights, analytics, data science, machine learning and AI.
Mistral AI is a European startup with global ambitions to be a pioneer in the realm of generative artificial intelligence. The company is committed to making generative AI more open, portable, independent and accessible to all.
Neon is a modern, developer-friendly Postgres built for the cloud. Neon separates storage and compute to efficiently autoscale projects, support database branching and provide “bottomless” storage.
Perplexity advances the way people discover and share information using AI-powered search — providing instant answers and information on any topic, with up-to-date sources to help people discover, research and learn faster.
Prophecy is a low-code data engineering platform designed to make more data users productive on Databricks, with complete lakehouse support.
Revelate’s data fulfillment platform provides a suite of capabilities for data sharing and data commercialization for customers to fully realize the value of their data.
Snowplow’s Behavioral Data Platform enables organizations to generate first-party customer data with real-time event streaming into the lakehouse for machine learning-powered personalization and customer 360 applications.
Tecton, the machine learning feature platform company, enables data teams to build, centralize, share, and serve production-ready ML features for offline training and online inference, at scale.
Unstructured transforms organizations’ complex, unstructured data, such as PDF, PPTX and HTML files, into formats compatible with LLMs so employees can chat with their internal data.
Voyage AI builds best-in-class embedding models and rerankers for accurate and efficient search and RAG. Developed by top researchers, Voyage AI’s models and tools outperform in accuracy, latency and costs with smaller vectors and flexible licensing.
XponentL is the leader in data products, driving insights and enabling AI in complex environments.
Build your startup on Databricks
Databricks for Startups offers free credits, expert advice and go-to-market support.