Matei Zaharia

Matei is the CTO and co-founder of Databricks and an Associate Professor of Computer Science at UC Berkeley. He started the Apache Spark project during his Ph.D. program at UC Berkeley in 2009 and has worked on other widely used data and AI software, including MLflow, Delta Lake, and DBRX. His most recent research is about combining large language models (LLMs) with external data sources, such as search systems, and improving their efficiency and result quality. Matei’s research was recognized through the 2014 ACM Doctoral Dissertation Award and the U.S. Presidential Early Career Award for Scientists and Engineers (PECASE).

Matei Zaharia's posts

Announcements

December 2, 2025/6 min read

Completing the Lakehouse Vision: Open Storage, Open Access, Unified Governance

Announcements

June 11, 2025/5 min read

What Is a Lakebase?

Announcements

June 11, 2025/12 min read

MLflow 3.0: Build, Evaluate, and Deploy Generative AI with Confidence

Announcements

June 11, 2025/6 min read

Introducing Agent Bricks: Auto-Optimized Agents Using Your Data

Announcements

May 14, 2025/4 min read

Databricks + Neon

Mosaic Research

April 8, 2025/9 min read

The Power of Fine-Tuning on Your Data: Quick Fixing Bugs with LLMs via Never Ending Learning (NEL)

Mosaic Research

December 17, 2024/12 min read

Benchmarking Domain Intelligence

Generative AI

November 12, 2024/7 min read

AI Agent Systems: Modular Engineering for Reliable Enterprise AI Applications

Mosaic Research

October 8, 2024/10 min read

The Long Context RAG Capabilities of OpenAI o1 and Google Gemini

Data Engineering

October 2, 2024/10 min read

Generating Coding Tests for LLMs: A Focus on Spark SQL

Mosaic Research

August 12, 2024/19 min read

Long Context RAG Performance of LLMs

Generative AI

July 22, 2024/7 min read

Enhancing LLM-as-a-Judge with Grading Notes
