Winning on and off the field with cutting-edge baseball analytics
Cincinnati Reds drive faster insights and smarter decisions with Databricks
3–5x improvement in latency with serverless Databricks Workflows
Hundreds of thousands of compute minutes saved daily
Up to 83% decrease in pipeline runtime

In the high-stakes world of major league baseball, every decision and action can be the difference between winning and losing. Understanding how real-time data can provide a competitive advantage, the Cincinnati Reds looked to modernize their legacy infrastructure that was complex to maintain at scale. To speed up their workloads and achieve greater operational efficiency, they prioritized serverless capabilities as part of their transformation strategy. By adopting the Databricks Data Intelligence Platform, they have completely transformed how they manage and analyze large datasets. With Databricks Workflows, processes are automated and data pipelines are delivering real-time insights that drive smarter, faster decisions — giving players and coaches the tools they need to win at the highest level.
Legacy systems strike out on delivering timely insights
The Cincinnati Reds, one of Major League Baseball’s oldest franchises, faced legacy infrastructure and data challenges that slowed the feedback loops they rely on to make smarter, faster decisions and win more games. Over the past decade, the volume and complexity of their data streams had grown by up to 100x, driven by Internet of Things (IoT) devices and sensors that track player performance, in-stadium activity and fan engagement. Their on-premises systems struggled to handle and process this data. Processing billions of rows and queries often took hours or even days, making real-time applications impractical. Scalability was limited, forcing manual troubleshooting that consumed valuable time.
Latency had also become a critical factor, with users expecting immediate access to insights and reports that were previously available only after long delays. The data also needed to be accessible and usable for different roles, from data scientists who process large amounts of data and train models to help the team make predictions, to less technical users within the business who just need small pieces of information to help drive strategic decision-making. Additionally, application developers required programmatic access to the data stores.
“A lot of the stuff we were doing before was just troubleshooting issues like missing rows or unprocessed data. We’d spend hours just fixing those problems instead of actually driving insights or improving performance. Our team couldn’t keep up with the increasing volume and variety of data, and this manual approach left us at a disadvantage, especially with the demand for real-time insights in a sport that moves as fast as baseball,” Bryce Dugar, Data Engineering Manager at the Cincinnati Reds, explained.
The Cincinnati Reds realized they needed a more scalable and efficient solution as their data needs rapidly expanded. Recognizing the limitations of their traditional systems, the Reds adopted the Databricks Data Intelligence Platform to automate workflows and enable real-time data access. Specifically, with Databricks Workflows — the unified orchestration tool for data, analytics and AI — they sought to more easily define, manage and monitor jobs with multiple tasks for ETL, reduce manual intervention and deliver more timely insights for improved decision-making.
Unifying data orchestration with Databricks Workflows
With the Databricks Platform serving as a core piece of their data infrastructure, Databricks Workflows became central to solving the Reds’ challenges. The team tailored orchestration to their specific needs and integrated it via APIs, which let them trigger jobs with parameters and gave them greater control over each workflow. Analysts gained independence to develop and execute notebooks with improved traceability and iterative processing capabilities.
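As a rough illustration of what triggering a job with parameters over an API can look like, the sketch below builds a request for the Databricks Jobs API `run-now` endpoint. It is not the Reds’ actual integration: the workspace host, token, job ID and the `game_date` parameter are all hypothetical placeholders, and the request is only constructed, never sent.

```python
import json
import urllib.request


def build_run_now_request(host, token, job_id, job_parameters):
    """Build (but do not send) a Jobs API 'run now' request.

    All argument values passed in below are hypothetical placeholders.
    """
    payload = {"job_id": job_id, "job_parameters": job_parameters}
    return urllib.request.Request(
        url=f"{host}/api/2.1/jobs/run-now",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_run_now_request(
    host="https://workspace.example.invalid",  # hypothetical workspace URL
    token="PLACEHOLDER_TOKEN",                 # hypothetical access token
    job_id=123,                                # hypothetical job ID
    job_parameters={"game_date": "2024-06-01"},  # hypothetical parameter
)

print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
# Passing `request` to urllib.request.urlopen would submit it to a real workspace.
```

Keeping the parameters in the request body means the same job definition can be reused for different inputs, which is one way an external application could drive a workflow without editing it.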
The team also moved to a serverless architecture, which significantly reduced latency. Previously, jobs ran on job clusters and, even with pools, were often delayed while waiting for compute to spin up.
“With serverless Databricks Workflows, we’ve achieved a 3–5x improvement in latency. What used to take 10 minutes now takes just 2–3 minutes, significantly reducing processing times. This has enabled us to deliver faster feedback loops for players and coaches, ensuring they get the insights they need in near real time to make actionable decisions,” Bryce said.
In addition to these time savings, the team saw a 65–80% reduction in VM costs by transitioning to serverless Databricks Workflows, making the solution not only faster but also significantly more cost-efficient. With the Reds running 15,000–20,000 workflow steps daily, this efficiency saved hundreds of thousands of compute minutes each day, enabling new workloads and faster report delivery.
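A back-of-the-envelope check shows how the daily savings could add up. The sketch below combines the step counts above with the per-step runtimes quoted in this article (10–15 minutes before, 2–3 minutes after). It assumes every step saw a saving in that range, which the article does not state, so the result is an order-of-magnitude estimate only.

```python
# Figures quoted in the article.
STEPS_PER_DAY = (15_000, 20_000)   # workflow steps run daily (low, high)
BEFORE_MINUTES = (10, 15)          # per-step runtime before serverless (low, high)
AFTER_MINUTES = (3, 2)             # per-step runtime after (slowest, fastest)

# Assumption (not from the article): every step saves time in this range.
min_saving_per_step = BEFORE_MINUTES[0] - AFTER_MINUTES[0]  # 10 - 3 = 7 minutes
max_saving_per_step = BEFORE_MINUTES[1] - AFTER_MINUTES[1]  # 15 - 2 = 13 minutes

low_estimate = STEPS_PER_DAY[0] * min_saving_per_step
high_estimate = STEPS_PER_DAY[1] * max_saving_per_step

print(f"Estimated minutes saved per day: {low_estimate:,} to {high_estimate:,}")
print(f"Estimated hours saved per day: {low_estimate / 60:,.0f} to {high_estimate / 60:,.0f}")
```

Under these assumptions the estimate runs from roughly one hundred thousand to a few hundred thousand minutes a day, the same order of magnitude as the savings reported above.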
The team transitioned from Azure Data Factory to Databricks. “With Databricks, we’ve completely transformed how we handle data. Moving to a serverless architecture and leveraging serverless Databricks Workflows has not only streamlined our pipelines but also dramatically reduced latency by up to 83%,” Bryce said. “What used to take an hour can now be done in 10 minutes, allowing us to run more processes, deliver reports faster and provide near-instant feedback for coaching and player development. It’s a game changer for how we make real-time decisions in a fast-paced sport like baseball.”
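The article quotes its latency gains in two ways: as a speedup factor (3–5x) and as a percentage reduction (up to 83%). The short sketch below shows how the two relate, using only the before and after times quoted in the article; the helper function names are illustrative.

```python
def percent_reduction(before, after):
    """Share of the original runtime that was eliminated, in percent."""
    return (before - after) / before * 100


def speedup(before, after):
    """How many times faster the new runtime is."""
    return before / after


# "What used to take an hour can now be done in 10 minutes"
print(f"60 -> 10 min: {percent_reduction(60, 10):.0f}% reduction, "
      f"{speedup(60, 10):.0f}x speedup")

# "What used to take 10 minutes now takes just 2-3 minutes"
for after in (3, 2):
    print(f"10 -> {after} min: {percent_reduction(10, after):.0f}% reduction, "
          f"{speedup(10, after):.1f}x speedup")
```

Going from an hour to 10 minutes removes about 83% of the runtime, and going from 10 minutes to 2–3 minutes is a speedup of roughly 3.3x to 5x, which matches both figures in the quotes.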
The platform’s serverless environment removed constraints related to memory and compute cores. This allowed the team to handle large, complex data queries more efficiently, ultimately supporting faster and more accurate insights for game-day decisions. The traceability of workflows from data ingestion to the final output also improved accountability, making it easier for the team to monitor, troubleshoot and optimize their data pipelines. Data scientists and engineers no longer face system instability, session crashes or lengthy processing times, enabling smoother workflows and higher job satisfaction. Furthermore, the modular, automated nature of Databricks Workflows significantly lowered the overhead for developers, freeing them up to focus on high-value work.
Faster data serves up game-winning decisions
Transitioning to serverless compute significantly reduced job latency, cutting the runtime of pipeline steps from 10–15 minutes to as little as 2–3 minutes. With Databricks Workflows streamlining processes, ETL tasks and data science workflows now operate with enhanced speed and agility.
This efficiency ensures rapid post-game data availability, empowering coaches with near real-time insights to make critical adjustments and provide immediate feedback to players.
“With Databricks, the time savings we’ve realized is almost immeasurable — enabling things we couldn’t even consider before. That shift has fundamentally changed how we operate and how quickly we can deliver insights to coaches, players and analysts,” Bryce concluded. This transformational shift has solidified Databricks Workflows as a critical enabler of the Reds’ data-driven strategy, empowering the team to make smarter, faster decisions in the high-stakes world of professional baseball.