Streaming on the Databricks Data Intelligence Platform helps Statsig accelerate product growth
10B+ events processed every day and scaling 2x every 2 months
Rate of developer agility, unparalleled time to value
4 months from founding to product launch
Statsig helps developers make better decisions by introducing end-to-end observability for each product update they launch. From system performance to customer behavior, Statsig hands developers and their teams the data they need to make growth a fundamental capability of their business. Since Statsig’s founding in early 2021, Databricks has been its trusted strategic partner for processing 10B+ data events each day. By automatically running A/B tests for each feature gate on any device, in any part of the application stack, at any scale, Statsig lets customers validate a feature’s impact on customer behavior and application usage, so they never again have to fly blind on how a new feature will perform. With the help of streaming data pipelines on the Databricks Data Intelligence Platform, this 360° view of core business metrics empowers Statsig customers to 10x the number of real-time experiments they run and the successful features they deliver.
Helping customers to build and grow quickly
Whenever Statsig onboards a new customer, it ingests the full volume of that company’s raw event streams, adding massive amounts of real-time data to its existing volume every day. Since its start in early 2021, Statsig has grown exponentially, so scaling quickly enough to keep up with demand is its primary challenge. Streaming is the modern paradigm for processing large data volumes, and Statsig has relied on Databricks streaming data pipelines from the beginning. “Scaling is a big issue for us, and Databricks is particularly good at that,” says Pablo Beltran, Software Engineer at Statsig.
As a new startup, Statsig used Databricks to establish infrastructure, write code and iterate quickly. With Databricks, Statsig developers and data scientists could run an ad hoc query just as easily as they could run and test notebooks, with ad hoc mode and production mode available in the same place. “We shipped our product in 4 months [from the time the company was founded]. We could not have done that without Databricks,” says Timothy Chan, Data Science Lead at Statsig.
For Statsig customers, Databricks helps up-level analytics and reach product goals in two ways. The first is making experimentation accessible. “For customers who have never run experiments before, there is an ‘aha’ moment of, ‘This was so easy; I can’t believe I can see and understand what is going on with my users.’ Once they are able to see how their users are being affected, they conceive even more ideas for experiments they can run,” says Chan.
The second is velocity: customers who already know the value of experimentation realize how much more Statsig can help them do. “Their big insight is that, because we make it so easy to set up experiments and read results, they can 10x their experimentation velocity in just a few months,” adds Chan. At the core of this velocity is the timeliness and reliability of the data flowing through streaming data pipelines built on the Databricks Data Intelligence Platform. Naturally, real-time data is key to successful experimentation. The closer to real time Statsig’s data pipelines run, the sooner its customers can make feature adjustments that maximize their own users’ experience.
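As a rough illustration of the kind of comparison an A/B test on a feature gate involves, the sketch below runs a two-proportion z-test on conversion counts for a control group and a treatment group. It is not Statsig’s stats engine, whose methods this story does not describe; the function name, the sample counts and the choice of test are all assumptions made for illustration.

```python
import math


def two_proportion_z_test(conv_a: int, n_a: int, conv_b: int, n_b: int):
    """Return (z, two-sided p-value) for the difference in conversion rates.

    Group A is the control (gate off) and group B is the treatment (gate on).
    Hypothetical helper for illustration only.
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled conversion rate under the null hypothesis of no difference.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Made-up counts: 100 of 1,000 control users convert, 130 of 1,000 treated users do.
z, p = two_proportion_z_test(100, 1000, 130, 1000)
print(f"z = {z:.2f}, p = {p:.3f}")
```

The fresher the event counts feeding a test like this, the sooner a team can act on the result, which is why pipeline latency matters for experimentation.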
For new engineers who join the Statsig team, ramping up on Databricks is faster and simpler than on other solutions. Even engineers who are not data scientists or data engineers, and who have no previous knowledge of the platform, need only a month or so to start fixing, optimizing and keeping track of streaming pipelines. Beltran explains, “With the Databricks Data Intelligence Platform, the actual streaming mechanics have been abstracted away; I can write batch pipelines and Databricks turns that into a streaming pipeline. This has made ramping up on streaming so much simpler.”
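The idea Beltran describes, writing batch logic once and having it run incrementally, can be sketched in plain Python. This is a toy model and not the Databricks API: the function names and the `gate` field are hypothetical, and a real pipeline would rely on the platform’s own streaming engine rather than a hand-written loop.

```python
from collections import Counter
from typing import Dict, Iterable, Iterator, List


def exposures_per_gate(events: Iterable[Dict[str, str]]) -> Counter:
    """Batch-style logic: count events per feature gate."""
    return Counter(event["gate"] for event in events)


def run_as_stream(micro_batches: Iterable[List[Dict[str, str]]]) -> Iterator[Counter]:
    """Apply the same batch function to each micro-batch and merge the results.

    Yields a snapshot of the running totals after every micro-batch.
    """
    running = Counter()
    for batch in micro_batches:
        running += exposures_per_gate(batch)
        yield Counter(running)


events = [
    {"gate": "new_checkout"},
    {"gate": "dark_mode"},
    {"gate": "new_checkout"},
    {"gate": "dark_mode"},
    {"gate": "new_checkout"},
]

# One pass over all the data versus three small batches arriving over time.
batch_result = exposures_per_gate(events)
stream_result = list(run_as_stream([events[:2], events[2:4], events[4:]]))[-1]
print(batch_result == stream_result)
```

The engineer only writes `exposures_per_gate`; the incremental bookkeeping lives in one place, which is the part a managed platform takes over.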
Databricks helps Statsig move fast
Statsig helps developers move fast. With Statsig, customers can create experiments within minutes and get a full picture of how users are responding faster than ever before. Databricks enables Statsig to move as quickly as its customers need it to. Bootstrapping clusters is simple; without the complexities endemic to other platforms, engineers using Databricks can write a notebook, configure the cluster and launch into production mode easily. “Without Databricks, the development cycle of writing and testing our code would be really long. We would have to send code to a Spark server, run that code, wait for it to finish and read the result. It would probably fail 30 times and each cycle would take 10 to 15 minutes,” says Beltran. “On Databricks, you can test the code while you’re writing the notebook. The velocity of that is immense. There’s literally a button where you can create a job from the notebook itself. Within two minutes you can have that job running production-level data for further testing.”
Part of moving quickly means cutting out the middleman. Statsig data scientists can write production-level code themselves using Databricks as their UI. This is critical for Statsig in particular because the features it is building, including its stats engine, are heavily influenced by data science and advanced statistical methodologies. The team’s ability to get code directly into production reduces the time and cost of its development cycle.
Databricks empowers exponential growth
Using Databricks, Statsig can stream data faster, which means it ingests data faster, starts jobs earlier and lands jobs on time. Databricks also helps Statsig keep pace with its massive data volume. “We charge our customers by the amount of data they send us, so the more data that Databricks can support, the better,” says Chan.
These aren’t small jobs. Statsig’s data pipelines ingest more than 10 billion events a day, and that volume has been doubling every two months. Its systems are certified to the highest compliance standards in the industry for managing and securing data, and they serve those billions of user interactions at 99.9% availability with built-in redundancy.
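To put that doubling rate in perspective, the short calculation below projects daily event volume forward from the 10 billion figure. It assumes growth continues at exactly the same rate, which this story does not claim; the function name and the chosen horizons are illustrative.

```python
def projected_daily_events(current: int, months: float,
                           doubling_period_months: float = 2.0) -> float:
    """Project daily volume, assuming it doubles every `doubling_period_months`."""
    return current * 2 ** (months / doubling_period_months)


BASELINE_EVENTS_PER_DAY = 10_000_000_000  # "more than 10 billion events a day"

for months in (2, 6, 12):
    projected = projected_daily_events(BASELINE_EVENTS_PER_DAY, months)
    print(f"after {months:>2} months: {projected:,.0f} events/day")
```

Under that assumption, a year of doubling every two months is six doublings, or a 64-fold increase, which is why scaling headroom matters more to Statsig than current volume alone.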
Streaming on the lakehouse has also made it easier for Statsig to manage transient failures because a failed job picks up where it left off rather than requiring manual intervention. “Databricks streaming is a lot more hands off and requires a lot less support than other jobs,” says Chan.
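The pick-up-where-it-left-off behavior can be illustrated with a minimal checkpointing sketch. This is a toy model, not how Databricks implements recovery internally: the checkpoint file format, the function names and the simulated failure are all assumptions made for illustration.

```python
import json
import os
import tempfile


class TransientError(Exception):
    """Stands in for a temporary failure such as a lost connection."""


def process_with_checkpoint(events, checkpoint_path, handle):
    """Process events in order, saving the next offset after each success."""
    start = 0
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            start = json.load(f)["next_offset"]
    for offset in range(start, len(events)):
        handle(events[offset])
        # Persist progress so that a restart resumes after this event.
        with open(checkpoint_path, "w") as f:
            json.dump({"next_offset": offset + 1}, f)


def run_demo():
    """Run a job that fails once mid-stream and is retried automatically."""
    events = [f"event-{i}" for i in range(5)]
    processed = []
    state = {"failed_once": False}

    def handle(event):
        if event == "event-3" and not state["failed_once"]:
            state["failed_once"] = True
            raise TransientError("simulated transient failure")
        processed.append(event)

    attempts = 0
    with tempfile.TemporaryDirectory() as tmp:
        checkpoint_path = os.path.join(tmp, "checkpoint.json")
        while True:
            attempts += 1
            try:
                process_with_checkpoint(events, checkpoint_path, handle)
                break
            except TransientError:
                continue  # retry; the checkpoint tells the job where to resume
    return processed, attempts


processed, attempts = run_demo()
print(processed, attempts)
```

The retry resumes at the failed event instead of starting over, so no event is processed twice and nobody has to step in to restart the job by hand.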