Seamless Integration of Databricks Unified Analytics Platform and RStudio Server Makes R More Accessible and Scalable
San Francisco, CA – June 27, 2018 – Databricks, the leader in unified analytics, founded by the original creators of Apache Spark™, today announced a partnership with RStudio, provider of a free and open-source integrated development environment for R, to increase the productivity of data science teams. The partnership will allow the two companies to seamlessly integrate Databricks’ Unified Analytics Platform with the RStudio Server, simplifying R programming on big data. The RStudio and Databricks integration removes the barriers that stop most R-based machine learning and artificial intelligence (AI) projects.
Hundreds of organizations are leveraging Databricks’ Unified Analytics Platform as a simplified approach for data science and data engineering teams to unify data processing with AI technologies. Unified analytics solutions provide collaboration capabilities for data scientists and data engineers to work effectively across the entire development-to-production lifecycle. Data science teams can use a range of languages in Databricks’ Unified Analytics Platform, and R is increasingly popular for advanced statistical analysis.
RStudio provides the most popular way for data science teams to analyze data with R through open-source and enterprise-ready tools for the R computing environment. By integrating both solutions, data scientists can easily use RStudio from within a Databricks implementation. Data science teams are better positioned to collaborate with data engineering and lines of business to accelerate AI initiatives.
“Unifying data with machine learning continues to be the biggest barrier when building machine learning models. Data science teams use so many technologies and systems to manage data and machine learning – working in silos and hindering the iterative process needed to achieve AI,” said Michael Hoff, senior vice president of business development and partners at Databricks. “Our technology integration with the RStudio Server eliminates the need for data teams to spend valuable time ramping up on new tools. Data scientists can leverage a familiar IDE, quickly access and prepare high quality data sets, and automatically run and execute R workloads at unprecedented scale.”
By leveraging the joint solution, data science teams can experience:
- Increased productivity among data science teams. The seamless integration of both solutions allows data scientists to use familiar tools and languages to run R jobs on Databricks’ Unified Analytics Platform directly from the RStudio IDE.
- Simplified access to large data sets. Remove barriers to most R-based machine learning and AI projects by bringing the data sets together in Databricks’ Unified Analytics Platform with the ability to code in RStudio. Databricks provides scalable data processing to clean, blend, and join data sets in an optimized data format.
- Distributed R computing at scale. Databricks supports R as a first-class language, offering unprecedented performance as well as the ability to auto-scale cloud-based clusters to handle the most demanding jobs, while keeping the total cost of ownership low.
“Databricks and RStudio share the same mission to make data science teams more productive,” said Tareef Kawaf, president of RStudio. “We’re confident they will appreciate having the combination of Apache Spark and the RStudio Server, or their own RStudio Server Pro, ready to go in the Databricks Unified Analytics Platform.”
Databricks’ mission is to accelerate innovation for its customers by unifying Data Science, Engineering and Business. Databricks’ founders started the Spark research project at UC Berkeley that later became Apache Spark. Databricks provides a Unified Analytics Platform powered by Apache Spark for data science teams to collaborate with data engineering and lines of business to build data products. Users achieve faster time-to-value with Databricks by creating analytic workflows that go from ETL and interactive exploration to production. The company also makes it easier for its users to focus on their data by providing a fully managed, scalable, and secure cloud infrastructure that reduces operational complexity and total cost of ownership. Databricks, venture-backed by Andreessen Horowitz, NEA and Battery Ventures, among others, has a global customer base that includes Viacom, Shell and HP. For more information, visit www.databricks.com.
Apache, Apache Spark and Spark are trademarks of the Apache Software Foundation.