Databricks on Alibaba
Databricks DataInsight is a fully managed data and analytics platform based on Apache Spark™. It is built on the Databricks Runtime and Delta Lake. Integrated with Alibaba Cloud services, it ensures data security and lets you configure monitoring and alert policies as well as dynamic cluster scaling. It meets the analytics needs of data analysts, data engineers, and data scientists.
Databricks Runtime provides a 50x performance improvement over open-source Apache Spark™
Streaming & Batch integration
Databricks Delta Lake provides ACID transaction capabilities for data lake analytics, processing both batch and streaming datasets
Databricks DataInsight meets the analytics needs of data scientists, data engineers and business analysts, and provides an interactive and collaborative Notebook environment
Real-time Data Insight
Separating compute from storage reduces data redundancy, gives multiple audiences access to the same data, lowers storage costs, and lets compute and storage scale independently
A fully managed analytics platform
Start fully managed clusters quickly with simple operations, and pay only for what you use.
Set the number of nodes according to job needs, with support for high-availability clusters.
Supports three ECS instance type families: general-purpose, compute-optimized, and memory-optimized.
Interactive collaborative work
Multiple user roles share data and collaborate interactively.
A collaborative workspace that provides an interactive job execution mode, supports Apache Spark, PySpark, SparkR, and Spark SQL jobs, and displays analytics results visually.
Metadata for databases and tables can be shared between clusters without duplication.
Fully compatible with Apache Spark ecosystem
100% compatible with open-source Apache Spark.
Performance-optimized Databricks Runtime based on Apache Spark, with I/O optimized for Alibaba Cloud OSS, providing a faster and more efficient analytics engine.
Databricks Delta Lake
An optimized version of Delta Lake integrated with Alibaba Cloud services.
Integrated with Alibaba Cloud RAM to control permissions by user and role, ensuring data security.
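Delta Lake is described earlier as providing ACID transactions over both batch and streaming datasets. The toy sketch below may help make that idea concrete. It is not Delta Lake's implementation or API: `ToyTable`, `commit`, `snapshot`, and `changes_since` are hypothetical names invented for illustration, and the table lives in memory only. It shows the general pattern of an ordered commit log, where a write succeeds only if no other writer has committed first, a batch reader sees a consistent snapshot, and a streaming reader picks up only the commits made after its last position.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ToyTable:
    """Toy in-memory table backed by an ordered commit log (illustration only)."""

    commits: List[List[dict]] = field(default_factory=list)

    def version(self) -> int:
        # The table version is simply the number of commits so far.
        return len(self.commits)

    def commit(self, rows: List[dict], expected_version: int) -> bool:
        # Optimistic concurrency: the write is accepted only if no other
        # writer has committed since this writer read `expected_version`.
        if expected_version != len(self.commits):
            return False
        self.commits.append(list(rows))
        return True

    def snapshot(self, version: Optional[int] = None) -> List[dict]:
        # Batch-style read: every row committed up to the given version.
        upto = len(self.commits) if version is None else version
        return [row for batch in self.commits[:upto] for row in batch]

    def changes_since(self, version: int) -> List[dict]:
        # Streaming-style read: only rows committed after `version`.
        return [row for batch in self.commits[version:] for row in batch]


table = ToyTable()
first_ok = table.commit([{"id": 1}, {"id": 2}], expected_version=0)
# A second writer that still believes the table is at version 0 is rejected.
stale_ok = table.commit([{"id": 99}], expected_version=0)

cursor = table.version()  # a streaming reader remembers where it stopped
second_ok = table.commit([{"id": 3}], expected_version=cursor)

batch_ids = [r["id"] for r in table.snapshot()]
stream_ids = [r["id"] for r in table.changes_since(cursor)]
print(first_ok, stale_ok, second_ok, batch_ids, stream_ids)
```

In this sketch the rejected writer would have to re-read the current version and retry, which is the usual way optimistic schemes resolve conflicts. The same log serves both read styles, so batch and streaming consumers work from one copy of the data.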
Big Data Analysis Engine That Unifies Batch and Stream Processing
Deeply integrated with Alibaba Cloud services and features, such as the data governance and data lineage of DataWorks and Machine Learning Platform for AI (PAI), to provide a more comprehensive data solution