Today we are excited to announce the completion of the first phase of the Databricks Enterprise Security (DBES) framework. We are proud to say that this makes Databricks the first and only company to provide comprehensive enterprise security on top of Apache Spark.
Hundreds of organizations have deployed Databricks to improve the productivity of their data teams, power their production Spark applications, and democratize data access. As Databricks continues to gain adoption across security-minded industries such as financial services and healthcare, we are also focused on enabling them to maximize the value from their data while satisfying strict security and compliance requirements in their respective industries (such as Sarbanes-Oxley or HIPAA).
Traditionally, enterprise organizations only had security solutions that addressed parts of their big data infrastructure. Today, enterprises demand holistic security that covers the full spectrum of their big data lifecycle: from file processing, big data clusters, code management, job workflows, and application deployments to dashboards and reports.
The Databricks just-in-time data platform takes a holistic approach to solving the enterprise security challenge by building all the facets of security — encryption, identity management, role-based access control, data governance, and compliance standards — natively into the data platform with DBES.
In short, DBES will provide holistic security across the entire big data lifecycle.
DBES builds upon the extensive access management and encryption functionality that Databricks already provides. With the completion of DBES Phase One today, enterprises gain the ability to control access to Apache Spark clusters on an individual basis, manage user identity with a SAML 2.0 compatible identity management provider service, and audit activity on the platform end to end.
The Cluster Access Control Lists, or cluster ACLs, give Databricks administrators the ability to fine-tune the autonomy of Databricks users based on the enterprise security policy. For example, one can strictly limit the ability to launch new clusters to control costs while giving teams the complete freedom to run code on existing clusters in a self-service manner.
Specifically, an administrator will be able to define whether users are allowed to perform the following actions on an individual basis:
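To make the cost-control example concrete, here is a minimal Python sketch of a per-user access check. It is an illustration only, not the Databricks implementation or API: the `Action` names, the `ClusterAcl` class, and the example user addresses are all hypothetical, and the sketch assumes a deny-by-default policy.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, Set


class Action(Enum):
    # Hypothetical action names chosen for this illustration.
    CREATE_CLUSTER = "create_cluster"
    ATTACH_TO_CLUSTER = "attach_to_cluster"


@dataclass
class ClusterAcl:
    """Per-user grants; anything not explicitly granted is denied (assumed policy)."""

    grants: Dict[str, Set[Action]] = field(default_factory=dict)

    def allow(self, user: str, action: Action) -> None:
        # Record that this individual user may perform the action.
        self.grants.setdefault(user, set()).add(action)

    def is_allowed(self, user: str, action: Action) -> bool:
        # Unknown users and ungranted actions both fall through to "deny".
        return action in self.grants.get(user, set())


# Only the administrator may launch clusters (cost control), while the
# analyst can still run code on clusters that already exist.
acl = ClusterAcl()
acl.allow("admin@example.com", Action.CREATE_CLUSTER)
acl.allow("admin@example.com", Action.ATTACH_TO_CLUSTER)
acl.allow("analyst@example.com", Action.ATTACH_TO_CLUSTER)
```

With these grants, `acl.is_allowed("analyst@example.com", Action.CREATE_CLUSTER)` is false while the same user's `ATTACH_TO_CLUSTER` check is true, which mirrors the self-service setup described above.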
Enterprises will now be able to use a SAML 2.0 compatible identity provider to authenticate and authorize access to the Databricks platform. Since many enterprises already utilize an identity provider service, and virtually all major identity providers (e.g., Okta, PingIdentity) support SAML 2.0, this will vastly simplify the setup and management of accounts on the Databricks platform. Databricks users will also enjoy a more streamlined login process, as now they can log into the platform with a single click instead of having to remember (and possibly recover) passwords.
The audit logs will provide enterprises in security-conscious industries such as healthcare or financial services the tools to satisfy strict compliance requirements, such as HIPAA or Sarbanes-Oxley. The Databricks audit logs are a comprehensive record of activity on the platform, allowing enterprises to monitor the detailed usage patterns of Databricks as the business requires. This allows a central authority to easily reconstruct critical events with:
These logs are stored in a human-readable format so that one can explore them easily. The administrator can also analyze the information in the audit logs using the Databricks platform itself.
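As a rough idea of what such an analysis could look like, the Python sketch below counts actions per user and reconstructs one user's timeline. The log layout is an assumption: this post does not specify the format, so the sketch pretends each line is a JSON object with hypothetical `timestamp`, `user`, and `action` fields, and the sample events are invented for illustration.

```python
import json
from collections import Counter

# Invented sample data in an ASSUMED one-JSON-object-per-line layout.
SAMPLE_LOG = """\
{"timestamp": "2016-01-01T09:00:00Z", "user": "alice@example.com", "action": "login"}
{"timestamp": "2016-01-01T09:05:00Z", "user": "alice@example.com", "action": "attach_to_cluster"}
{"timestamp": "2016-01-01T09:07:00Z", "user": "bob@example.com", "action": "login"}
{"timestamp": "2016-01-01T09:30:00Z", "user": "alice@example.com", "action": "attach_to_cluster"}
"""


def parse_events(log_text):
    """Parse non-empty lines of the log into dictionaries."""
    return [json.loads(line) for line in log_text.splitlines() if line.strip()]


def actions_per_user(log_text):
    """Count how often each (user, action) pair occurs."""
    return Counter((e["user"], e["action"]) for e in parse_events(log_text))


def timeline_for_user(log_text, user):
    """Return one user's events in chronological order."""
    events = [e for e in parse_events(log_text) if e["user"] == user]
    # ISO 8601 timestamps in the same time zone sort correctly as strings.
    return sorted(events, key=lambda e: e["timestamp"])
```

The same grouping could be expressed as a Spark job when the logs are analyzed on the platform itself; plain Python is used here only to keep the sketch self-contained.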
Databricks’ vision is to empower anyone to easily build and deploy advanced analytics solutions. With the Databricks Enterprise Security framework, Databricks can satisfy the diverse (and sometimes competing) needs to secure big data in the modern enterprise, end to end. Phase One is only the beginning; stay tuned for more advances in the near future.
Interested in securing your Apache Spark workloads with Databricks? Test drive the platform with a free trial or contact us for a personalized demo.