Skip to main content
Login
      • Discover
        • For Executives
          • For Startups
            • Lakehouse Architecture
              • Mosaic Research
              • Customers
                • Featured Stories
                  • See All Customers
                  • Partners
                    • Cloud Providers
                      Databricks on AWS, Azure, GCP, and SAP
                      • Consulting & System Integrators
                        Experts to build, deploy and migrate to Databricks
                        • Technology Partners
                          Connect your existing tools to your Lakehouse
                          • C&SI Partner Program
                            Build, deploy or migrate to the Lakehouse
                            • Data Partners
                              Access the ecosystem of data consumers
                              • Partner Solutions
                                Find custom industry and migration solutions
                                • Built on Databricks
                                  Build, market and grow your business
                                • Databricks Platform
                                  • Platform Overview
                                    A unified platform for data, analytics and AI
                                    • Data Management
                                      Data reliability, security and performance
                                      • Sharing
                                        An open, secure, zero-copy sharing for all data
                                        • Data Warehousing
                                          Serverless data warehouse for SQL analytics
                                          • Governance
                                            Unified governance for all data, analytics and AI assets
                                            • Real-Time Analytics
                                              Real-time analytics, AI and applications made simple
                                              • Artificial Intelligence
                                                Build and deploy ML and GenAI applications
                                                • Data Engineering
                                                  ETL and orchestration for batch and streaming data
                                                  • Business Intelligence
                                                    Intelligent analytics for real-world data
                                                    • Data Science
                                                      Collaborative data science at scale
                                                    • Integrations and Data
                                                      • Marketplace
                                                        Open marketplace for data, analytics and AI
                                                        • IDE Integrations
                                                          Build on the Lakehouse in your favorite IDE
                                                          • Partner Connect
                                                            Discover and integrate with the Databricks ecosystem
                                                          • Pricing
                                                            • Databricks Pricing
                                                              Explore product pricing, DBUs and more
                                                              • Cost Calculator
                                                                Estimate your compute costs on any cloud
                                                              • Open Source
                                                                • Open Source Technologies
                                                                  Learn more about the innovations behind the platform
                                                                • Databricks for Industries
                                                                  • Communications
                                                                    • Media and Entertainment
                                                                      • Financial Services
                                                                        • Public Sector
                                                                          • Healthcare & Life Sciences
                                                                            • Retail
                                                                              • Manufacturing
                                                                                • See All Industries
                                                                                • Cross Industry Solutions
                                                                                  • Customer Data Platform
                                                                                    • Cybersecurity
                                                                                    • Migration & Deployment
                                                                                      • Data Migration
                                                                                        • Professional Services
                                                                                        • Solution Accelerators
                                                                                          • Explore Accelerators
                                                                                            Move faster toward outcomes that matter
                                                                                          • Training and Certification
                                                                                            • Learning Overview
                                                                                              Hub for training, certification, events and more
                                                                                              • Training Overview
                                                                                                Discover curriculum tailored to your needs
                                                                                                • Databricks Academy
                                                                                                  Sign in to the Databricks learning platform
                                                                                                  • Certification
                                                                                                    Gain recognition and differentiation
                                                                                                    • University Alliance
                                                                                                      Want to teach Databricks? See how.
                                                                                                    • Events
                                                                                                      • Data + AI Summit
                                                                                                        • Data + AI World Tour
                                                                                                          • Data Intelligence Days
                                                                                                            • Event Calendar
                                                                                                            • Blog and Podcasts
                                                                                                              • Databricks Blog
                                                                                                                Explore news, product announcements, and more
                                                                                                                • Databricks Mosaic Research Blog
                                                                                                                  Discover the latest in our Gen AI research
                                                                                                                  • Data Brew Podcast
                                                                                                                    Let’s talk data!
                                                                                                                    • Champions of Data + AI Podcast
                                                                                                                      Insights from data leaders powering innovation
                                                                                                                    • Get Help
                                                                                                                      • Customer Support
                                                                                                                        • Documentation
                                                                                                                          • Community
                                                                                                                          • Dive Deep
                                                                                                                            • Resource Center
                                                                                                                              • Demo Center
                                                                                                                              • Company
                                                                                                                                • Who We Are
                                                                                                                                  • Our Team
                                                                                                                                    • Databricks Ventures
                                                                                                                                      • Contact Us
                                                                                                                                      • Careers
                                                                                                                                        • Working at Databricks
                                                                                                                                          • Open Jobs
                                                                                                                                          • Press
                                                                                                                                            • Awards and Recognition
                                                                                                                                              • Newsroom
                                                                                                                                              • Security and Trust
                                                                                                                                                • Security and Trust
                                                                                                                                            • Data and AI summit

                                                                                                                                              JUNE 9–12 | SAN FRANCISCO

                                                                                                                                              700+ sessions on all things data intelligence. Get ready to dive deep.

                                                                                                                                              REGISTER
                                                                                                                                            • Ready to get started?
                                                                                                                                            • Get a Demo
                                                                                                                                            Data and AI summit

                                                                                                                                            JUNE 9–12 | SAN FRANCISCO

                                                                                                                                            700+ sessions on all things data intelligence. Get ready to dive deep.

                                                                                                                                            REGISTER
                                                                                                                                            • Login
                                                                                                                                            • Try Databricks
                                                                                                                                            1. Blog
                                                                                                                                            2. /
                                                                                                                                              Product
                                                                                                                                            3. /
                                                                                                                                              Article

                                                                                                                                            Databricks Workspace Administration – Best Practices for Account, Workspace and Metastore Admins

                                                                                                                                            A tale of three admins

                                                                                                                                            Databricks Workspace Administration - Best Practices for Account, Workspace and Metastore Admins

                                                                                                                                            Published: August 26, 2022

                                                                                                                                            Product12 min read

                                                                                                                                            by Anindita Mahapatra, Mohan Mathews and Greg Wood

                                                                                                                                            Share this post

                                                                                                                                            Keep up with us

                                                                                                                                            This blog is part of our Admin Essentials series, where we discuss topics relevant to Databricks administrators. Other blogs include our Workspace Management Best Practices, DR Strategies with Terraform, and many more! Keep an eye out for more content coming soon. In past admin-focused blogs, we have discussed how to establish and maintain a strong workspace organization through upfront design and automation of aspects such as DR, CI/CD, and system health checks. An equally important aspect of administration is how you organize within your workspaces- especially when it comes to the many different types of admin personas that may exist within a Lakehouse. In this blog we will talk about the administrative considerations of managing a workspace, such as how to:

                                                                                                                                            • Set up policies and guardrails to future-proof onboarding of new users and use cases
                                                                                                                                            • Govern usage of resources
                                                                                                                                            • Ensure permissible data access
                                                                                                                                            • Optimize compute usage to make the most of your investment

                                                                                                                                            In order to understand the delineation of roles, we first need to understand the distinction between an Account Administrator and a Workspace Administrator, and the specific components that each of these roles manage.

                                                                                                                                            Account Admins Vs Workspace Admins Vs Metastore Admins

                                                                                                                                            Administrative concerns are split across both accounts (a high-level construct that is often mapped 1:1 with your organization) & workspaces (a more granular level of isolation that can be mapped various ways, i.e, by LOB). Let's take a look at the separation of duties between these three roles.

                                                                                                                                            Figure-1 Account Console
                                                                                                                                            Figure-1 Account Console

                                                                                                                                            To state this in a different way, we can break down the primary responsibilities of an Account Administrator as the following:

                                                                                                                                            • Provisioning of Principals(Groups/Users/Service) and SSO at the account level. Identity Federation refers to assigning Account Level Identities access to workspaces directly from the account.
                                                                                                                                            • Configuration of Metastores
                                                                                                                                            • Setting up Audit Log
                                                                                                                                            • Monitoring Usage at the Account level (DBU, Billing)
                                                                                                                                            • Creating workspaces according to the desired organization method
                                                                                                                                            • Managing other workspace-level objects (storage, credentials, network, etc.)
                                                                                                                                            • Automating dev workloads using IaaC to remove the human element in prod workloads
                                                                                                                                            • Turning features on/off at Account level such as serverless workloads, Delta sharing

                                                                                                                                            Figure-2 Account Artifacts
                                                                                                                                            Figure-2 Account Artifacts

                                                                                                                                            On the other hand, the primary concerns of a Workspace Administrator are:

                                                                                                                                            • Assigning appropriate Roles (User/Admin) at the workspace level to Principals
                                                                                                                                            • Assigning appropriate Entitlements (ACLs) at the workspace level to Principals
                                                                                                                                            • Optionally setting SSO at the workspace level
                                                                                                                                            • Defining Cluster Policies to entitle Principals to enable them to
                                                                                                                                              • Define compute resource (Clusters/Warehouses/Pools)
                                                                                                                                              • Define Orchestration (Jobs/Pipelines/Workflows)
                                                                                                                                            • Turning features on/off at Workspace level
                                                                                                                                            • Assigning entitlements to Principals
                                                                                                                                              • Data Access (when using internal/external hive metastore)
                                                                                                                                              • Manage Principals' access to compute resources
                                                                                                                                            • Managing external URLs for features such as Repos (including allow-listing)
                                                                                                                                            • Controlling security & data protection
                                                                                                                                              • Turn off / restrict DBFS to prevent accidental data exposure across teams
                                                                                                                                              • Prevent downloading result data (from notebooks/DBSQL) to prevent data exfiltration
                                                                                                                                              • Enable Access Control (Workspace Objects, Clusters, Pools, Jobs, Tables etc)
                                                                                                                                            • Defining log delivery at the cluster level (i.e., setting up storage for cluster logs, ideally through Cluster Policies)

                                                                                                                                            Figure-3 Workspace Artifacts
                                                                                                                                            Figure-3 Workspace Artifacts

                                                                                                                                            To summarize the differences between the account and workspace admin, the table below captures the separation between these two personas for a few key dimensions:

                                                                                                                                              Account Admin Metastore Admin Workspace Admin
                                                                                                                                            Workspace Management - Create, Update, Delete workspaces
                                                                                                                                            - Can add other admins
                                                                                                                                            Not Applicable - Only Manages assets within a workspace
                                                                                                                                            User Management - Create users, groups and service principals or use SCIM to sync data from IDPs.
                                                                                                                                            - Entitle Principals to Workspaces with the Permission Assignment API
                                                                                                                                            Not Applicable - We recommend use of the UC for central governance of all your data assets(securables). Identity Federation will be On for any workspace linked to a Unity Catalog (UC) Metastore.
                                                                                                                                            - For workspaces enabled on Identity Federation, setup SCIM at the Account Level for all Principals and stop SCIM at the Workspace Level.
                                                                                                                                            - For non-UC Workspaces, you can SCIM at the workspace level (but these users will also be promoted to account level identities).
                                                                                                                                            - Groups created at workspace level will be considered "local" workspace-level groups and will not have access to Unity Catalog
                                                                                                                                            Data Access and Management - Create Metastore(s)
                                                                                                                                            - Link Workspace(s) to Metatore
                                                                                                                                            - Transfer ownership of metastore to Metastore Admin/group
                                                                                                                                            With Unity Catalog:
                                                                                                                                            -Manage privileges on all the securables (catalog, schema, tables, views) of the metastore
                                                                                                                                            - GRANT (Delegate) Access to Catalog, Schema(Database), Table, View, External Locations and Storage Credentials to Data Stewards/Owners
                                                                                                                                            - Today with Hive-metastore(s), customers use a variety of constructs to protect data access, such as Instance Profiles on AWS, Service Principals in Azure, Table ACLs, Credential Passthrough, among others.
                                                                                                                                            -With Unity Catalog, this is defined at the account level and ANSI GRANTS will be used to ACL all securables
                                                                                                                                            Cluster Management Not Applicable Not Applicable - Create clusters for various personas/sizes for DE/ML/SQL personas for S/M/L workloads
                                                                                                                                            - Remove allow-cluster-create entitlement from default users group.
                                                                                                                                            - Create Cluster Policies, grant access to policies to appropriate groups
                                                                                                                                            - Give Can_Use entitlement to groups for SQL Warehouses
                                                                                                                                            Workflow Management Not Applicable Not Applicable - Ensure job/DLT/all-purpose cluster policies exist and groups have access to them
                                                                                                                                            - Pre-create app-purpose clusters that users can restart
                                                                                                                                            Budget Management - Set up budgets per workspace/sku/cluster tags
                                                                                                                                            - Monitor Usage by tags in the Accounts Console (roadmap)
                                                                                                                                            - Billable usage system table to query via DBSQL (roadmap)
                                                                                                                                            Not Applicable Not Applicable
                                                                                                                                            Optimize / Tune Not Applicable Not Applicable - Maximize Compute; Use latest DBR; Use Photon
                                                                                                                                            - Work alongside Line Of Business/Center Of Excellence teams to follow best practices and optimizations to make the most of the infrastructure investment

                                                                                                                                            Figure-4 Databricks Admin Persona Responsibilities
                                                                                                                                            Figure-4 Databricks Admin Persona Responsibilities

                                                                                                                                            Sizing a workspace to meet peak compute needs

                                                                                                                                            The max number of cluster nodes (indirectly the largest job or the max number of concurrent jobs) is determined by the max number of IPs available in the VPC and hence sizing the VPC correctly is an important design consideration. Each node takes up 2 IPs (in Azure, AWS). Here are the relevant details for the cloud of your choice: AWS, Azure, GCP. We'll use an example from Databricks on AWS to illustrate this. Use this to map CIDR to IP. The VPC CIDR range allowed for an E2 workspace is /25 - /16. At least 2 private subnets in 2 different availability zones must be configured. The subnet masks should be between /16-/17. VPCs are logical isolation units and as long as 2 VPCs do not need to talk, i.e. peer to each other, they can have the same range. However, if they do, then care has to be taken to avoid IP overlap. Let us take an example of a VPC with CIDR rage /16:

                                                                                                                                            VPC CIDR /16 Max # IPs for this VPC: 65,536 Single/multi-node clusters are spun up in a subnet
                                                                                                                                            2 AZs If each AZ is /17 : => 32,768 * 2 = 65,536 IPs no other subnet is possible 32,768 IPs => max of 16,384 nodes in each subnet
                                                                                                                                              If each AZ is /23 instead: => 512 * 2 = 1,024 IPs 65,536 - 1,024 = 64, 512 IPs left 512 IPs => max of 256 nodes in each subnet
                                                                                                                                            4 AZs If each AZ is /18: 16,384 * 4 = 65,536 IPs no other subnet is possible 16,384 IPs => max of 8192 nodes in each subnet

                                                                                                                                            Balancing control & agility for workspace admins

                                                                                                                                            Compute is the most expensive component of any cloud infrastructure investment. Data democratization leads to innovation and facilitating self-service is the first step towards enabling a data driven culture. However, in a multi-tenant environment, an inexperienced user or an inadvertent human error could lead to runaway costs or inadvertent exposure. If controls are too stringent, it will create access bottlenecks and stifle innovation. So, admins need to set guard-rails to allow self-service without the inherent risks. Further, they should be able to monitor the adherence of these controls. This is where Cluster Policies come in handy, where the rules are defined and entitlements mapped so the user operates within permissible perimeters and their decision-making process is greatly simplified. It should be noted that policies should be backed by process to be truly effective so that one off exceptions can be managed by process to avoid unnecessary chaos. One critical step of this process is to remove the allow-cluster-create entitlement from the default users group in a workspace so that users can only utilize compute governed by Cluster Policies. The following are top recommendations of Cluster Policy Best Practices and can be summarized as below:

                                                                                                                                            • Use T-shirt sizes to provide standard cluster templates
                                                                                                                                              • By workload size (small, medium, large)
                                                                                                                                              • By persona (DE/ ML/ BI)
                                                                                                                                              • By proficiency (citizen/ advanced)
                                                                                                                                            • Manage Governance by enforcing use of
                                                                                                                                              • Tags : attribution by team, user, use case
                                                                                                                                                • naming should be standardized
                                                                                                                                                • making some attributes mandatory helps for consistent reporting
                                                                                                                                            • Control Consumption by limiting
                                                                                                                                              • DBU Burn rate and purpose of policy
                                                                                                                                              • Auto-termination timeout, Scaling min/max size

                                                                                                                                            Compute considerations

                                                                                                                                            Unlike fixed on-prem compute infrastructure, cloud gives us elasticity as well as flexibility to match the right compute to the workload and SLA under consideration. The diagram below shows the various options. The inputs are parameters such as type of workload or environment and the output is the type and size of compute that is a best-fit.

                                                                                                                                            Figure-5 Deciding the right compute
                                                                                                                                            Figure-5 Deciding the right compute

                                                                                                                                            For example, a production DE workload should always be on automated job clusters preferably with the latest DBR, with autoscaling and using the photon engine. The table below captures some common scenarios.

                                                                                                                                            Workflow considerations

                                                                                                                                            Now that the compute requirements have been formalized, we need to look at

                                                                                                                                            • How Workflows will be defined and triggered
                                                                                                                                            • How Tasks can reuse compute amongst themselves
                                                                                                                                            • How Task dependencies will be managed
                                                                                                                                            • How failed tasks can be retried
                                                                                                                                            • How version upgrades (spark, library) and patches are applied

                                                                                                                                            These are Date Engineering and DevOps considerations that are centered around the use case and is typically a direct concern of an administrator. There are some hygiene tasks that can be monitored such as

                                                                                                                                            • A workspace has a max limit on the total number of configured jobs. But a lot of these jobs may not be invoked and need to be cleaned up to make space for genuine ones. An administrator can run checks to determine the valid eviction list of defunct jobs.
                                                                                                                                            • All production jobs should be run as a service principal and user access to a production environment should be highly restricted. Review the Jobs permissions.
                                                                                                                                            • Jobs can fail, so every job should be set for failure alerts and optionally for retries. Review email_notifications, max_retries and other properties here
                                                                                                                                            • Every job should be associated with cluster policies and tagged properly for attribution.

                                                                                                                                            DLT: Example of an ideal framework for reliable pipelines at scale

                                                                                                                                            Working with thousands of clients big and small across different industry verticals, common data challenges for development and operationalization became apparent, which is why Databricks created Delta Live Tables (DLT). It is a managed platform offering to simplify ETL workload development and maintenance by allowing creation of declarative pipelines where you specify the 'what' & not the 'how'. This simplifies the tasks of a data engineer, leading to fewer support scenarios for administrators.

                                                                                                                                            Figure-6 DLT simplifies the Admin's role of managing pipelines
                                                                                                                                            Figure-6 DLT simplifies the Admin's role of managing pipelines

                                                                                                                                            DLT incorporates common admin functionality such as periodic optimize & vacuum jobs right into the pipeline definition with a maintenance job that ensures that they run without additional babysitting. DLT offers deep observability into pipelines for simplified operations such as lineage, monitoring and data quality checks. For example, if the cluster terminates, the platform auto-retries (in Production mode) instead of relying on the data engineer to have provisioned it explicitly. Enhanced Auto-Scaling can handle sudden data bursts that require cluster upsizing and downscale gracefully. In other words, automated cluster scaling & pipeline fault tolerance is a platform feature. Turntable latencies enable you to run pipelines in batch or streaming and move dev pipelines to prod with relative ease by managing configuration instead of code. You can control the cost of your Pipelines by utilizing DLT-specific Cluster Policies. DLT also auto-upgrades your runtime engine, thus removing the responsibility from Admins or Data Engineers, and allowing you to focus only on generating business value.

                                                                                                                                            UC: Example of an ideal Data Governance framework

                                                                                                                                            Unity Catalog (UC) enables organizations to adopt a common security model for tables and files for all workspaces under a single account, which was not possible before through simple GRANT statements. By granting and auditing all access to data, tables/or files, from a DE/DS cluster or SQL Warehouse, organizations can simplify their audit and monitoring strategy without relying on per-cloud primitives. The primary capabilities that UC provides include:

                                                                                                                                            Figure-7 UC simplifies the Admin's role of managing data governance
                                                                                                                                            Figure-7 UC simplifies the Admin's role of managing data governance

                                                                                                                                            UC simplifies the job of an administrator (both at the account and workspace level) by centralizing the definitions, monitoring and discoverability of data across the metastore, and making it easy to securely share data irrespective of the number of workspaces that are attached to it.. Utilizing the Define Once, Secure Everywhere model, this has the added advantage of avoiding accidental data exposure in the scenario of a user's privileges inadvertently misrepresented in one workspace which may give them a backdoor to get to data that was not intended for their consumption. All of this can be accomplished easily by utilizing Account Level Identities and Data Permissions. UC Audit Logging allows full visibility into all actions by all users at all levels on all objects, and if you configure verbose audit logging, then each command executed, from a notebook or Databricks SQL, is captured. Access to securables can be granted by either a metastore admin, the owner of an object, or the owner of the catalog or schema that contains the object. It is recommended that the account-level admin delegate the metastore role by nominating a group to be the metastore admins whose sole purpose is granting the right access privileges.

                                                                                                                                            Recommendations and best practices

                                                                                                                                            • Roles and responsibilities of Account admins, Metastore admins and Workspace admins are well-defined and complementary. Workflows such as automation, change requests, escalations, etc. should flow to the appropriate owners, whether the workspaces are set up by LOB or managed by a central Center of Excellence.
                                                                                                                                            • Account Level Identities should be enabled as this allows for centralized principal management for all workspaces, thereby simplifying administration. We recommend setting up features like SSO, SCIM and Audit Logs at the account level. Workspace-level SSO is still required, until the SSO Federation feature is available.
                                                                                                                                            • Cluster Policies are a powerful lever that provides guardrails for effective self-service and greatly simplifies the role of a workspace administrator. We provide some sample policies here. The account admin should provide simple default policies based on primary persona/t-shirt size, ideally through automation such as Terraform. Workspace admins can add to that list for more fine-grained controls. Combined with an adequate process, all exception scenarios can be accommodated gracefully.
                                                                                                                                            • Tracking the on-going consumption for all workload types across all workspaces is visible to account admins via the accounts console. We recommend setting up billable usage log delivery so that it all goes to your central cloud storage for chargeback and analysis. Budget API (In Preview) should be configured at the account level, which allows account administrators to create thresholds at the workspaces, SKU, and cluster tags level and receive alerts on consumption so that timely action can be taken to remain within allotted budgets. Use a tool such as Overwatch to track usage at an even more granular level to help identify areas of improvement when it comes to utilization of compute resources.
                                                                                                                                            • The Databricks platform continues to innovate and simplify the job of the various data personas by abstracting common admin functionalities into the platform. Our recommendation is to use Delta Live Tables for new pipelines and Unity Catalog for all your user management and data access control.

                                                                                                                                            Finally, it's important to note that for most of these best practices, and in fact, most of the things we mention in this blog, coordination, and teamwork are tantamount to success. Although it's theoretically possible for Account and Workspace admins to exist in a silo, this not only goes against the general Lakehouse principles but makes life harder for everyone involved. Perhaps the most important suggestion to take away from this article is to connect Account / Workspace Admins + Project / Data Leads + Users within your own organization. Mechanisms such as Teams/Slack channel, an email alias, and/or a weekly meetup have been proven successful. The most effective organizations we see here at Databricks are those that embrace openness not just in their technology, but in their operations. Keep an eye out for more admin-focused blogs coming soon, from logging and exfiltration recommendations to exciting roundups of our platform features focused on management.

                                                                                                                                            Keep up with us

                                                                                                                                            Recommended for you

                                                                                                                                            Share this post

                                                                                                                                            Never miss a Databricks post

                                                                                                                                            Subscribe to the categories you care about and get the latest posts delivered to your inbox

                                                                                                                                            Sign up

                                                                                                                                            What's next?

                                                                                                                                            How to present and share your Notebook insights in AI/BI Dashboards

                                                                                                                                            Product

                                                                                                                                            November 21, 2024/3 min read

                                                                                                                                            How to present and share your Notebook insights in AI/BI Dashboards

                                                                                                                                            A screenshot of Mosaic AI Model Serving dashboard for deploying and managing fine-tuned LLaMA models.

                                                                                                                                            Product

                                                                                                                                            December 10, 2024/7 min read

                                                                                                                                            Batch Inference on Fine Tuned Llama Models with Mosaic AI Model Serving

                                                                                                                                            databricks logo
                                                                                                                                            Why Databricks
                                                                                                                                            Discover
                                                                                                                                            • For Executives
                                                                                                                                            • For Startups
                                                                                                                                            • Lakehouse Architecture
                                                                                                                                            • Mosaic Research
                                                                                                                                            Customers
                                                                                                                                            • Featured
                                                                                                                                            • See All
                                                                                                                                            Partners
                                                                                                                                            • Cloud Providers
                                                                                                                                            • Technology Partners
                                                                                                                                            • Data Partners
                                                                                                                                            • Built on Databricks
                                                                                                                                            • Consulting & System Integrators
                                                                                                                                            • C&SI Partner Program
                                                                                                                                            • Partner Solutions
                                                                                                                                            Discover
                                                                                                                                            • For Executives
                                                                                                                                            • For Startups
                                                                                                                                            • Lakehouse Architecture
                                                                                                                                            • Mosaic Research
                                                                                                                                            Customers
                                                                                                                                            • Featured
                                                                                                                                            • See All
                                                                                                                                            Partners
                                                                                                                                            • Cloud Providers
                                                                                                                                            • Technology Partners
                                                                                                                                            • Data Partners
                                                                                                                                            • Built on Databricks
                                                                                                                                            • Consulting & System Integrators
                                                                                                                                            • C&SI Partner Program
                                                                                                                                            • Partner Solutions
                                                                                                                                            Product
                                                                                                                                            Databricks Platform
                                                                                                                                            • Platform Overview
                                                                                                                                            • Sharing
                                                                                                                                            • Governance
                                                                                                                                            • Artificial Intelligence
                                                                                                                                            • Business Intelligence
                                                                                                                                            • Data Management
                                                                                                                                            • Data Warehousing
                                                                                                                                            • Real-Time Analytics
                                                                                                                                            • Data Engineering
                                                                                                                                            • Data Science
                                                                                                                                            Pricing
                                                                                                                                            • Pricing Overview
                                                                                                                                            • Pricing Calculator
                                                                                                                                            Open Source
                                                                                                                                            Integrations and Data
                                                                                                                                            • Marketplace
                                                                                                                                            • IDE Integrations
                                                                                                                                            • Partner Connect
                                                                                                                                            Databricks Platform
                                                                                                                                            • Platform Overview
                                                                                                                                            • Sharing
                                                                                                                                            • Governance
                                                                                                                                            • Artificial Intelligence
                                                                                                                                            • Business Intelligence
                                                                                                                                            • Data Management
                                                                                                                                            • Data Warehousing
                                                                                                                                            • Real-Time Analytics
                                                                                                                                            • Data Engineering
                                                                                                                                            • Data Science
                                                                                                                                            Pricing
                                                                                                                                            • Pricing Overview
                                                                                                                                            • Pricing Calculator
                                                                                                                                            Integrations and Data
                                                                                                                                            • Marketplace
                                                                                                                                            • IDE Integrations
                                                                                                                                            • Partner Connect
                                                                                                                                            Solutions
                                                                                                                                            Databricks For Industries
                                                                                                                                            • Communications
                                                                                                                                            • Financial Services
                                                                                                                                            • Healthcare and Life Sciences
                                                                                                                                            • Manufacturing
                                                                                                                                            • Media and Entertainment
                                                                                                                                            • Public Sector
                                                                                                                                            • Retail
                                                                                                                                            • View All
                                                                                                                                            Cross Industry Solutions
                                                                                                                                            • Customer Data Platform
                                                                                                                                            • Cybersecurity
                                                                                                                                            Data Migration
                                                                                                                                            Professional Services
                                                                                                                                            Solution Accelerators
                                                                                                                                            Databricks For Industries
                                                                                                                                            • Communications
                                                                                                                                            • Financial Services
                                                                                                                                            • Healthcare and Life Sciences
                                                                                                                                            • Manufacturing
                                                                                                                                            • Media and Entertainment
                                                                                                                                            • Public Sector
                                                                                                                                            • Retail
                                                                                                                                            • View All
                                                                                                                                            Cross Industry Solutions
                                                                                                                                            • Customer Data Platform
                                                                                                                                            • Cybersecurity
                                                                                                                                            Resources
                                                                                                                                            Documentation
                                                                                                                                            Customer Support
                                                                                                                                            Community
                                                                                                                                            Training and Certification
                                                                                                                                            • Learning Overview
                                                                                                                                            • Training Overview
                                                                                                                                            • Certification
                                                                                                                                            • University Alliance
                                                                                                                                            • Databricks Academy Login
                                                                                                                                            Events
                                                                                                                                            • Data + AI Summit
                                                                                                                                            • Data + AI World Tour
                                                                                                                                            • Data Intelligence Days
                                                                                                                                            • Full Calendar
                                                                                                                                            Blog and Podcasts
                                                                                                                                            • Databricks Blog
                                                                                                                                            • Databricks Mosaic Research Blog
                                                                                                                                            • Data Brew Podcast
                                                                                                                                            • Champions of Data & AI Podcast
                                                                                                                                            Training and Certification
                                                                                                                                            • Learning Overview
                                                                                                                                            • Training Overview
                                                                                                                                            • Certification
                                                                                                                                            • University Alliance
                                                                                                                                            • Databricks Academy Login
                                                                                                                                            Events
                                                                                                                                            • Data + AI Summit
                                                                                                                                            • Data + AI World Tour
                                                                                                                                            • Data Intelligence Days
                                                                                                                                            • Full Calendar
                                                                                                                                            Blog and Podcasts
                                                                                                                                            • Databricks Blog
                                                                                                                                            • Databricks Mosaic Research Blog
                                                                                                                                            • Data Brew Podcast
                                                                                                                                            • Champions of Data & AI Podcast
                                                                                                                                            About
                                                                                                                                            Company
                                                                                                                                            • Who We Are
                                                                                                                                            • Our Team
                                                                                                                                            • Databricks Ventures
                                                                                                                                            • Contact Us
                                                                                                                                            Careers
                                                                                                                                            • Open Jobs
                                                                                                                                            • Working at Databricks
                                                                                                                                            Press
                                                                                                                                            • Awards and Recognition
                                                                                                                                            • Newsroom
                                                                                                                                            Security and Trust
                                                                                                                                            Company
                                                                                                                                            • Who We Are
                                                                                                                                            • Our Team
                                                                                                                                            • Databricks Ventures
                                                                                                                                            • Contact Us
                                                                                                                                            Careers
                                                                                                                                            • Open Jobs
                                                                                                                                            • Working at Databricks
                                                                                                                                            Press
                                                                                                                                            • Awards and Recognition
                                                                                                                                            • Newsroom
                                                                                                                                            databricks logo

                                                                                                                                            Databricks Inc.
                                                                                                                                            160 Spear Street, 15th Floor
                                                                                                                                            San Francisco, CA 94105
                                                                                                                                            1-866-330-0121

                                                                                                                                            See Careers
                                                                                                                                            at Databricks

                                                                                                                                            © Databricks 2025. All rights reserved. Apache, Apache Spark, Spark, the Spark Logo, Apache Iceberg, Iceberg, and the Apache Iceberg logo are trademarks of the Apache Software Foundation.

                                                                                                                                            • Privacy Notice
                                                                                                                                            • |Terms of Use
                                                                                                                                            • |Modern Slavery Statement
                                                                                                                                            • |California Privacy
                                                                                                                                            • |Your Privacy Choices