Security Addendum

This Security Addendum is incorporated into and made a part of the written agreement between Databricks, Inc. (“Databricks”) and Customer that references this Security Addendum (“Agreement”).

Databricks maintains a comprehensive documented security program that is based on industry standard security frameworks including ISO 27001 and ISO 27018 (the “Security Program”). Pursuant to the Security Program, Databricks implements and maintains administrative, physical, and technical security measures to protect the Platform Services and Support Services and the security and confidentiality of Customer Content (including any Customer Personal Data that may be contained therein) (each as defined in the Agreement) under Databricks’ control that is processed by Databricks in its provisioning of the Platform Services or Support Services (the “Security Measures”). Databricks’ compliance with this Addendum shall be deemed to satisfy any more general measures included within any Agreement, including the Service Specific Terms.

In accordance with its Security Program, Databricks will, when any Customer Content is under its control: (i) comply with the Security Measures identified below with respect to such Customer Content, and (ii) where relevant, keep documentation of such Security Measures.

Databricks regularly tests and evaluates its Security Program, and may review and update this Security Addendum at any time without notice, provided that such updates maintain or enhance security and do not materially diminish the level of protection afforded to Customer Content by these Security Measures.

  1. Deployment Model
    1. Shared Responsibility. Databricks operates in a shared responsibility model, where both Databricks and the Customer maintain security responsibilities. This is covered in more detail in the Documentation.
    2. Architecture. Databricks is a hybrid platform-as-a-service offering. The components responsible for managing and controlling the Platform Services are referred to as the “Databricks Control Plane” and are hosted within a Databricks Cloud Service Provider account. The compute resources that perform data processing operations are referred to as the “Data Plane”. For certain Cloud Service Providers, the Data Plane may either be deployed in the Customer’s Cloud Service Provider account (known as the “Customer Data Plane”) or, for Databricks Serverless Compute, in a Databricks-controlled Cloud Service Provider account (known as the “Databricks Data Plane”). “Data Plane” shall refer to both the Customer Data Plane and the Databricks Data Plane unless otherwise specified.
    3. Compute Resources. Compute resources are created and coordinated by the Databricks Control Plane and deployed into the Data Plane. Compute resources are launched as new virtual machines that leverage the latest base image and Databricks source code and do not have data from previous machines. When compute resources terminate, the data on their local hard drives is overwritten by Databricks or by the Cloud Service Provider.
    4. Data Storage of Customer Content.
      1. Customer Data and Customer Results.
        1. Customer Control. Most Customer Data is stored within the Customer’s own Cloud Service Provider account at rest (e.g., within Customer’s AWS S3 bucket) or within other Systems under Customer’s control.  Customer may choose where this Customer Data resides (other than the DBFS root, which is deployed into a storage bucket within the applicable Cloud Service Provider in the region in which the Data Plane is deployed). Please see the Documentation for more details.
        2. Databricks Control.  Small amounts of Customer Data may be stored within the Databricks Control Plane, including Customer Results and metadata about Customer Data (e.g., contained within the metastore). Databricks offers Customers options regarding the storage of certain Customer Content within the Platform Services (e.g., the location of Customer Results created by the use of interactive notebooks). Please see the Documentation for more details.
      2. Customer Instructional Input. Customer Instructional Input is stored at rest within the Databricks Control Plane.
  2. Deployment Region. Customers may specify the region(s) where their Platform Services Workspaces are deployed. Customers can choose to deploy the Data Plane into any supported Databricks region. The Databricks Control Plane is deployed into the same region. Databricks will not, without Customers’ permission, move a Customer Workspace into a different region. See the Documentation for details specific to Customer’s Cloud Service Provider.
  3. Databricks’ Audits & Certifications. Databricks uses independent third-party auditors to assess the Databricks Security Program at least annually, as described in the following audits, regulatory standards, and certifications:
    • SOC 2 Type II (report available under NDA)
    • ISO 27001
    • ISO 27018
    • HIPAA (AWS, for HIPAA-compliant deployments)
    • PCI DSS (AWS, for PCI-compliant deployments)
  4. Administrative Controls
    1. Governance. Databricks’ Chief Security Officer leads Databricks’ Information Security Program and develops, reviews, and approves (together with other stakeholders, such as Legal, Human Resources, Finance, and Engineering) Databricks’ Security Policies (as defined below).
    2. Change Management. Databricks maintains a documented change management policy, reviewed annually, which includes, but is not limited to, evaluating changes to, or relating to, systems authentication.
    3. ISMS; Policies and Procedures. Databricks has implemented a formal Information Security Management System (“ISMS”) in order to protect the confidentiality, integrity, authenticity, and availability of Databricks’ data and information systems, and to ensure the effectiveness of security controls over data and information systems that support operations. The Databricks Security Program implemented under the ISMS includes a comprehensive set of privacy and security policies and procedures developed and maintained by the security, legal, privacy, and information security teams (“Security Policies”). The Security Policies are aligned with information security standards (such as ISO 27001) and cover topics including but not limited to: security controls when accessing Customer Workspaces; confidentiality of Customer Content; acceptable use of company technology, systems and data; processes for reporting security incidents; and privacy and security best practices. The Security Policies are reviewed and updated annually.
    4. Personnel Training. Personnel receive comprehensive training on the Security Policies upon hire, and refresher trainings are given annually. Personnel are required to certify and agree to the Security Policies, and personnel who violate the Security Policies are subject to disciplinary action, including warnings, suspension, and up to and including termination.
    5. Personnel Screening and Evaluation. All personnel undergo background checks prior to onboarding (as permitted by local law), which may include, but are not limited to, criminal record checks, employment history verification, education verification, and global sanctions and enforcement checks. Databricks uses a third-party provider to conduct screenings, which vary by jurisdiction and comply with applicable local law. Personnel are required to sign confidentiality agreements.
    6. Monitoring & Logging. Databricks employs monitoring and logging technology to help detect and prevent unauthorized access attempts to its network and equipment.
    7. Access Review. Active users with access to the Platform Services are reviewed at least quarterly and are promptly removed upon termination of employment. As part of the personnel offboarding process, all access is revoked and data assets are securely wiped.
    8. Third Party Risk Management. Databricks assesses the security compliance of applicable third parties, including vendors and subprocessors, in order to measure and manage risk. This includes, but is not limited to, conducting a security risk assessment and due diligence prior to engagement and reviewing external audit reports from critical vendors at least annually. In addition, applicable vendors and subprocessors are required to sign a data processing agreement that includes compliance with applicable data protection laws, as well as confidentiality requirements.
  5. Physical and Environmental Controls
    1. Databricks Corporate Offices. Databricks has implemented administrative, physical, and technical safeguards for its corporate offices. These include, but are not limited to, the below:
      • Visitors are required to sign in, acknowledge and accept an NDA, wear an identification badge, and be escorted by Databricks personnel while on premises
      • Databricks personnel badge into the offices
      • Badges are not shared or loaned to others without authorization
      • Physical entry points to office premises are recorded by CCTV and protected by an access card verification system at every door, allowing only authorized employees to enter
      • Equipment and other Databricks-issued assets are inventoried and tracked
      • Office Wi-Fi networks are protected with encryption, wireless rogue detection, and Network Access Control
    2. Cloud Service Provider Data Centers. Databricks regularly reviews Cloud Service Provider audits conducted in compliance with ISO 27001, SOC 1, SOC 2, and PCI-DSS. Security controls include, but are not limited to the list below:
      • Biometric facility access controls
      • Visitor facility access policies and procedures
      • 24-hour armed physical security
      • CCTV at ingress and egress
      • Intrusion detection
      • Business continuity and disaster recovery plans
      • Smoke detection sensors and fire suppression equipment
      • Mechanisms to control temperature, humidity and water leaks
      • Power redundancy with backup power supply
  6. Systems & Network Security
    1. Platform Controls.
      1. Isolation. Databricks leverages multiple layers of network security controls, including network-level isolation, for separation between the Databricks Control Plane and Customer Data Plane, and between Workspaces within the Databricks Data Plane. See the Documentation on Serverless Compute for more details on the differences between Serverless Compute and non-Serverless Compute.
      2. Firewalls & Security Groups. Firewalls are implemented as network access control lists or security groups within the Cloud Service Provider’s account. Databricks also configures local firewalls or security groups within the Customer Data Plane.
      3. Hardening.
        1. Databricks employs industry standards to harden images and operating systems under its control that are deployed within the Platform Services, including deploying baseline images with hardened security configurations (such as disabled remote root login and isolation of user code); images are regularly updated and refreshed.
        2. For Systems under Databricks control supporting the production data processing environment, Databricks tracks security configurations against industry standard baselines such as CIS and STIG.
      4. Encryption
        1. Encryption of data-in-transit. Customer Content is encrypted using cryptographically secure protocols (TLS 1.2 or higher) in transit between (1) Customer and the Databricks Control Plane and (2) the Databricks Control Plane and the Data Plane. Additionally, depending on functionality provided by the Cloud Service Provider, Customers may optionally encrypt communications between clusters within the Data Plane (e.g., by utilizing appropriate AWS Nitro instances).
        2. Encryption of data-at-rest. Customer Content is encrypted using cryptographically secure algorithms (AES-128 or equivalent or better) while at rest within the Databricks Control Plane. Additionally, depending on functionality provided by the Cloud Service Provider, Customers may optionally encrypt Customer Content at rest within the Data Plane. See the Documentation on ‘local disk encryption’ for more details.
        3. Review. Cryptographic standards are periodically reviewed and selected technologies and ciphers are updated in accordance with assessed risk and market acceptance of new standards.
        4. Customer Options; Responsibilities. Customers may choose to leverage additional encryption options for data in transit within the Customer Data Plane or Databricks Data Plane as described in the Documentation (e.g., Customer may utilize AWS Nitro EC2 instances within the Customer Data Plane to provide additional encryption in transit). Customer shall, based on the sensitivity of the Customer Content, configure the Platform Services and Customer Systems to encrypt Customer Content where appropriate (e.g., by enabling encryption at rest for data stored within AWS S3).
      5. Monitoring & Logging
        1. Intrusion Detection Systems. Databricks leverages security capabilities provided natively by Cloud Service Providers for security detection.
        2. Audit Logs.
          1. Generation. Databricks generates audit logs from Customer’s use of the Platform Services. The logs are designed to store information about material events within the Platform Services.
          2. Delivery. Customer may, depending on the entitlement tier of the Platform Services, enable delivery of audit logs.  It is Customer’s responsibility to configure this option.
          3. Integrity.  Databricks stores audit logs in a manner designed to protect the audit logs from tampering.
          4. Retention. Databricks stores audit logs for at least one year.
      6. Penetration Testing. Databricks conducts third-party penetration tests at least annually, employs in-house offensive security personnel, and also maintains a public bug bounty program.
      7. Vulnerability Management & Remediation. Databricks regularly runs authenticated scans against representative hosts in the SDLC pipeline to identify vulnerabilities and emerging security threats that may impact the Data Plane and Databricks Control Plane. Databricks will use commercially reasonable efforts to address critical vulnerabilities within 14 days, high-severity vulnerabilities within 30 days, and medium-severity vulnerabilities within 60 days, measured, for publicly disclosed third-party vulnerabilities, from the date a compatible, vendor-supplied patch becomes available, or, for internal vulnerabilities, from the date such vulnerability is confirmed. Databricks leverages the National Vulnerability Database’s Common Vulnerability Scoring System (CVSS) or, where applicable, the US-CERT rating, combined with an internal analysis of contextual risk, to determine criticality.
      8. Patching.
        1. Control Plane. Databricks deploys new code to the Databricks Control Plane on an ongoing basis.
        2. Data Plane. New Data Plane virtual machines use the latest applicable source code and system images upon launch and do not require Databricks to patch live systems. Customers are encouraged to restart always-on clusters on a periodic basis to take advantage of security patches.
      9. Databricks Personnel Login to Customer Workspaces.  Databricks utilizes an internal technical and organizational control tool called ‘Genie’ that permits Databricks personnel to log in to a Customer Workspace to provide support to our Customers and permits limited Databricks engineering personnel to log in to certain Platform Services infrastructure.  Customer may optionally configure certain limitations on the ability for Databricks personnel to access Customer Workspaces. Please see Documentation on ‘Genie’ for more details, including on which Cloud Service Providers this is offered.
    2. Corporate Controls.
      1. Access Controls
        1. Authentication. Databricks personnel are authenticated through single sign-on (SSO) and 802.1x (or similar) where applicable, and use a unique user ID and password combination together with multi-factor authentication. Privileges are granted consistent with least-privilege principles. The Security Policies prohibit personnel from sharing or reusing credentials, passwords, IDs, or other authentication information. If Customer’s identity provider supports the SAML 2.0 protocol, Customer can use Databricks’ SSO support to integrate with that identity provider.
        2. Role-Based Access Controls (RBAC). Only authorized roles are allowed to access systems processing customer data and personal data. Databricks enforces RBAC (based on security groups and access control lists) and restricts access to Customer Content based on the principle of least privilege and segregation of responsibilities and duties.
      2. Pseudonymization. Information stored in activity logs and databases is protected, where appropriate, using a unique randomized user identifier to mitigate the risk of re-identification of data subjects.
      3. Workstation Controls. Databricks enforces certain security controls on workstations used by personnel, including:
        • Full-disk encryption
        • Anti-malware software
        • Automatic screen lock after 15 minutes of inactivity
        • Secure VPN
  7. Incident Detection & Response
    1. Detection & Investigation. Databricks’ dedicated detection engineering team develops and deploys intrusion detection monitoring across its computing resources, with alert notifications sent to the Security Incident Response Team (SIRT) for triage and response. The SIRT employs an incident response framework to manage and minimize the effects of unplanned security events.
    2. Security Incidents; Security Breaches. “Security Breach” means a breach of security leading to any accidental or unlawful destruction, loss, alteration, unauthorized disclosure of, or access to Customer Data under Databricks’ control. A “Security Incident” is any actual or attempted breach of security that does not rise to the level of a Security Breach. A Security Breach shall not include an unsuccessful attempt or activity that does not compromise the security of Customer Data, including (without limitation) pings and other broadcast attacks on firewalls or edge servers, port scans, unsuccessful log-on attempts, denial of service attacks, packet sniffing (or other unauthorized access to traffic data that does not result in access beyond headers), or similar incidents. Databricks maintains a record of known Security Incidents and Security Breaches that includes description, dates and times of relevant activities, and incident disposition. Suspected and confirmed Security Incidents are investigated by security, operations, or support personnel, and appropriate resolution steps are identified and documented. For any confirmed Security Incident, Databricks will take appropriate, reasonable steps to minimize product and Customer damage or unauthorized disclosure. All incidents are logged in an incident tracking system that is subject to auditing on an annual basis.
    3. Communications & Cooperation. In accordance with applicable data protection laws, Databricks will notify Customer of a Security Breach by which Customer is impacted without undue delay after becoming aware of the Security Breach, and take appropriate measures to address the Security Breach, including measures to mitigate any adverse effects resulting from the Security Breach.
  8. Backups, Business Continuity, and Disaster Recovery
    1. Business Continuity and Disaster Recovery. Databricks Business Continuity (BC) and Disaster Recovery (DR) plans are reviewed and drills are conducted annually.
    2. Data Resiliency. Databricks performs backups of the Databricks Control Plane (including any Customer Instructional Input stored therein), generally managed via Cloud Service Provider capabilities, for data resiliency purposes in the case of a critical systems failure. While Databricks backs up Customer notebooks that persist in the Databricks Control Plane as part of its systems resiliency, those backups are maintained only for emergency recovery purposes and are not available for Customers to use on request for recovery purposes.
    3. No Data Restoration. Due to the hybrid nature of the Databricks Platform, Databricks does not provide backup for Customer Content, and Databricks is unable to restore an individual Customer’s Instructional Input upon request. To assist Customers in backing up Customer Instructional Input, Databricks provides certain features within the Platform Services (like the ability to synchronize notebooks via a customer’s Github or Bitbucket account).
    4. Self-service Access. Databricks makes available certain features within the Platform Services that permit customers to access, export and delete certain Customer Content (e.g., notebooks) contained within the Databricks Control Plane. Please see the Documentation related to ‘manage workspace storage’.
    5. Customer Managed Backups. Customers retain ownership of their Customer Content and must manage their own backups, including, to the extent applicable, enabling backup within the Systems in which the Customer Data is stored.
  9. Data Deletion.
    1. During Use. The Platform Services provide Customers with functionality that permits them to delete Customer Content under Databricks’ control.
    2. Upon Workspace Cancellation. Customer Content contained within a Customer Workspace is permanently deleted within thirty (30) days following cancellation of the Workspace.
  10. Secure Software Development Lifecycle (“SDLC”)
    1. Security Champions. Databricks Engineering and the security organization co-run a Security Champions program, in which senior engineers are trained and socialized as virtual members of the security team. Security Champions are available to all engineering staff for design or code review.
    2. Security Design Review. Feature designs are assessed by security personnel for their security impact to the Databricks Platform, for example, additions or modifications to access controls, data flows, and logging.
    3. Security Training. Engineers are required to take Secure SDLC training, including but not limited to, content provided by OWASP.
    4. Peer Code Review. All production code must be approved through a peer code review process.
    5. Change Control. Databricks’ controls are designed to securely manage assets, configurations, and changes throughout the SDLC.
    6. Code Scanning. Static and dynamic code scans are regularly run and reviewed.
    7. Penetration Testing. As part of the Security Design Review process, certain features are identified and subjected to penetration testing prior to release.
    8. Code Approval. Functional owners are required to approve code in their area of responsibility prior to the code being merged for production.
    9. Multi-Factor Authentication. Accessing the Databricks code repository requires Multi-Factor Authentication.
    10. Code Deployment. Production code is deployed via automated continuous integration / continuous deployment (CI/CD) pipeline processes.  The release management teams are separated from the engineering teams that build the product.
    11. Production Separation. Databricks separates production Platform Services Systems from testing and development Platform Services Systems.

Last Revised September 25, 2021.