Introducing Serverless Support for AWS Instance Profiles

Easily implement uniform data access configurations across the platform

Summary

  • Simplified Data Access: Serverless compute inherits AWS instance profiles and Spark configurations for secure, centralized data access across workloads.
  • Improved Security: Leverages IAM-based controls, reducing risks tied to embedded credentials or manual configurations.
  • Enhanced User Experience: Automates setup with existing configurations, enabling developers to focus on building data-driven solutions.

Introducing Serverless Support for AWS Instance Profiles: Uniform Data Access

At Databricks, we continuously strive to simplify data access and drive innovation across our platform. Today, we’re excited to announce serverless support for AWS instance profiles. Customers using serverless compute for notebooks, jobs, and Delta Live Tables (DLT) pipelines can now access data stored in AWS cloud services through an instance profile.

Why This Matters

Traditionally, configuring access to data stored in AWS required managing credentials directly or embedding them in configurations, which introduced operational complexities and security risks. With this update, serverless compute inherits the same data access configuration as serverless SQL Warehouses, including:

  • AWS Instance Profiles: A secure, scalable way to access data in Amazon S3 without embedding credentials.
  • Spark Configurations: Workspace-wide Spark settings to streamline access to external Hive Metastores and third-party integrations.

It’s now easier than ever to unify secure access across serverless Jobs, Notebooks, DLT pipelines, and SQL Warehouses, especially for customers still migrating their data to Unity Catalog.

Key Benefits

  1. Easier Adoption of Serverless Compute: Customers can now confidently use serverless compute to access data managed outside of Unity Catalog or stored in arbitrary AWS locations, removing dependency on DBFS mount points or embedded credentials.
  2. Seamless Integration with Existing Configurations: Serverless compute clusters automatically inherit instance profiles and Spark configurations already set up for SQL Warehouses, simplifying the transition for admins.
  3. Enhanced Security and Governance: This update supports robust IAM-based access control, aligning with enterprise governance models and compliance requirements.
  4. Streamlined User Experience: Developers can focus on building and deploying workloads without worrying about low-level access configurations or managing credentials.

What’s New in the User Experience?

Workspace admins can configure data access for serverless compute directly in the familiar Compute management interface. Here’s how it works:

  • Existing SQL Warehouse Configurations: If your workspace already uses an instance profile or Spark configurations for SQL Warehouses, serverless compute clusters inherit these settings automatically, with no additional setup required.
  • Setting Up New Configurations: Admins can easily specify instance profiles and optional Spark configurations to enable serverless compute access to non-Unity Catalog data. These settings apply uniformly across SQL Warehouses and serverless compute.
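
As an illustration of the optional Spark configurations mentioned above, here is a minimal sketch that renders settings for an external Hive Metastore as one "key value" pair per line. The helper name `render_data_access_config`, the host name, the metastore version, and the secret scope and key names are all hypothetical, and the exact set of keys the data access settings accept is an assumption to check against the documentation.

```python
def render_data_access_config(settings: dict[str, str]) -> str:
    """Render Spark settings as one 'key value' pair per line.

    Hypothetical helper for illustration; not part of any Databricks API.
    """
    lines = []
    for key, value in settings.items():
        # A key with whitespace would be ambiguous in a 'key value' line.
        if not key or any(ch.isspace() for ch in key):
            raise ValueError(f"invalid configuration key: {key!r}")
        lines.append(f"{key} {value}")
    return "\n".join(lines)


# Assumed example values. Credentials are secret references rather than
# literal strings, so nothing sensitive is embedded in the configuration.
hive_metastore_settings = {
    "spark.sql.hive.metastore.version": "3.1.0",
    "spark.sql.hive.metastore.jars": "maven",
    "spark.hadoop.javax.jdo.option.ConnectionURL":
        "jdbc:mysql://metastore.example.com:3306/hive",
    "spark.hadoop.javax.jdo.option.ConnectionDriverName":
        "org.mariadb.jdbc.Driver",
    "spark.hadoop.javax.jdo.option.ConnectionUserName":
        "{{secrets/example-scope/metastore-user}}",
    "spark.hadoop.javax.jdo.option.ConnectionPassword":
        "{{secrets/example-scope/metastore-password}}",
}

config_text = render_data_access_config(hive_metastore_settings)
print(config_text)
```

Keeping the settings in a dictionary makes them easy to review in version control before an admin pastes the rendered text into the workspace settings.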

Impact on Data Teams

This capability benefits several roles on a data team:

  • Data Analysts and Engineers: Seamlessly access legacy Hive Metastore data, enabling rapid data exploration and job execution without waiting for Unity Catalog migration.
  • Admins and Security Teams: Leverage AWS instance profiles to enforce least-privilege access, minimizing the risk of inadvertent privilege escalation.
  • Developers: Integrate with third-party tools and data services effortlessly, thanks to consistent Spark configuration support.
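
To make the least-privilege point concrete, the following sketch builds a read-only IAM policy document for a single S3 bucket prefix, of the kind that could be attached to the role behind an instance profile. The function name `read_only_s3_policy`, the bucket name, and the prefix are hypothetical, and a real workload may need additional actions (for example, write access or `s3:GetBucketLocation`), so treat this as a starting point rather than a recommended policy.

```python
import json


def read_only_s3_policy(bucket: str, prefix: str = "") -> dict:
    """Build an IAM policy document granting read-only access to one prefix.

    Hypothetical helper for illustration only.
    """
    prefix = prefix.strip("/")
    if prefix:
        object_resource = f"arn:aws:s3:::{bucket}/{prefix}/*"
    else:
        object_resource = f"arn:aws:s3:::{bucket}/*"

    # Listing is granted on the bucket itself and, when a prefix is given,
    # restricted to keys under that prefix.
    list_statement = {
        "Sid": "ListBucket",
        "Effect": "Allow",
        "Action": ["s3:ListBucket"],
        "Resource": [f"arn:aws:s3:::{bucket}"],
    }
    if prefix:
        list_statement["Condition"] = {
            "StringLike": {"s3:prefix": [prefix, f"{prefix}/*"]}
        }

    # Object reads are granted only on objects under the prefix.
    read_statement = {
        "Sid": "ReadObjects",
        "Effect": "Allow",
        "Action": ["s3:GetObject"],
        "Resource": [object_resource],
    }
    return {"Version": "2012-10-17", "Statement": [list_statement, read_statement]}


# Assumed bucket and prefix names.
policy = read_only_s3_policy("example-legacy-data", "hive/warehouse")
print(json.dumps(policy, indent=2))
```

Scoping both statements to a prefix means serverless workloads that inherit the instance profile can read the legacy tables but cannot write to or list the rest of the bucket.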

Get Started Today

Serverless support for AWS instance profiles is available now in Databricks. For detailed setup instructions, visit our documentation.
