Introducing Serverless Support for AWS Instance Profiles

Easily implement uniform data access configurations across the platform

Published: February 6, 2025

Summary

Simplified Data Access: Serverless compute inherits AWS instance profiles and Spark configurations for secure, centralized data access across workloads.
Improved Security: Leverages IAM-based controls, reducing risks tied to embedded credentials or manual configurations.
Enhanced User Experience: Automates setup with existing configurations, enabling developers to focus on building data-driven solutions.

Introducing Serverless Support for AWS Instance Profiles: Uniform Data Access

At Databricks, we continuously strive to simplify data access and drive innovation across our platform. Today, we’re excited to announce serverless support for AWS instance profiles. This milestone enables seamless access to data stored in AWS cloud services for customers leveraging serverless compute for notebooks, jobs, and Delta Live Tables (DLT) pipelines.

Why This Matters

Traditionally, configuring access to data stored in AWS required managing credentials directly or embedding them in configurations, which introduced operational complexities and security risks. With this update, serverless compute inherits the same data access configuration as serverless SQL Warehouses, including:

AWS Instance Profiles: A secure, scalable way to access data in Amazon S3 without embedding credentials.
Spark Configurations: Workspace-wide Spark settings to streamline access to external Hive Metastores and third-party integrations.

Serverless jobs, notebooks, DLT pipelines, and SQL Warehouses now share a single data access configuration, which is especially useful for customers still migrating their data to Unity Catalog.
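To make this concrete, here is a minimal sketch of what a serverless notebook cell might look like once the workspace instance profile grants access to a bucket. The helper name `read_s3_dataset` and the bucket path are hypothetical; `spark` is the session that Databricks notebooks provide. The point is that no access keys appear anywhere in the code.

```python
def read_s3_dataset(spark, path: str, fmt: str = "parquet"):
    """Load a dataset straight from an S3 path.

    Authorization is assumed to come from the instance profile attached
    to the workspace data access configuration, so no keys are passed.
    """
    if not path.startswith("s3://"):
        raise ValueError(f"expected an s3:// path, got: {path}")
    # DataFrameReader: pick the format, then load from the path.
    return spark.read.format(fmt).load(path)


# In a notebook (hypothetical bucket and prefix):
# df = read_s3_dataset(spark, "s3://example-bucket/events/")
# df.limit(10).show()
```

If the read fails with an access error, the usual suspects are the IAM role behind the instance profile or the bucket policy, not the notebook code.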

Key Benefits

Easier Adoption of Serverless Compute: Customers can now confidently use serverless compute to access data managed outside of Unity Catalog or stored in arbitrary AWS locations, removing dependency on DBFS mount points or embedded credentials.
Seamless Integration with Existing Configurations: Serverless compute automatically inherits instance profiles and Spark configurations already set up for SQL Warehouses, simplifying the transition for admins.
Enhanced Security and Governance: This update supports robust IAM-based access control, aligning with enterprise governance models and compliance requirements.
Streamlined User Experience: Developers can focus on building and deploying workloads without worrying about low-level access configurations or managing credentials.

What’s New in the User Experience?

Workspace admins can configure data access for serverless compute directly in the familiar Compute management interface. Here’s how it works:

Existing SQL Warehouse Configurations: If your workspace already uses an instance profile or Spark configurations for SQL Warehouses, serverless compute inherits these settings automatically, with no additional setup required.
Setting Up New Configurations: Admins can easily specify instance profiles and optional Spark configurations to enable serverless compute access to non-Unity Catalog data. These settings apply uniformly across SQL Warehouses and serverless compute.
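For admins who prefer to script this rather than use the UI, the sketch below assembles an instance profile ARN and a few external Hive Metastore settings into one payload. Treat it as an illustration under assumptions: the field names `instance_profile_arn` and `data_access_config` are modeled on the SQL Warehouses workspace configuration REST API and should be verified against the current documentation, and the ARN, JDBC URL, metastore version, and secret scope are placeholders. The helper `build_data_access_payload` is hypothetical.

```python
import json


def build_data_access_payload(instance_profile_arn: str, spark_conf: dict) -> dict:
    """Assemble a workspace data access configuration as a plain dict."""
    is_profile_arn = (
        instance_profile_arn.startswith("arn:aws:iam::")
        and ":instance-profile/" in instance_profile_arn
    )
    if not is_profile_arn:
        raise ValueError("expected an IAM instance profile ARN")
    return {
        # Assumed field names; check them against the API reference.
        "instance_profile_arn": instance_profile_arn,
        "data_access_config": [
            {"key": key, "value": value}
            for key, value in sorted(spark_conf.items())
        ],
    }


payload = build_data_access_payload(
    # Placeholder account ID and profile name.
    "arn:aws:iam::123456789012:instance-profile/example-data-access",
    {
        "spark.sql.hive.metastore.version": "2.3.9",
        "spark.hadoop.javax.jdo.option.ConnectionURL":
            "jdbc:mysql://metastore.example.com:3306/hive",
        # Secret references keep credentials out of the configuration.
        "spark.hadoop.javax.jdo.option.ConnectionUserName":
            "{{secrets/example-scope/metastore-user}}",
        "spark.hadoop.javax.jdo.option.ConnectionPassword":
            "{{secrets/example-scope/metastore-password}}",
    },
)
print(json.dumps(payload, indent=2))
```

The sketch only builds and prints the payload; it does not call any API. Because the same settings apply to SQL Warehouses and serverless compute, it is worth reviewing a change like this with both workloads in mind.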

Impact on Data Teams

This capability benefits several roles on a data team:

Data Analysts and Engineers: Seamlessly access legacy Hive Metastore data, enabling rapid data exploration and job execution without waiting for Unity Catalog migration.
Admins and Security Teams: Leverage AWS instance profiles to enforce least-privilege access, minimizing the risk of inadvertent privilege escalation.
Developers: Integrate with third-party tools and data services effortlessly, thanks to consistent Spark configuration support.
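As an illustration of the least-privilege point above, the following sketch generates a read-only IAM policy scoped to one bucket prefix. Such a policy would be attached to the IAM role behind the instance profile. The helper `read_only_s3_policy`, the bucket name, and the prefix are hypothetical, and a real deployment may need additional permissions (for example, write actions) depending on the workload.

```python
import json


def read_only_s3_policy(bucket: str, prefix: str) -> dict:
    """Build an IAM policy document allowing reads under a single prefix."""
    prefix = prefix.strip("/")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Listing is a bucket-level action, narrowed by prefix.
                "Sid": "ListPrefix",
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": [f"arn:aws:s3:::{bucket}"],
                "Condition": {"StringLike": {"s3:prefix": [f"{prefix}/*"]}},
            },
            {
                # Object reads are limited to keys under the prefix.
                "Sid": "ReadObjects",
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": [f"arn:aws:s3:::{bucket}/{prefix}/*"],
            },
        ],
    }


policy = read_only_s3_policy("example-bucket", "events/")
print(json.dumps(policy, indent=2))
```

Scoping the role this narrowly means every serverless workload that inherits the instance profile can read only what the policy names, which is the property security teams typically want to audit.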

Get Started Today

Serverless support for AWS instance profiles is available now in Databricks. For detailed setup instructions, visit our documentation.
