
Proprietary Foundation Model Serving

Serve state-of-the-art proprietary foundation models for real-time and batch inference workloads. You can build applications on high-quality proprietary generative AI models from multiple vendors directly on the Databricks platform, without engaging each vendor separately.


* For Azure customers who have an Azure Commit with Databricks, Databricks may make this service available as an ADI Service that integrates with Azure Databricks. The ADI Service is sold and invoiced by Databricks. Contact Sales to get access.
1. Azure Databricks, as a first-party service on Microsoft Azure, offers unified billing and support by Microsoft
   The Premium tier on Azure Databricks corresponds to the Enterprise tier on AWS and GCP

Proprietary Foundation Model Serving DBU rates

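OpenAI

| Model | Endpoint type | Context length | Pay Per Token: Input (DBU / 1M tokens) | Pay Per Token: Output (DBU / 1M tokens) | Pay Per Token: Cache writes (DBU / 1M tokens) | Pay Per Token: Cache reads (DBU / 1M tokens) | Batch Inference (DBU / hour) |
|---|---|---|---|---|---|---|---|
| GPT 5.5 | Global | Short | 71.429 | 428.571 | 71.429 | 7.143 | 214.286 |
| GPT 5.5 | In-geo | Short | 78.572 | 471.428 | 78.572 | 7.857 | 235.715 |
| GPT 5.4 / 5.5 Pro | Global | Short | 428.571 | 2,571.429 | 428.571 | 42.857 | 1,142.857 |
| GPT 5.4 / 5.5 Pro | In-geo | Short | 471.428 | 2,828.572 | 471.428 | 47.143 | 1,257.143 |
| GPT 5.4 / 5.5 Pro | Global | Long | 857.142 | 3,857.144 | 857.142 | 85.714 | 1,142.857 |
| GPT 5.4 / 5.5 Pro | In-geo | Long | 942.856 | 4,242.858 | 942.856 | 94.286 | 1,257.143 |
| GPT 5.4 | Global | Short | 35.714 | 214.286 | 35.714 | 3.571 | 192.857 |
| GPT 5.4 | In-geo | Short | 39.285 | 235.715 | 39.285 | 3.929 | 212.143 |
| GPT 5.4 | Global | Long | 71.428 | 321.429 | 71.428 | 7.143 | 192.857 |
| GPT 5.4 | In-geo | Long | 78.571 | 353.572 | 78.571 | 7.857 | 212.143 |
| GPT 5.4 mini | Global | All Lengths | 10.714 | 64.286 | 10.714 | 1.071 | 107.143 |
| GPT 5.4 mini | In-geo | All Lengths | 11.786 | 70.714 | 11.786 | 1.179 | 117.857 |
| GPT 5.4 nano | Global | All Lengths | 2.857 | 17.857 | 2.857 | 0.286 | 71.429 |
| GPT 5.4 nano | In-geo | All Lengths | 3.143 | 19.643 | 3.143 | 0.314 | 78.571 |
| GPT 5.2/5.3 Codex | Global | All Lengths | 25.000 | 200.000 | 25.000 | 2.500 | n/a |
| GPT 5.2/5.3 Codex | In-geo | All Lengths | 27.500 | 220.000 | 27.500 | 2.750 | n/a |
| GPT 5.2 | Global | All Lengths | 25.000 | 200.000 | 25.000 | 2.500 | 184.286 |
| GPT 5.2 | In-geo | All Lengths | 27.500 | 220.000 | 27.500 | 2.750 | 202.714 |
| GPT 5.1 | Global | All Lengths | 17.857 | 142.857 | 17.857 | 1.786 | 131.429 |
| GPT 5.1 | In-geo | All Lengths | 19.643 | 157.143 | 19.643 | 1.965 | 144.571 |
| GPT 5.1 Codex Max | Global | All Lengths | 17.857 | 142.857 | 17.857 | 1.786 | n/a |
| GPT 5.1 Codex Max | In-geo | All Lengths | 19.643 | 157.143 | 19.643 | 1.965 | n/a |
| GPT 5 | Global | All Lengths | 17.857 | 142.857 | 17.857 | 1.786 | 131.429 |
| GPT 5 | In-geo | All Lengths | 19.643 | 157.143 | 19.643 | 1.965 | 144.571 |
| GPT 5 mini | Global | All Lengths | 3.571 | 28.571 | 3.571 | 0.357 | 71.429 |
| GPT 5 mini | In-geo | All Lengths | 3.929 | 31.429 | 3.929 | 0.393 | 78.571 |
| GPT 5.1 Codex Mini | Global | All Lengths | 3.571 | 28.571 | 3.571 | 0.357 | n/a |
| GPT 5.1 Codex Mini | In-geo | All Lengths | 3.929 | 31.429 | 3.929 | 0.393 | n/a |
| GPT 5 nano | Global | All Lengths | 0.714 | 5.714 | 0.714 | 0.071 | 53.571 |
| GPT 5 nano | In-geo | All Lengths | 0.786 | 6.286 | 0.786 | 0.078 | 58.929 |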
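
As a rough illustration of how the per-token rates translate into DBU consumption, the sketch below multiplies token counts by the GPT 5.4 mini Global rates listed above (10.714 input, 64.286 output, 10.714 cache writes, 1.071 cache reads, all in DBU per 1M tokens). It assumes each token category is metered separately and summed, which this page does not spell out; the function name and the usage figures are hypothetical. Converting DBUs to a currency amount depends on your DBU price, which is not shown here.

```python
from decimal import Decimal

# DBU per 1M tokens for GPT 5.4 mini, Global endpoint (copied from the table).
RATES = {
    "input": Decimal("10.714"),
    "output": Decimal("64.286"),
    "cache_writes": Decimal("10.714"),
    "cache_reads": Decimal("1.071"),
}
TOKENS_PER_UNIT = Decimal(1_000_000)


def pay_per_token_dbus(usage: dict[str, int]) -> Decimal:
    """Estimate DBUs for a usage dict mapping token category -> token count.

    Assumption: categories are billed independently and added together.
    """
    return sum(
        (RATES[kind] * Decimal(count) / TOKENS_PER_UNIT
         for kind, count in usage.items()),
        Decimal(0),
    )


# Hypothetical workload: 2M input tokens and 500k output tokens.
dbus = pay_per_token_dbus({"input": 2_000_000, "output": 500_000})
print(dbus)  # 21.428 + 32.143 = 53.571
```

Swapping in the In-geo rates, or the rates of another model, only requires changing the values in `RATES`.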

Proprietary Foundation Model Serving DBU rates

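Anthropic

| Model | Endpoint type | Context length | Pay Per Token: Input (DBU / 1M tokens) | Pay Per Token: Output (DBU / 1M tokens) | Pay Per Token: Cache writes (DBU / 1M tokens) | Pay Per Token: Cache reads (DBU / 1M tokens) | Batch Inference (DBU / hour) |
|---|---|---|---|---|---|---|---|
| Claude Opus 4.5 / 4.6 / 4.7 | Global | All Lengths | 71.429 | 357.143 | 89.286 | 7.143 | 178.571 |
| Claude Opus 4.5 / 4.6 / 4.7 | In-geo | All Lengths | 78.571 | 392.857 | 98.214 | 7.857 | 196.429 |
| Claude Opus 4 / 4.1 | Global/In-geo | All Lengths | 214.286 | 1,071.429 | 267.857 | 21.429 | 514.286 |
| Claude Sonnet 4.5 / 4.6 | Global | All Lengths | 42.857 | 214.286 | 53.571 | 4.286 | 214.286 |
| Claude Sonnet 4.5 / 4.6 | In-geo | All Lengths | 47.143 | 235.715 | 58.928 | 4.715 | 235.715 |
| Claude Sonnet 4 / 4.1 | Global/In-geo | Short Context | 42.857 | 214.286 | 53.571 | 4.286 | 214.286 |
| Claude Sonnet 4 / 4.1 | Global/In-geo | Long Context (>200k tokens) | 85.714 | 321.429 | 107.143 | 8.571 | 214.286 |
| Claude Haiku 4.5 | Global | All Lengths | 14.286 | 71.429 | 17.857 | 1.429 | 114.286 |
| Claude Haiku 4.5 | In-geo | All Lengths | 15.715 | 78.572 | 19.643 | 1.572 | 125.714 |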

Proprietary Foundation Model Serving DBU rates

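Google

| Model | Endpoint type | Context length | Pay Per Token: Input (DBU / 1M tokens) | Pay Per Token: Output (DBU / 1M tokens) | Pay Per Token: Cache writes (DBU / 1M tokens) | Pay Per Token: Cache reads (DBU / 1M tokens) | Batch Inference (DBU / hour) |
|---|---|---|---|---|---|---|---|
| Gemini 3.5 Flash | Global | All Lengths | 26.786 | 160.714 | 26.786 | 2.679 | coming soon |
| Gemini 3.5 Flash | In-geo | All Lengths | 29.464 | 176.786 | 29.464 | 2.946 | coming soon |
| Gemini 3.1 Flash Lite | Global | All Lengths | 4.464 | 26.786 | 4.464 | 0.446 | 89.286 |
| Gemini 3.1 Flash Lite | In-geo | All Lengths | 4.911 | 29.464 | 4.911 | 0.491 | 98.214 |
| Gemini 3.0 / 3.1 Pro | Global/In-geo | Short Context | 35.714 | 214.286 | 35.714 | 3.571 | 230.429 |
| Gemini 3.0 / 3.1 Pro | Global/In-geo | Long Context (>200k tokens) | 71.429 | 321.429 | 71.429 | 7.143 | 230.429 |
| Gemini 3.0 Flash | Global/In-geo | All Lengths | 8.929 | 53.571 | 8.929 | 0.893 | 125.000 |
| Gemini 2.5 Pro | Global/In-geo | Short Context | 22.321 | 178.571 | 22.321 | 2.232 | 164.286 |
| Gemini 2.5 Pro | Global/In-geo | Long Context (>200k tokens) | 44.643 | 267.857 | 44.643 | 4.464 | 164.286 |
| Gemini 2.5 Flash | Global/In-geo | All Lengths | 5.357 | 44.643 | 5.357 | 0.536 | 107.143 |
| Gemini 2.5 Flash Lite | Global/In-geo | All Lengths | 1.786 | 7.143 | 1.786 | 0.179 | n/a |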

NOTE: The Gemini model DBU rates shown here do not include a 20% promotional discount; promotional pricing is 20% lower than shown. The promotion runs until June 30, 2026, after which all prices revert to the DBU rates shown in this table.
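
To make the promotional arithmetic concrete, here is a minimal sketch that applies the 20% discount to a listed Gemini rate. It uses the Gemini 2.5 Pro short-context input rate (22.321 DBU / 1M tokens) as the example. The function name is hypothetical, and two details are assumptions not stated in the note: that June 30, 2026 is itself the last discounted day, and that discounted rates are rounded to three decimal places like the listed rates.

```python
from datetime import date
from decimal import Decimal

PROMO_DISCOUNT = Decimal("0.20")  # 20% off the listed rate, per the note
PROMO_END = date(2026, 6, 30)     # assumption: the end date is inclusive


def effective_gemini_rate(list_rate: Decimal, on: date) -> Decimal:
    """Return the listed rate with the promotion applied while it is running."""
    if on <= PROMO_END:
        discounted = list_rate * (1 - PROMO_DISCOUNT)
        # Assumption: round to three decimals to match the table's precision.
        return discounted.quantize(Decimal("0.001"))
    return list_rate


list_rate = Decimal("22.321")  # Gemini 2.5 Pro, short context, input
during = effective_gemini_rate(list_rate, date(2026, 6, 1))
after = effective_gemini_rate(list_rate, date(2026, 7, 1))
print(during, after)  # 17.857 22.321
```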

Pay as you go with a 14-day free trial or contact us for committed-use discounts or custom requirements.

Proprietary Foundation Model Serving FAQ