Mosaic AI Foundation Model Serving
Two ways to purchase
Access and query state-of-the-art open foundation models to build generative AI applications without maintaining your own model deployment.
Foundation Model Serving DBU rates and Throughput
Model | Pay-Per-Token Serving: DBU / 1M input tokens (Global) | Pay-Per-Token Serving: DBU / 1M output tokens (Global) | Provisioned Throughput Serving: DBU rate per hour (Global) | Provisioned Throughput Serving: throughput band¹ (max tokens / sec)² |
---|---|---|---|---|
Llama 3 70B | 14.286 | 42.857 | 212.143 | 670 |
DBRX | 32.143 | 96.429 | 212.143 | 600 |
Llama 2 70B | 28.571 | 28.571 | 157.143 | 635 |
Mixtral 8x7B | 21.429 | 21.429 | 290.857 | 1,700 |
Llama 3 8B | 3.571 | 10.714 | 106.000 | 3,600 |
MPT 30B | 14.286 | 14.286 | 112.000 | 580 |
Llama 2 13B | 13.571 | 13.571 | 78.571 | 1,580 |
MPT 7B | 7.143 | 7.143 | 20.000 | 2,450 |
BGE Large | 1.429 | 1.429 | N/A | N/A |
1: The throughput band is the model-specific maximum throughput (tokens per second) provided at the per-hour rate above. Provisioned Throughput Serving provides model throughput in increments of the model's throughput band. For higher throughput, the customer sets an appropriate multiple of the throughput band and is charged that multiple of the per-hour rate.
2: Shown for serving on Azure. Some values differ on AWS, where they are charged at a different price.
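The band arithmetic in footnote 1 can be sketched in a few lines of Python. This is an illustration only, not a Databricks API: the function names are hypothetical, the 10,000 tokens/sec target is an invented example, and the sketch assumes bands are purchased in whole-number multiples.

```python
import math


def bands_needed(target_tokens_per_sec: float, band_tokens_per_sec: float) -> int:
    """Smallest whole number of throughput bands that covers the target.

    Assumption: bands are bought in whole-number multiples.
    """
    if target_tokens_per_sec <= 0 or band_tokens_per_sec <= 0:
        raise ValueError("throughput values must be positive")
    return math.ceil(target_tokens_per_sec / band_tokens_per_sec)


def hourly_dbus(target_tokens_per_sec: float,
                band_tokens_per_sec: float,
                dbu_per_hour_per_band: float) -> float:
    """DBUs consumed per hour: number of bands times the per-band hourly rate."""
    n = bands_needed(target_tokens_per_sec, band_tokens_per_sec)
    return n * dbu_per_hour_per_band


# Llama 3 8B row from the table: band of 3,600 tokens/sec at 106.000 DBU per hour.
# The 10,000 tokens/sec target is a hypothetical workload.
print(bands_needed(10_000, 3_600))        # 3
print(hourly_dbus(10_000, 3_600, 106.0))  # 318.0
```

Multiplying the hourly DBU figure by the regional $ / DBU unit price gives the hourly dollar cost.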
Pay-Per-Token Serving Pricing Examples
Model | Input tokens | Output tokens | Region | Unit price $ / DBU | Total Price |
---|---|---|---|---|---|
DBRX | 4,000,000 | 1,000,000 | US East | $0.070 | $15.75 |
Llama 2 70B | 4,000,000 | 1,000,000 | US East | $0.070 | $10.00 |
Mixtral 8x7B | 4,000,000 | 1,000,000 | AP (Sydney) | $0.088 | $9.43 |
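The totals in the table above follow from the per-token DBU rates: DBUs consumed for input plus DBUs consumed for output, multiplied by the regional unit price. A minimal Python sketch of that arithmetic (the function name is hypothetical, not part of any Databricks SDK):

```python
def pay_per_token_cost(input_tokens: int,
                       output_tokens: int,
                       dbu_per_1m_input: float,
                       dbu_per_1m_output: float,
                       usd_per_dbu: float) -> float:
    """Dollar cost = (input DBUs + output DBUs) * unit price per DBU."""
    input_dbus = input_tokens / 1_000_000 * dbu_per_1m_input
    output_dbus = output_tokens / 1_000_000 * dbu_per_1m_output
    return (input_dbus + output_dbus) * usd_per_dbu


# Rates and unit prices are taken from the tables on this page.
examples = {
    "DBRX": pay_per_token_cost(4_000_000, 1_000_000, 32.143, 96.429, 0.070),
    "Llama 2 70B": pay_per_token_cost(4_000_000, 1_000_000, 28.571, 28.571, 0.070),
    "Mixtral 8x7B": pay_per_token_cost(4_000_000, 1_000_000, 21.429, 21.429, 0.088),
}
for model, cost in examples.items():
    print(f"{model}: ${cost:.2f}")
```

For DBRX this works out to 4 × 32.143 + 1 × 96.429 ≈ 225 DBU, and 225 DBU × $0.070 ≈ $15.75.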
Provisioned Throughput Serving Pricing Examples
Model | Hours / month | Region | Unit price $ / DBU | Monthly Price³ |
---|---|---|---|---|
DBRX | 720 | US East | $0.070 | $10,692 |
Llama 2 70B | 720 | US East | $0.070 | $7,920 |
Mixtral 8x7B | 720 | AP (Sydney) | $0.088 | $18,429 |
3: Per throughput band
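The monthly prices above are the per-hour DBU rate multiplied by hours of use and the regional unit price, per throughput band. A minimal Python sketch of that calculation (the function name and the `bands` parameter are hypothetical and only illustrate the arithmetic):

```python
def provisioned_monthly_cost(dbu_per_hour: float,
                             hours: float,
                             usd_per_dbu: float,
                             bands: int = 1) -> float:
    """Dollar cost = hourly DBU rate * hours * unit price * number of bands."""
    if bands < 1:
        raise ValueError("at least one throughput band is required")
    return dbu_per_hour * hours * usd_per_dbu * bands


# Hourly DBU rates and unit prices are taken from the tables on this page.
dbrx = provisioned_monthly_cost(212.143, 720, 0.070)
llama2_70b = provisioned_monthly_cost(157.143, 720, 0.070)
mixtral = provisioned_monthly_cost(290.857, 720, 0.088)

print(f"DBRX: ${dbrx:,.0f}")
print(f"Llama 2 70B: ${llama2_70b:,.0f}")
print(f"Mixtral 8x7B: ${mixtral:,.0f}")
```

Passing `bands=2` doubles the result, matching the rule that a multiple of the throughput band is charged at the same multiple of the per-hour price.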
Pay as you go with a 14-day free trial or contact us for committed-use discounts or custom requirements.
Mosaic AI Model Serving FAQ
Our regional prices are based on the regional cost of infrastructure supporting our serverless products.