
Two ways to use this template

Use with your coding agent
  1. Click "Copy prompt" below
  2. Paste into Cursor, Claude Code, Codex, or any coding agent
  3. Your agent builds the app, asking questions along the way so the result is exactly what you want
or
Read step-by-step

Follow the steps below to set things up manually, at your own pace.

Streaming AI Chat with Model Serving

Build a streaming AI chat experience in a Databricks App using the Vercel AI SDK with Databricks Model Serving and OpenAI-compatible endpoints.

Prerequisites

Complete the prerequisite templates first, then verify that these Databricks workspace features are enabled. If any check fails, ask your workspace admin to enable the feature.

  • Databricks CLI authenticated. Run databricks auth profiles and confirm at least one profile shows Valid: YES. If none do, authenticate with databricks auth login --host <workspace-url> --profile <PROFILE>.
  • An OpenAI-compatible chat endpoint in Model Serving. Run databricks serving-endpoints list --profile <PROFILE> and confirm at least one OpenAI-compatible chat endpoint is listed (e.g. databricks-meta-llama-3-3-70b-instruct or databricks-claude-sonnet-4). Endpoint availability varies by workspace and region; note the one you plan to set as DATABRICKS_ENDPOINT.
  • Databricks Apps enabled. Run databricks apps list --profile <PROFILE> and confirm the command succeeds (an empty list is fine). A permission or not enabled error means Apps is not available to this identity in this workspace.


1. Install AI SDK packages

bash
npm install ai@6 @ai-sdk/react@3 @ai-sdk/openai @databricks/sdk-experimental

Version note: This template uses AI SDK v6 APIs (TextStreamChatTransport, sendMessage({ text }), transport-based useChat). Tested with ai v6, @ai-sdk/react v3, and the matching @ai-sdk/openai release.

Note: @databricks/sdk-experimental is included in the scaffolded package.json. It is listed here for reference if adding AI chat to an existing project.

Optional: For pre-built chat UI components, initialize shadcn and add AI Elements:

bash
npx shadcn@latest init

AI Elements are optional prebuilt components; this basic template works without them.

2. Configure environment variables for AI Gateway

Configure your Databricks workspace ID and model endpoint:

For local development (.env):

bash
echo 'DATABRICKS_WORKSPACE_ID=<your-workspace-id>' >> .env
echo 'DATABRICKS_ENDPOINT=<your-endpoint>' >> .env
echo 'DATABRICKS_CONFIG_PROFILE=DEFAULT' >> .env

For deployment in Databricks Apps (app.yaml):

yaml
env:
  - name: DATABRICKS_WORKSPACE_ID
    value: "<your-workspace-id>"
  - name: DATABRICKS_ENDPOINT
    value: "<your-endpoint>"

Workspace ID: AppKit auto-discovers this at runtime. For explicit setup, run databricks api get /api/2.1/unity-catalog/current-metastore-assignment --profile <PROFILE> and use the workspace_id field.

Model compatibility: This template uses OpenAI-compatible models served via Databricks AI Gateway, which support the AI SDK's streaming API. The AI Gateway URL uses the /mlflow/v1 path (not /openai/v1).

Find your endpoint: Run databricks serving-endpoints list --profile <PROFILE> to see available models. Common endpoints include databricks-meta-llama-3-3-70b-instruct and databricks-claude-sonnet-4, but availability varies by workspace.
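
Optional connectivity check: before wiring the route in step 4, you can hit the gateway directly. This is a minimal sketch; it assumes DATABRICKS_TOKEN holds a valid token (see step 3 for how to obtain one) and that the gateway exposes the OpenAI-compatible /chat/completions path under /mlflow/v1, which is what the provider configuration in step 4 ultimately calls.

typescript
// Hypothetical smoke test for the AI Gateway URL and credentials.
// Assumes DATABRICKS_TOKEN, DATABRICKS_WORKSPACE_ID, and DATABRICKS_ENDPOINT are set.
const baseURL = `https://${process.env.DATABRICKS_WORKSPACE_ID}.ai-gateway.cloud.databricks.com/mlflow/v1`;

const res = await fetch(`${baseURL}/chat/completions`, {
  method: "POST",
  headers: {
    Authorization: `Bearer ${process.env.DATABRICKS_TOKEN}`,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    model: process.env.DATABRICKS_ENDPOINT,
    messages: [{ role: "user", content: "ping" }],
    max_tokens: 5,
  }),
});
console.log(res.status, await res.text());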

3. Configure authentication helper

Create a helper function that works for both local development and deployed apps:

typescript
import { Config } from "@databricks/sdk-experimental";

async function getDatabricksToken() {
  // For deployed apps, use the service principal token
  if (process.env.DATABRICKS_TOKEN) {
    return process.env.DATABRICKS_TOKEN;
  }

  // For local dev, use CLI profile auth via the Databricks SDK
  const config = new Config({
    profile: process.env.DATABRICKS_CONFIG_PROFILE || "DEFAULT",
  });
  await config.ensureResolved();
  const headers = new Headers();
  await config.authenticate(headers);
  const authHeader = headers.get("Authorization");
  if (!authHeader) {
    throw new Error(
      "Failed to get Databricks token. Check your CLI profile or set DATABRICKS_TOKEN.",
    );
  }
  return authHeader.replace("Bearer ", "");
}

This function uses the Databricks SDK auth chain, which reads ~/.databrickscfg profiles and handles OAuth token refresh. For deployed apps, set DATABRICKS_TOKEN directly.
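
If the route calls this helper on every request, you can optionally memoize the result. A minimal sketch; the 5-minute TTL is an assumption for illustration, not a documented token lifetime:

typescript
// Hypothetical cache around getDatabricksToken(); adjust the TTL to taste.
let cached: { token: string; expiresAt: number } | null = null;

async function getCachedDatabricksToken(): Promise<string> {
  if (cached && Date.now() < cached.expiresAt) {
    return cached.token;
  }
  const token = await getDatabricksToken();
  cached = { token, expiresAt: Date.now() + 5 * 60 * 1000 };
  return token;
}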

User identity in deployed apps: Databricks Apps injects user identity via request headers. Extract it with req.header("x-forwarded-email") or req.header("x-forwarded-user"). Use this for chat persistence and access control.
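
For example, a minimal sketch of an identity route (the /api/me path is illustrative; the headers are only present when the app runs behind the Databricks Apps proxy):

typescript
// Returns the identity Databricks Apps injects; null during local dev,
// where no proxy sets these headers.
app.get("/api/me", (req, res) => {
  res.json({
    email: req.header("x-forwarded-email") ?? null,
    user: req.header("x-forwarded-user") ?? null,
  });
});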

4. Add /api/chat route with streaming

Create a server route using the AI SDK's streaming support:

typescript
import { createOpenAI } from "@ai-sdk/openai";
import { streamText, type UIMessage } from "ai";

app.post("/api/chat", async (req, res) => {
  const { messages } = req.body;

  // AI SDK v6 clients send UIMessage objects with a parts array.
  // Flatten the text parts into CoreMessage format for streamText().
  const coreMessages = (messages as UIMessage[]).map((m) => ({
    role: m.role as "user" | "assistant" | "system",
    content:
      m.parts
        ?.flatMap((p) => (p.type === "text" && p.text ? [p.text] : []))
        .join("") ??
      // Fall back to a legacy flat content field if no parts are present
      (m as { content?: string }).content ??
      "",
  }));

  try {
    const token = await getDatabricksToken();
    const endpoint = process.env.DATABRICKS_ENDPOINT || "<your-endpoint>";

    // Configure Databricks AI Gateway as an OpenAI-compatible provider
    const databricks = createOpenAI({
      baseURL: `https://${process.env.DATABRICKS_WORKSPACE_ID}.ai-gateway.cloud.databricks.com/mlflow/v1`,
      apiKey: token,
    });

    // Stream the response using AI SDK v6
    const result = streamText({
      model: databricks.chat(endpoint),
      messages: coreMessages,
      maxOutputTokens: 1000,
    });

    // v6 API: pipe the text stream to the Express response
    result.pipeTextStreamToResponse(res);
  } catch (err) {
    const message = (err as Error).message;
    console.error("[chat] Streaming request failed:", message);
    res.status(502).json({
      error: "Chat request failed",
      detail: message,
    });
  }
});
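
To exercise the route without the UI, you can POST a UIMessage-shaped payload and read the plain-text stream. A sketch for a Node 18+ script; the localhost:3000 origin is an assumption about your dev server, and the message shape mirrors what TextStreamChatTransport sends in step 5.

typescript
// Hypothetical streaming smoke test against a locally running server.
const res = await fetch("http://localhost:3000/api/chat", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    messages: [
      { id: "1", role: "user", parts: [{ type: "text", text: "Hello" }] },
    ],
  }),
});

// Print chunks as they arrive to confirm token-by-token streaming.
const reader = res.body!.getReader();
const decoder = new TextDecoder();
for (;;) {
  const { done, value } = await reader.read();
  if (done) break;
  process.stdout.write(decoder.decode(value, { stream: true }));
}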

5. Render the streaming chat UI

Use useChat from the AI SDK with TextStreamChatTransport for streaming support:

tsx
import { useChat } from "@ai-sdk/react";
import { TextStreamChatTransport } from "ai";
import { useState } from "react";

export function ChatPage() {
  const [input, setInput] = useState("");

  const { messages, sendMessage, status } = useChat({
    transport: new TextStreamChatTransport({ api: "/api/chat" }),
  });

  return (
    <div className="flex flex-col h-full">
      <div className="flex-1 overflow-y-auto space-y-4 p-4">
        {messages.map((m) => (
          <div key={m.id} className={m.role === "user" ? "text-right" : ""}>
            <span className="text-sm font-medium">
              {m.role === "user" ? "You" : "Assistant"}
            </span>
            {m.parts.map((part, i) =>
              part.type === "text" ? (
                <p key={`${m.id}-${i}`} className="whitespace-pre-wrap">
                  {part.text}
                </p>
              ) : null,
            )}
          </div>
        ))}
        {status === "submitted" && <div className="p-4">Loading...</div>}
      </div>
      <form
        onSubmit={(e) => {
          e.preventDefault();
          if (input.trim()) {
            void sendMessage({ text: input });
            setInput("");
          }
        }}
        className="border-t p-4 flex gap-2"
      >
        <input
          value={input}
          onChange={(e) => setInput(e.target.value)}
          placeholder="Ask a question..."
          className="flex-1 border rounded px-3 py-2"
          disabled={status !== "ready"}
        />
        <button type="submit" disabled={status !== "ready"}>
          {status === "submitted" || status === "streaming"
            ? "Sending..."
            : "Send"}
        </button>
      </form>
    </div>
  );
}
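
Mount the component wherever your scaffold renders routes; this wrapper is illustrative:

tsx
// Hypothetical mount point; adjust to your app's actual routing.
import { ChatPage } from "./ChatPage";

export default function App() {
  return (
    <main className="h-screen">
      <ChatPage />
    </main>
  );
}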

6. Deploy and verify

bash
databricks apps deploy <app-name> --profile <PROFILE>
databricks apps list --profile <PROFILE>
databricks apps logs <app-name> --profile <PROFILE>

Open the app URL while signed in to Databricks, send a message, and verify streaming responses appear token-by-token from the AI Gateway endpoint.
