Get started

By the end of this guide, you’ll have created a Frontier Gateway group for one of your downstream customers, minted an API key bound to that group, and called your Dedicated deployment through the gateway with the key. From here, you can build a deeper group hierarchy, configure additional rate and usage limits, set up billing webhooks, and explore the full lifecycle.

Prerequisites

A Dedicated deployment of your model on Baseten.
A Baseten workspace API key with management scope, exported as BASETEN_API_KEY.
Completed Frontier Gateway onboarding with your Baseten team.

This guide assumes you’ve finished managed onboarding: your workspace is provisioned for federated keys, and your webhook signing secret is in place. If you haven’t started yet, talk to us. The /v1/gateway/ endpoints used here return 403 to workspaces that aren’t onboarded.

Step 1: Create a group

A group is the resource you create per customer, plan, project, or whichever unit of your organizational hierarchy maps to a billing or access boundary. The group owns an external identifier (your stable ID for this entity), the model slugs it’s allowed to call, and the rate and usage limits enforced on every call. API keys are minted under the group in step 2.

Create a group with POST /v1/gateway/groups. The request takes a metadata block (display name plus the external identifier), a non-empty models list pairing each model slug with its rate and usage limits, and a hierarchy block declaring the inheritance mode and an optional parent. This example creates a top-level (root) group with independent enforcement.

curl --request POST \
  --url https://api.baseten.co/v1/gateway/groups \
  --header "Authorization: Api-Key $BASETEN_API_KEY" \
  --header "Content-Type: application/json" \
  --data '{
    "metadata": {
      "name": "Acme prod",
      "external_entity_id": "cust_42"
    },
    "models": [
      {
        "slug": "your-org/your-model",
        "rate_limits": [
          { "type": "TOKEN", "unit": "MINUTE", "threshold": 1000000 },
          { "type": "REQUEST", "unit": "MINUTE", "threshold": 100 }
        ],
        "usage_limits": [
          { "type": "TOKEN", "unit": "DAY", "threshold": 10000000 }
        ]
      }
    ],
    "hierarchy": {
      "limit_enforcement": "INDEPENDENT",
      "parent_group_id": null
    }
  }'
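If you would rather issue this call from Python, the same request can be sketched with the standard library alone. The endpoint, headers, and body fields are taken from the curl example above; the helper names build_create_group_request and send are hypothetical and not part of any Baseten SDK.

```python
import json
import urllib.request

GROUPS_URL = "https://api.baseten.co/v1/gateway/groups"


def build_create_group_request(name, external_entity_id, slug, api_key):
    """Build (but do not send) the POST /v1/gateway/groups request.

    Hypothetical helper: the limits mirror the curl example and are
    placeholders to replace with your own thresholds.
    """
    payload = {
        "metadata": {"name": name, "external_entity_id": external_entity_id},
        "models": [
            {
                "slug": slug,
                "rate_limits": [
                    {"type": "TOKEN", "unit": "MINUTE", "threshold": 1_000_000},
                    {"type": "REQUEST", "unit": "MINUTE", "threshold": 100},
                ],
                "usage_limits": [
                    {"type": "TOKEN", "unit": "DAY", "threshold": 10_000_000},
                ],
            }
        ],
        # A root group: independent enforcement and no parent.
        "hierarchy": {"limit_enforcement": "INDEPENDENT", "parent_group_id": None},
    }
    return urllib.request.Request(
        GROUPS_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Api-Key {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request):
    """Send a prepared request and decode the JSON response body."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Calling send(build_create_group_request("Acme prod", "cust_42", "your-org/your-model", os.environ["BASETEN_API_KEY"])) after importing os would perform the same call as the curl command. Keeping request construction separate from sending makes the payload easy to inspect before anything goes over the network.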

The response is the new group, including the internal id you’ll use as the path parameter when minting keys:

{
  "id": "abc123hash",
  "metadata": {
    "name": "Acme prod",
    "external_entity_id": "cust_42"
  },
  "models": [
    {
      "slug": "your-org/your-model",
      "rate_limits": [
        { "type": "TOKEN", "unit": "MINUTE", "threshold": 1000000 },
        { "type": "REQUEST", "unit": "MINUTE", "threshold": 100 }
      ],
      "usage_limits": [
        { "type": "TOKEN", "unit": "DAY", "threshold": 10000000 }
      ]
    }
  ],
  "effective_models": [
    {
      "slug": "your-org/your-model",
      "rate_limits": [
        { "type": "TOKEN", "unit": "MINUTE", "threshold": 1000000, "source_group": "abc123hash" },
        { "type": "REQUEST", "unit": "MINUTE", "threshold": 100, "source_group": "abc123hash" }
      ],
      "usage_limits": [
        { "type": "TOKEN", "unit": "DAY", "threshold": 10000000, "source_group": "abc123hash" }
      ]
    }
  ],
  "hierarchy": {
    "limit_enforcement": "INDEPENDENT",
    "parent_group_id": null
  },
  "created_at": "2026-05-13T12:00:00Z"
}

Save the id. You’ll need it in step 2. The effective_models block shows the limits the runtime enforces after inheritance; for a root group it matches models exactly. See Rate and usage limits for how this changes once you add a parent.
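As a quick sanity check on a freshly created root group, you could confirm that effective_models matches models once the source_group annotation is ignored. The sketch below is an illustration only: the function names are hypothetical, the sample is trimmed from the response above, and the comparison assumes limits come back in the same order in both blocks.

```python
SAMPLE_GROUP = {
    "id": "abc123hash",
    "models": [
        {
            "slug": "your-org/your-model",
            "rate_limits": [
                {"type": "TOKEN", "unit": "MINUTE", "threshold": 1000000},
                {"type": "REQUEST", "unit": "MINUTE", "threshold": 100},
            ],
            "usage_limits": [
                {"type": "TOKEN", "unit": "DAY", "threshold": 10000000},
            ],
        }
    ],
    "effective_models": [
        {
            "slug": "your-org/your-model",
            "rate_limits": [
                {"type": "TOKEN", "unit": "MINUTE", "threshold": 1000000, "source_group": "abc123hash"},
                {"type": "REQUEST", "unit": "MINUTE", "threshold": 100, "source_group": "abc123hash"},
            ],
            "usage_limits": [
                {"type": "TOKEN", "unit": "DAY", "threshold": 10000000, "source_group": "abc123hash"},
            ],
        }
    ],
}


def without_source(limits):
    """Drop the source_group annotation from each limit entry."""
    return [
        {key: value for key, value in limit.items() if key != "source_group"}
        for limit in limits
    ]


def effective_matches_declared(group):
    """Return True if effective_models equals models, ignoring source_group.

    Assumption: limits are listed in the same order in both blocks.
    """
    declared = {model["slug"]: model for model in group["models"]}
    effective = {model["slug"]: model for model in group["effective_models"]}
    if declared.keys() != effective.keys():
        return False
    return all(
        without_source(effective[slug][kind]) == declared[slug][kind]
        for slug in declared
        for kind in ("rate_limits", "usage_limits")
    )


print(effective_matches_declared(SAMPLE_GROUP))
```

For a group with a parent, a False result is not necessarily an error, since inheritance can change the effective limits.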

Step 2: Mint an API key for the group

Issue a new API key under the group with POST /v1/gateway/groups/{group_id}/api_keys. The key inherits the group’s effective model set and limits; you don’t configure either on the key itself.

curl --request POST \
  --url https://api.baseten.co/v1/gateway/groups/abc123hash/api_keys \
  --header "Authorization: Api-Key $BASETEN_API_KEY" \
  --header "Content-Type: application/json" \
  --data '{
    "name": "prod-key-1"
  }'
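The same mint call can be sketched in Python with the standard library. The path and body come from the curl example above; build_mint_key_request is a hypothetical helper name, and percent-encoding the group id is a defensive assumption rather than a documented requirement.

```python
import json
import urllib.parse
import urllib.request

API_ROOT = "https://api.baseten.co/v1/gateway"


def build_mint_key_request(group_id, key_name, api_key):
    """Build (but do not send) POST /v1/gateway/groups/{group_id}/api_keys."""
    # Percent-encode the id so an unexpected character cannot alter the path.
    encoded_id = urllib.parse.quote(group_id, safe="")
    return urllib.request.Request(
        f"{API_ROOT}/groups/{encoded_id}/api_keys",
        data=json.dumps({"name": key_name}).encode("utf-8"),
        headers={
            "Authorization": f"Api-Key {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the result to urllib.request.urlopen sends the request. Note that the body carries only a name, which reflects that models and limits live on the group and not on the key.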

The response contains the plaintext key, returned exactly once:

{
  "api_key": "aBcDeFg.<api-key-secret>",
  "prefix": "aBcDeFg",
  "name": "prod-key-1"
}

This is the only time the key is returned in plaintext. Save it now: Baseten doesn’t store the secret portion and can’t show it to you again. If you lose it, revoke the key and mint a new one.

The string before the . (here, aBcDeFg) is the prefix. You’ll use the prefix, not the full key, when fetching or revoking the key later.
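A small helper can derive the prefix from a full key so that only the prefix ends up in logs or lookup tables. This is a minimal sketch: key_prefix and redact are hypothetical names, and splitting at the first . is an assumption based on the key shape shown above.

```python
def key_prefix(api_key):
    """Return the part of the key before the first '.'."""
    prefix, separator, secret = api_key.partition(".")
    if not separator or not prefix or not secret:
        raise ValueError("expected a key of the form <prefix>.<secret>")
    return prefix


def redact(api_key):
    """Return a log-safe form of the key that keeps only the prefix."""
    return f"{key_prefix(api_key)}.***"


print(redact("aBcDeFg.example-secret"))  # prints aBcDeFg.***
```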

Step 3: Call your model through the gateway

Use the API key from step 2 to call your model. Frontier Gateway is OpenAI-compatible, so the OpenAI SDK works with the gateway base URL. Replace YOUR_API_KEY in the examples below with the value you saved from the mint-key response.

Python

Install the OpenAI SDK:

pip install openai

Make a chat completion request:

chat.py

from openai import OpenAI

client = OpenAI(
    base_url="https://inference.baseten.co/v1",
    api_key="YOUR_API_KEY",
)

response = client.chat.completions.create(
    model="your-org/your-model",
    messages=[{"role": "user", "content": "Hello, world!"}],
)

print(response.choices[0].message.content)

curl

curl --request POST \
  --url https://inference.baseten.co/v1/chat/completions \
  --header "Content-Type: application/json" \
  --header "Authorization: Api-Key YOUR_API_KEY" \
  --data '{
    "model": "your-org/your-model",
    "messages": [
      {"role": "user", "content": "Hello, world!"}
    ]
  }'

The base URL is https://inference.baseten.co/v1 today. Once white-label routing is provisioned for your workspace, the base URL becomes the branded domain you configure with your Baseten team, and your downstream customers call your domain instead.
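Because the base URL is expected to change once white-label routing is in place, one option is to read it from configuration instead of hard-coding it. The sketch below assumes an environment variable named GATEWAY_BASE_URL, which is a hypothetical name chosen for illustration, and falls back to the current default.

```python
import os

DEFAULT_BASE_URL = "https://inference.baseten.co/v1"


def gateway_base_url(env=None):
    """Return the gateway base URL, preferring a configured override.

    GATEWAY_BASE_URL is a hypothetical variable name, not a Baseten convention.
    """
    env = os.environ if env is None else env
    # An unset or empty override falls back to the default.
    url = env.get("GATEWAY_BASE_URL") or DEFAULT_BASE_URL
    return url.rstrip("/")
```

The result can be passed as base_url when constructing the OpenAI client, so moving to a branded domain becomes a configuration change with no code edits.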

Next steps

Manage groups and API keys: Build a multi-level hierarchy, mint and revoke keys, and delete groups.
Rate and usage limits: Tune per-group, per-model thresholds and pick an inheritance mode.
Billing webhooks: Stream signed per-request usage events into your billing pipeline.
