baseten model-api

List and inspect Baseten Model APIs. Authenticate with baseten auth login or the BASETEN_API_KEY environment variable.
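For non-interactive use such as CI, where the environment variable is the natural choice, a script can check for the key before calling any subcommand. This is a minimal sketch; `require_api_key` is a hypothetical helper name, not part of the CLI, and the check is unnecessary when credentials were stored with baseten auth login.

```shell
# Hypothetical guard for scripts that rely on BASETEN_API_KEY rather than a stored login.
require_api_key() {
  if [ -z "${BASETEN_API_KEY:-}" ]; then
    echo "BASETEN_API_KEY is not set; export it or run: baseten auth login" >&2
    return 1
  fi
}
```

Calling `require_api_key || exit 1` at the top of a script stops it before any request is attempted.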

describe

baseten model-api describe [OPTIONS]

Describe a single Model API by name.

Options

-q, --jq

TEXT

Filter JSON output with a jq expression; implies --output json (or jsonl for streamed commands).

--model

TEXT

required

Name of the Model API to describe.

-o, --output

TEXT

default:"text"

Output format. One of: text, json, jsonl, none.

--profile

TEXT

Use a specific stored profile for this command, overriding BASETEN_PROFILE and the current profile

-v, --verbose

BOOL

Enable verbose logging

Examples

Describe a Model API by name

baseten model-api describe --model <name>

Filter output with `--jq`

Print the Model API’s invoke URL

baseten model-api describe --model <name> --jq '.invoke_url'

Output

Text mode (--output text): field-per-line summary of the Model API.

JSON mode (--output json): payload type managementapi.ModelAPI.
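In a script, the describe command and its --jq filter can be wrapped so the invoke URL is captured in a variable. The sketch below assumes the `.invoke_url` field used in the example above; `model_api_invoke_url` is a hypothetical helper name, and whether the filtered value is printed with surrounding JSON quotes is not stated here, so check the output before passing it on.

```shell
# Hypothetical helper: print the invoke URL of the Model API named in $1.
model_api_invoke_url() {
  baseten model-api describe --model "$1" --jq '.invoke_url'
}
```

A caller would then write something like `url=$(model_api_invoke_url <name>)`.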

list

baseten model-api list [OPTIONS]

List the Model APIs the workspace has added. Pass --all to browse the full visible catalog instead of just the added ones.

Options

--all

BOOL

Browse the full visible catalog instead of only the Model APIs the workspace has added.

-q, --jq

TEXT

Filter JSON output with a jq expression; implies --output json (or jsonl for streamed commands).

-o, --output

TEXT

default:"text"

Output format. One of: text, json, jsonl, none.

--profile

TEXT

Use a specific stored profile for this command, overriding BASETEN_PROFILE and the current profile

-v, --verbose

BOOL

Enable verbose logging

Examples

List the Model APIs the workspace has added

baseten model-api list

Browse the full visible catalog

baseten model-api list --all

Filter output with `--jq`

Print just the Model API names

baseten model-api list --jq '.items[].name'

Output

Text mode (--output text): Table with columns: NAME, CONTEXT, $/1M IN, $/1M OUT, ADDED. When no Model APIs match, prints “No Model APIs found.” to stderr.

JSON mode (--output json): payload type cmd.ModelAPIList.
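The list and describe commands combine naturally in a loop. The following is a sketch only: `describe_all_added` is a hypothetical helper, it relies on the `.items[].name` filter from the example above, and it strips surrounding double quotes because it is an assumption whether names are printed raw or as JSON strings.

```shell
# Hypothetical helper: describe every Model API the workspace has added.
describe_all_added() {
  baseten model-api list --jq '.items[].name' |
    while IFS= read -r name; do
      # Drop surrounding double quotes in case names arrive as JSON strings.
      name=${name#\"}
      name=${name%\"}
      [ -n "$name" ] || continue
      # Redirect stdin so the inner command cannot consume the remaining names.
      baseten model-api describe --model "$name" < /dev/null
    done
}
```

When the list is empty the loop body never runs, so the helper prints nothing to stdout.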

predict

baseten model-api predict [OPTIONS]

POST an inference request to a Model API and write the response to stdout.

The request is sent to --url, which defaults to the OpenAI chat-completions endpoint on the shared inference host. Override it for other request shapes (e.g. /v1/messages, /v1/embeddings) or for different hosts.

--content is the simple path: it builds an OpenAI chat-completions body with a single user message and --model as the model, and prints just the assistant’s reply. It is only valid for OpenAI chat URLs and requires --model.

--data and --file send a request body verbatim, so any format the endpoint accepts works (OpenAI, Anthropic, embeddings, custom). The response is written as-is: JSON is pretty-printed; streams and binary bodies are passed through.
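The verbatim path can be sketched with a body generated on the fly and piped to --file - (stdin). The helper names below are hypothetical. The embeddings URL is an assumption formed by combining the default host with the /v1/embeddings path mentioned above, and the {"model","input"} body follows the OpenAI embeddings format, which the endpoint may or may not accept.

```shell
# Build a minimal OpenAI-style chat body. $1 = model name, $2 = user message.
# Values are interpolated without JSON escaping, so this sketch only suits
# text that contains no double quotes or backslashes.
chat_body() {
  printf '{"model":"%s","messages":[{"role":"user","content":"%s"}]}\n' "$1" "$2"
}

# Send the generated body verbatim by piping it to --file - (stdin).
predict_chat() {
  chat_body "$1" "$2" | baseten model-api predict --file -
}

# Same idea against another endpoint shape. The URL and body are assumptions.
predict_embeddings() {
  printf '{"model":"%s","input":"%s"}\n' "$1" "$2" |
    baseten model-api predict --url "https://inference.baseten.co/v1/embeddings" --file -
}
```

Because the body is sent as-is, --model is not passed here; the model name travels inside the JSON.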

Options

--content

TEXT

Single user message; builds an OpenAI chat-completions request and prints the assistant’s reply. Only valid for OpenAI chat URLs and requires --model. Mutually exclusive with other flags in group predict-input.

--data

TEXT

Inline request body, sent verbatim. Mutually exclusive with other flags in group predict-input.

--file

TEXT

Path to a file containing the request body, sent verbatim. Use '-' for stdin. Mutually exclusive with other flags in group predict-input.

-q, --jq

TEXT

Filter JSON output with a jq expression; implies --output json (or jsonl for streamed commands).

--model

TEXT

Name of the Model API. Required with --content, where it sets the request’s model.

-o, --output

TEXT

default:"text"

Output format. One of: text, json, jsonl, none.

--profile

TEXT

Use a specific stored profile for this command, overriding BASETEN_PROFILE and the current profile

--url

TEXT

Endpoint to POST the request to. Defaults to https://inference.baseten.co/v1/chat/completions.

-v, --verbose

BOOL

Enable verbose logging

Examples

Send a single user message

baseten model-api predict --model <name> --content "hello"

Send a full OpenAI-shaped body and stream it as JSONL

baseten model-api predict --model <name> --data '{"model":"<name>","messages":[{"role":"user","content":"hi"}],"stream":true}' --output jsonl

Filter output with `--jq`

Extract the assistant’s message content

baseten model-api predict --model <name> --content "hi" --jq '.choices[0].message.content'

Output

Text mode (--output text): With --content, the assistant message text. With --data/--file, the response body as-is (pretty-printed JSON, or a raw stream/binary body).

JSON mode (--output json): payload type cmd.JSONUndefined. With --content, the full chat-completions response is emitted. With --data/--file, a streamed response becomes one JSON record per chunk under --output jsonl, and a binary body is base64-encoded under a 'body' key.
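Since a streamed response becomes one JSON record per line under --output jsonl, a script can handle chunks as they arrive. This is a minimal sketch: `stream_chat` is a hypothetical helper, the body mirrors the streaming example above, and the structure of each record is not specified here, so the loop only prints each line.

```shell
# Hypothetical helper: stream a chat completion and handle it chunk by chunk.
# $1 = model name, $2 = user message (no JSON escaping is applied).
stream_chat() {
  body=$(printf '{"model":"%s","messages":[{"role":"user","content":"%s"}],"stream":true}' "$1" "$2")
  baseten model-api predict --model "$1" --data "$body" --output jsonl |
    while IFS= read -r record; do
      [ -n "$record" ] || continue      # skip blank lines
      printf 'chunk: %s\n' "$record"    # replace with real per-chunk handling
    done
}
```

The printf line is the place to add real processing, for example appending each record to a log file.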

