Get deployment metrics

curl --request POST \
  --url https://api.baseten.co/v1/loops/deployments/{deployment_id}/metrics \
  --header "Authorization: Bearer $BASETEN_API_KEY" \
  --data '{
    "end_epoch_millis": null,
    "start_epoch_millis": null,
    "step_seconds": null,
    "time_divisor_seconds": null
  }'

{
  "deployment_id": "<string>",
  "metrics": {
    "inference_volume": [
      { "value": 123, "timestamp": "2023-11-07T05:31:56Z" }
    ],
    "concurrent_requests": [
      { "value": 123, "timestamp": "2023-11-07T05:31:56Z" }
    ],
    "response_time_stats": [
      { "timestamp": "2023-11-07T05:31:56Z", "p50": 123, "p95": 123, "p99": 123 }
    ],
    "inference_volume_by_status": [
      { "timestamp": "2023-11-07T05:31:56Z", "status_2xx": 123, "status_4xx": 123, "status_5xx": 123 }
    ],
    "gpu_memory_usage_bytes": {},
    "gpu_utilization": {},
    "cpu_usage": [
      { "value": 123, "timestamp": "2023-11-07T05:31:56Z" }
    ],
    "cpu_memory_usage_bytes": [
      { "value": 123, "timestamp": "2023-11-07T05:31:56Z" }
    ],
    "ephemeral_storage": {
      "usage_bytes": [
        { "value": 123, "timestamp": "2023-11-07T05:31:56Z" }
      ],
      "utilization": [
        { "value": 123, "timestamp": "2023-11-07T05:31:56Z" }
      ]
    },
    "per_node_metrics": [
      {
        "node_id": "<string>",
        "gpu_memory_usage_bytes": {},
        "gpu_utilization": {},
        "cpu_usage": [
          { "value": 123, "timestamp": "2023-11-07T05:31:56Z" }
        ],
        "cpu_memory_usage_bytes": [
          { "value": 123, "timestamp": "2023-11-07T05:31:56Z" }
        ],
        "ephemeral_storage": {
          "usage_bytes": [
            { "value": 123, "timestamp": "2023-11-07T05:31:56Z" }
          ],
          "utilization": [
            { "value": 123, "timestamp": "2023-11-07T05:31:56Z" }
          ]
        }
      }
    ]
  }
}

Authorizations

Authorization · string · header · required

Pass your Baseten API key. Clients automatically send Authorization: Bearer <key>. Direct callers can also use Authorization: Api-Key <key>; both schemes are accepted.
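
As a minimal sketch of the two accepted header forms described above: the helper name `auth_header` and its `scheme` parameter are illustrative assumptions, not part of any Baseten client.

```python
import os


def auth_header(api_key: str, scheme: str = "Bearer") -> dict[str, str]:
    """Build the Authorization header for either accepted scheme.

    `auth_header` is a hypothetical helper written for this example.
    """
    if scheme not in ("Bearer", "Api-Key"):
        raise ValueError(f"unsupported scheme: {scheme!r}")
    return {"Authorization": f"{scheme} {api_key}"}


# Reads the key from the same environment variable the curl example uses,
# falling back to a placeholder when it is not set.
api_key = os.environ.get("BASETEN_API_KEY", "<key>")
print(auth_header(api_key))
print(auth_header(api_key, scheme="Api-Key"))
```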

Path Parameters

deployment_id · string · required

Body

application/json

Time-range request for trainer deployment metrics.

end_epoch_millis · integer | null

End of the time range to fetch metrics for, as a Unix epoch timestamp in milliseconds.

start_epoch_millis · integer | null

Start of the time range to fetch metrics for, as a Unix epoch timestamp in milliseconds.

step_seconds · integer | null

Resolution of the returned series, in seconds. When omitted, a step is derived from the time range so large windows return fewer points.

time_divisor_seconds · integer | null

Unit of time for request-volume metrics, in seconds (e.g. 60 for requests/minute). Defaults to per-second.
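
The body fields above can be combined into a request for the last hour of metrics, sketched here with the Python standard library. The function name `build_metrics_request`, the deployment ID `abc123`, and the `Content-Type` header are assumptions made for illustration; the request is built but not sent.

```python
import json
import time
import urllib.request

BASE_URL = "https://api.baseten.co/v1/loops/deployments"


def build_metrics_request(deployment_id, api_key, window_seconds=3600,
                          step_seconds=60, time_divisor_seconds=60, now_ms=None):
    """Return an unsent POST request covering the last `window_seconds`.

    Hypothetical helper; `now_ms` lets callers pin the end of the window.
    """
    end_ms = int(time.time() * 1000) if now_ms is None else now_ms
    body = {
        "start_epoch_millis": end_ms - window_seconds * 1000,
        "end_epoch_millis": end_ms,
        "step_seconds": step_seconds,                  # one point per 60 s
        "time_divisor_seconds": time_divisor_seconds,  # volume as requests/minute
    }
    return urllib.request.Request(
        url=f"{BASE_URL}/{deployment_id}/metrics",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            # Assumed: the body is documented as application/json.
            "Content-Type": "application/json",
        },
        method="POST",
    )


# "abc123" is a placeholder deployment ID.
req = build_metrics_request("abc123", "<key>", now_ms=1_700_000_000_000)
print(req.full_url)
print(req.data.decode("utf-8"))
```

To send the request, pass `req` to `urllib.request.urlopen` and decode the JSON response. Leaving `step_seconds` out of the body instead lets the server derive a step from the time range.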

Response

200 - application/json

Response for POST /v1/loops/deployments/{deployment_id}/metrics.

deployment_id · string · required

The trainer deployment ID.

metrics · LoopsDeploymentMetricsV1 · object · required

Metrics for the deployment.
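
One way to read the `metrics` object is sketched below, using a trimmed copy of the response example with its placeholder values. The helper names `latest` and `summarize` are illustrative, and the sketch assumes every timestamp is a UTC ISO 8601 string in the same format, so that string comparison orders points chronologically.

```python
import json

# Trimmed sample shaped like the response example; values are placeholders.
sample = json.loads("""
{
  "deployment_id": "<string>",
  "metrics": {
    "inference_volume": [
      {"value": 123, "timestamp": "2023-11-07T05:31:56Z"}
    ],
    "response_time_stats": [
      {"timestamp": "2023-11-07T05:31:56Z", "p50": 123, "p95": 123, "p99": 123}
    ],
    "per_node_metrics": [
      {"node_id": "<string>",
       "cpu_usage": [{"value": 123, "timestamp": "2023-11-07T05:31:56Z"}]}
    ]
  }
}
""")


def latest(series):
    """Return the point with the greatest timestamp, or None if empty."""
    return max(series, key=lambda p: p["timestamp"]) if series else None


def summarize(response):
    """Pull a few headline values out of a metrics response."""
    metrics = response["metrics"]
    latest_stats = latest(metrics.get("response_time_stats", [])) or {}
    return {
        "latest_volume": latest(metrics.get("inference_volume", [])),
        "latest_p95": latest_stats.get("p95"),
        "node_ids": [n["node_id"] for n in metrics.get("per_node_metrics", [])],
    }


summary = summarize(sample)
print(summary)
```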
