The truss train command provides subcommands for managing the full training job lifecycle.
truss train [COMMAND] [OPTIONS]

Universal options

The following options are available for all truss train commands:
  • --help: Show help message and exit.
  • --non-interactive: Disable interactive prompts (for CI/automated environments).
  • --remote TEXT: Name of the remote in .trussrc.
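As a rough sketch of how the universal options might be combined in an automated environment, the helper below wraps a job submission with --non-interactive and --remote. The function name ci_push and the remote name in the usage comment are hypothetical; only the flags themselves come from the list above.

```shell
# Hypothetical CI helper: submit a training job without interactive prompts.
# $1 = path to the training config, $2 = remote name defined in .trussrc.
ci_push() {
  truss train push "$1" --non-interactive --remote "$2"
}

# Usage (assumes a remote called "baseten" exists in your .trussrc):
#   ci_push config.py baseten
```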

init

Initialize a training project from templates or create an empty project.
truss train init [OPTIONS]

Options

--examples
string
Template name or comma-separated list of templates to initialize. See the ML Cookbook for available examples.
--target-directory
string
Directory to initialize the project in. Defaults to current directory.
--list-examples
List all available example templates.

Examples

Initialize a project from a template:
truss train init --examples qwen3-8b-lora-dpo-trl
Initialize multiple templates:
truss train init --examples qwen3-8b-lora-dpo-trl,qwen3-8b-lora-verl
List available templates:
truss train init --list-examples
Create an empty training project:
truss train init

push

Submit and run a training job.
truss train push [OPTIONS] CONFIG

Arguments

CONFIG
string
required
Path to the training configuration file (e.g., config.py).

Options

--tail
Stream status and logs after submitting the job.
--job-name
string
Name for the training job.
--team
string
Team name for the training project. If not specified, Truss infers the team or prompts for selection.
The --team flag is only available if your organization has teams enabled. Contact us to enable teams, or see Teams for more information.
--interactive
string
Enable an interactive rSSH session on the training job. Options: on_startup, on_failure, on_demand.
--interactive-timeout-minutes
integer
Session timeout in minutes. Defaults to 480 (8 hours).
--entrypoint
string
Override the training job’s entrypoint command. Use "bash" with --interactive for a clean container to experiment in before running anything.
--accelerator
string
GPU type and count in TYPE:COUNT format (e.g., H200:8).
--node-count
integer
Number of compute nodes for the training job.

Examples

Submit a training job:
truss train push config.py
Submit and stream logs:
truss train push config.py --tail
Submit to a specific team:
truss train push config.py --team my-team-name
Submit with a custom job name:
truss train push config.py --job-name fine-tune-v1
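The interactive and hardware options above have no examples of their own, so here is a minimal sketch of how they could be used. The helper names debug_push and scaled_push are hypothetical, and the sketch assumes the flags can be combined as shown; the flag names and values are taken from the option list above.

```shell
# Hypothetical helper: start a job with an interactive session and a plain
# shell as the entrypoint, so nothing runs until you connect.
# $1 = config path, $2 = session timeout in minutes (optional, default 480).
debug_push() {
  truss train push "$1" \
    --interactive on_startup \
    --interactive-timeout-minutes "${2:-480}" \
    --entrypoint "bash"
}

# Hypothetical helper: request specific hardware at submit time.
# $1 = config path, $2 = accelerator as TYPE:COUNT, $3 = node count.
scaled_push() {
  truss train push "$1" --accelerator "$2" --node-count "$3"
}

# Usage:
#   debug_push config.py 120
#   scaled_push config.py H200:8 2
```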

logs

Fetch and stream logs from a training job.
truss train logs [OPTIONS]

Options

--job-id
string
Job ID to fetch logs from.
--project
string
Project name or project ID.
--project-id
string
Project ID.
--tail
Continuously stream new logs.

Examples

Stream logs for a specific job:
truss train logs --job-id abc123 --tail
View logs for a job without streaming:
truss train logs --job-id abc123

metrics

View real-time metrics for a training job, including CPU, GPU, and storage usage.
truss train metrics [OPTIONS]

Options

--job-id
string
Job ID to fetch metrics from.
--project
string
Project name or project ID.
--project-id
string
Project ID.

Examples

View metrics for a specific job:
truss train metrics --job-id abc123

view

List training projects and jobs, or view details for a specific job.
truss train view [OPTIONS]

Options

--job-id
string
View details for a specific training job.
--project
string
View jobs for a specific project (name or ID).
--project-id
string
View jobs for a specific project ID.

Examples

List all training projects:
truss train view
View jobs in a specific project:
truss train view --project my-project
View details for a specific job:
truss train view --job-id abc123

stop

Stop a running training job.
truss train stop [OPTIONS]

Options

--job-id
string
Job ID to stop.
--project
string
Project name or project ID.
--project-id
string
Project ID.
--all
Stop all running jobs. Prompts for confirmation.

Examples

Stop a specific job:
truss train stop --job-id abc123
Stop all running jobs:
truss train stop --all

recreate

Recreate an existing training job with the same configuration.
truss train recreate [OPTIONS]

Options

--job-id
string
Job ID of the training job to recreate. If not provided, defaults to the last created job.
--tail
Stream status and logs after recreating the job.

Examples

Recreate a specific job:
truss train recreate --job-id abc123
Recreate and stream logs:
truss train recreate --job-id abc123 --tail

download

Download training job artifacts to your local machine.
truss train download [OPTIONS]

Options

--job-id
string
required
Job ID to download artifacts from.
--target-directory
path
Directory to download files to. Defaults to current directory.
--no-unzip
Keep the compressed archive without extracting.

Examples

Download artifacts to current directory:
truss train download --job-id abc123
Download to a specific directory:
truss train download --job-id abc123 --target-directory ./downloads
Download without extracting:
truss train download --job-id abc123 --no-unzip
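When downloading artifacts from several jobs, it can help to keep each job in its own folder. The sketch below shows one way to do that; the helper name download_job and the ./artifacts default are hypothetical choices, not part of the CLI.

```shell
# Hypothetical helper: download each job's artifacts into its own folder.
# $1 = job ID, $2 = base directory (optional, defaults to ./artifacts).
download_job() {
  dest="${2:-./artifacts}/$1"
  mkdir -p "$dest" || return 1
  truss train download --job-id "$1" --target-directory "$dest"
}

# Usage:
#   download_job abc123
```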

deploy_checkpoints

Deploy a trained model checkpoint to Baseten’s inference platform.
truss train deploy_checkpoints [OPTIONS]

Options

--job-id
string
Job ID containing the checkpoints to deploy.
--project
string
Project name or project ID.
--project-id
string
Project ID.
--config
string
Path to a Python file defining a DeployCheckpointsConfig.
--dry-run
Generate a Truss config without deploying. Useful for previewing the deployment configuration.
--truss-config-output-dir
string
Path to output the generated Truss config. Defaults to truss_configs/<model_version_name>_<model_version_id>.

Examples

Deploy checkpoints interactively:
truss train deploy_checkpoints
Deploy checkpoints from a specific job:
truss train deploy_checkpoints --job-id abc123
Preview deployment without deploying:
truss train deploy_checkpoints --job-id abc123 --dry-run
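To keep a previewed configuration somewhere predictable, --dry-run can presumably be paired with --truss-config-output-dir. The following is a sketch under that assumption; the helper name preview_deploy and the output path in the usage comment are hypothetical.

```shell
# Hypothetical helper: generate the Truss config for a job's checkpoints
# without deploying, writing it to a directory you choose.
# $1 = job ID, $2 = output directory for the generated config.
preview_deploy() {
  truss train deploy_checkpoints \
    --job-id "$1" \
    --dry-run \
    --truss-config-output-dir "$2"
}

# Usage:
#   preview_deploy abc123 ./truss_configs/preview
```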

get_checkpoint_urls

Get presigned URLs for checkpoint artifacts.
truss train get_checkpoint_urls [OPTIONS]

Options

--job-id
string
Job ID containing the checkpoints.

Examples

Get checkpoint URLs for a job:
truss train get_checkpoint_urls --job-id abc123

checkpoints list

List and interactively explore checkpoints for a training job.
truss train checkpoints list [OPTIONS]

Options

--job-id
string
Job ID to list checkpoints for. If omitted, defaults to the most recently created job. If multiple jobs exist and no --project-id or --project is provided, defaults to the most recently created job across all projects and prints its ID as a warning.
--project
string
Project name or project ID.
--project-id
string
Project ID.
--checkpoint-name
string
Jump directly into a specific checkpoint’s file explorer.
--sort
string
Sort checkpoints by column. Options: checkpoint-id, size, created, type. Defaults to created.
--order
string
Sort order: asc (ascending) or desc (descending). Defaults to asc.
--output-format, -o
string
Output format: cli-table (default, interactive), csv, or json.

Interactive mode

When using the default cli-table format in an interactive terminal, the command launches a checkpoint explorer:
  1. Checkpoint picker: fuzzy-search and select a checkpoint from the list.
  2. File explorer: navigate the checkpoint’s directory tree. Press Enter to open a directory or view a file, and use the explorer's back key to return to the parent directory. Press Ctrl-C to quit.
For .safetensors files, the explorer displays a tensor summary (layer names, dtypes, shapes, and parameter counts) instead of raw binary content. Text files display with syntax highlighting based on their file extension (for example, .json, .py, .yaml, .toml), falling back to plain text for unrecognized types.

Examples

List checkpoints for the most recent job:
truss train checkpoints list
List checkpoints for a specific job:
truss train checkpoints list --job-id abc123
Jump directly into a checkpoint’s files:
truss train checkpoints list --job-id abc123 --checkpoint-name ckpt-001
Export checkpoint list as JSON:
truss train checkpoints list --job-id abc123 --output-format json
Sort by size descending:
truss train checkpoints list --job-id abc123 --sort size --order desc
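For scripting, the csv and json formats are likely more convenient than the interactive table. The sketch below saves a checkpoint list to a file, newest first. It assumes the non-interactive formats are written to standard output, and it makes no claim about the fields in that output; the helper name export_checkpoints is hypothetical.

```shell
# Hypothetical helper: save a job's checkpoint list to a file.
# $1 = job ID, $2 = format (csv or json), $3 = output file.
export_checkpoints() {
  # Reject formats other than the two non-interactive ones.
  case "$2" in
    csv|json) ;;
    *) echo "format must be csv or json" >&2; return 2 ;;
  esac
  truss train checkpoints list \
    --job-id "$1" \
    --sort created --order desc \
    --output-format "$2" > "$3"
}

# Usage:
#   export_checkpoints abc123 json checkpoints.json
```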

cache summarize

View a summary of the training cache for a project.
truss train cache summarize [OPTIONS] PROJECT

Arguments

PROJECT
string
required
Project name or project ID.

Options

--sort
string
Sort files by column. Options: filepath, size, modified, type, permissions.
--order
string
Sort order: asc (ascending) or desc (descending).
--output-format, -o
string
Output format: cli-table (default), csv, or json.

Examples

View cache summary:
truss train cache summarize my-project
Sort by size descending:
truss train cache summarize my-project --sort size --order desc
Export as JSON:
truss train cache summarize my-project --output-format json

isession

View interactive session details for a training job, including auth codes and connection status. The command can also update the session configuration.
truss train isession [OPTIONS]

Options

--job-id
string
required
Job ID to view interactive session details for.
--update-timeout
integer
Minutes to extend the session timeout by.
--update-trigger
string
Change the session trigger. Options: on_startup, on_failure, on_demand. Cannot be changed on on_startup sessions.
--format
string
Output format: table (default) or json.

Examples

View session details for a job:
truss train isession --job-id abc123
Extend session timeout:
truss train isession --job-id abc123 --update-timeout 60
Output as JSON:
truss train isession --job-id abc123 --format json
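The --update-trigger option is not covered by the examples above. A minimal sketch of using it, with a local check that the value is one of the three documented triggers, might look like this. The helper name set_isession_trigger is hypothetical, and the check does not know whether the session is an on_startup session, which the CLI will not let you change.

```shell
# Hypothetical helper: change the trigger of a job's interactive session.
# $1 = job ID, $2 = new trigger.
set_isession_trigger() {
  # Only the three documented trigger values are passed through.
  case "$2" in
    on_startup|on_failure|on_demand) ;;
    *) echo "trigger must be on_startup, on_failure, or on_demand" >&2; return 2 ;;
  esac
  truss train isession --job-id "$1" --update-trigger "$2"
}

# Usage:
#   set_isession_trigger abc123 on_demand
```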

update_session

Update the interactive session configuration on a running training job. At least one of --trigger or --timeout-minutes must be provided.
truss train update_session [OPTIONS] JOB_ID

Arguments

JOB_ID
string
required
Job ID of the training job to update.

Options

--trigger
string
New trigger mode for the session. Options: on_startup, on_failure, on_demand.
--timeout-minutes
integer
Number of minutes before the interactive session times out.

Examples

Change the session trigger:
truss train update_session abc123 --trigger on_startup
Update the session timeout:
truss train update_session abc123 --timeout-minutes 120
truss train update_session requires API support that may not be available in all environments. If you receive a 404 error, set the trigger mode at push time using --interactive on_startup or --interactive on_failure instead.
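One way to handle the caveat above in a script is to attempt the update and print a reminder about the push-time alternative if it fails. This is only a sketch: the helper name try_update_trigger is hypothetical, config.py in the hint is a placeholder, and the function treats any non-zero exit the same way rather than checking that the failure was actually a 404.

```shell
# Hypothetical helper: try to change the trigger on a running job and print a
# hint if the command fails. This sketch does not inspect the error, so the
# hint appears for any failure, not only a 404.
# $1 = job ID, $2 = new trigger.
try_update_trigger() {
  if truss train update_session "$1" --trigger "$2"; then
    return 0
  fi
  echo "update_session failed; if the error was a 404, resubmit with:" >&2
  echo "  truss train push config.py --interactive $2" >&2
  return 1
}

# Usage:
#   try_update_trigger abc123 on_failure
```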

Ignore files and folders

Create a .truss_ignore file in your project root to exclude files from upload. The file uses .gitignore syntax.
.truss_ignore
# Python cache files
__pycache__/
*.pyc
*.pyo
*.pyd

# Type checking
.mypy_cache/

# Testing
.pytest_cache/

# Large data files
data/
*.bin
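If you set up projects often, a small script can write a starter .truss_ignore for you. The sketch below creates the file with a subset of the patterns shown above and refuses to overwrite an existing one; the helper name write_truss_ignore is hypothetical.

```shell
# Hypothetical helper: create a starter .truss_ignore in a project directory,
# leaving any existing file untouched.
# $1 = project root (optional, defaults to the current directory).
write_truss_ignore() {
  target="${1:-.}/.truss_ignore"
  if [ -e "$target" ]; then
    echo "$target already exists; not overwriting" >&2
    return 1
  fi
  cat > "$target" <<'EOF'
# Python cache files
__pycache__/
*.pyc

# Large data files
data/
*.bin
EOF
}

# Usage:
#   write_truss_ignore ./my-training-project
```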