Loops is in early access. Fill out the signup form to request access for your workspace.
By the end of this page you’ll have a checkpoint stored in Baseten that you can list, download, or hand to an inference deployment. The base model throughout is Qwen/Qwen3.5-4B, one of the supported base models. Before you start, install Python 3.13+ and uv, then export two environment variables: BASETEN_API_KEY (a workspace key with org access to Loops) and LOOPS_PROJECT_ID (the ID of the training project you’re targeting).
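Before provisioning anything, it can help to confirm that both variables are actually set in the shell you'll run from. The check below is a minimal sketch, not part of the Loops SDK: missing_env_vars is a hypothetical helper name, and it only tests that the variables are non-empty, not that the key is valid.

```python
import os

# The two variables this quickstart expects to find in the environment.
REQUIRED_VARS = ("BASETEN_API_KEY", "LOOPS_PROJECT_ID")


def missing_env_vars(env=None, required=REQUIRED_VARS):
    """Return the required variable names that are unset or empty.

    Hypothetical helper for illustration; pass a dict to check something
    other than the real process environment.
    """
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]


# Example with an explicit mapping instead of the real environment:
print(missing_env_vars({"BASETEN_API_KEY": "example-key"}))
# ['LOOPS_PROJECT_ID']
```

Calling missing_env_vars() with no arguments checks the real environment; an empty list means both variables are present.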

Install

The main client package is baseten-loops on PyPI. The Tinker compatibility shim ships as the [tinker] extra (distributed as baseten-loops-tinker) and re-exports the public API under the tinker namespace, so existing import tinker scripts run unchanged. Install both with:
uv add 'baseten-loops[tinker]'
Verify the install:
import tinker
print(tinker.ServiceClient)
You should see:
<class 'tinker._service_client.ServiceClient'>

Provision a trainer

A Loops session pairs a trainer server (forward, backward, and optimizer steps) with a sampling server (generates from current weights). Constructing a ServiceClient and calling create_lora_training_client provisions both in one shot and returns a TrainingClient you can drive directly. The call returns as soon as the trainer is allocated; the trainer URL becomes responsive after a cold start of about five minutes, and subsequent SDK calls block until it’s ready. Start train_loops.py with the provision step:
import os
import tinker

PROJECT_ID = os.environ["LOOPS_PROJECT_ID"]
BASE_MODEL = "Qwen/Qwen3.5-4B"

service_client = tinker.ServiceClient(PROJECT_ID)
training_client = service_client.create_lora_training_client(
    base_model=BASE_MODEL,
    rank=16,
)

print(f"session_id={service_client.session_id}")
print(f"trainer_server_id={training_client.trainer_server_id}")
You’ll append the training and listing steps to this same file in the next two sections, then run the whole thing once at the end.

Run a training round trip

The smallest complete round trip is one forward pass, one backward pass, one optimizer step, and one weight save. The block below mirrors the canonical SFT example: it tokenizes a prompt-and-answer pair, masks the prompt positions from the loss, runs the round trip, and saves a named checkpoint. Append to train_loops.py:
def build_sft_datum(tokenizer, prompt, answer):
    p = tokenizer.encode(prompt, add_special_tokens=False)
    a = tokenizer.encode(answer, add_special_tokens=False)
    tokens = p + a
    targets = [-100] * len(p) + list(a)  # mask prompt, keep answer
    return tokens, targets

tokens, targets = build_sft_datum(
    training_client.get_tokenizer(),
    prompt="What is the capital of France?\nAnswer:",
    answer=" Paris",
)
datum = tinker.Datum(
    model_input=tinker.ModelInput.from_ints(tokens),
    loss_fn_inputs={
        "target_tokens": tinker.TensorData(
            data=targets, dtype="int64", shape=[len(targets)]
        )
    },
)

fb = training_client.forward_backward(data=[datum]).result(timeout=600.0)
print(f"loss={fb.loss:.6f}")

optim = training_client.optim_step(
    tinker.AdamParams(learning_rate=4e-5)
).result(timeout=600.0)
print(f"optim_metrics={optim.metrics}")

sampling_client = training_client.save_weights_and_get_sampling_client(
    name="step-1"
).result(timeout=600.0)
forward_backward is the first call that hits the trainer URL, so this is where you’ll wait through cold start the first time you run the script. When save_weights_and_get_sampling_client returns, the weights are committed as a named checkpoint and the sampling server is loaded with the new version.
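To see what the prompt masking in build_sft_datum produces without waiting on a trainer, you can run the same function against a stand-in tokenizer. This is a sketch for inspection only: ToyTokenizer is a hypothetical class that maps each character to its code point, so the token IDs bear no relation to the real Qwen vocabulary.

```python
class ToyTokenizer:
    """Hypothetical stand-in tokenizer: one token per character (its code point)."""

    def encode(self, text, add_special_tokens=False):
        return [ord(ch) for ch in text]


def build_sft_datum(tokenizer, prompt, answer):
    # Same logic as the quickstart function, repeated so this block runs alone.
    p = tokenizer.encode(prompt, add_special_tokens=False)
    a = tokenizer.encode(answer, add_special_tokens=False)
    tokens = p + a
    targets = [-100] * len(p) + list(a)  # mask prompt, keep answer
    return tokens, targets


tokens, targets = build_sft_datum(ToyTokenizer(), prompt="Q:", answer=" A")
print(tokens)   # [81, 58, 32, 65]
print(targets)  # [-100, -100, 32, 65]
```

The two lists always have the same length, and every prompt position in targets holds -100, so only the answer tokens contribute to the loss.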

List checkpoints

Every save_weights_and_get_sampling_client call creates a checkpoint. The TrainingClient is already bound to your trainer, so listing is one line. Append to train_loops.py:
for ckpt in training_client.list_checkpoints():
    print(ckpt.checkpoint_id, ckpt.created_at)
Now run the full script:
uv run python train_loops.py
The same listing is available from the HTTP API for scripts and CI pipelines that don’t run Python. Use the trainer_server_id your script printed when provisioning:
curl --request GET \
  --url "https://api.baseten.co/v1/loops/checkpoints?run_id=<trainer_server_id>" \
  --header "Authorization: Api-Key $BASETEN_API_KEY"
The HTTP API calls trainer servers “runs”, so the query parameter is run_id, which is the same value the SDK exposes as trainer_server_id. The response is {"checkpoints": [...]}. Each checkpoint carries an id (the string you passed to save_weights_and_get_sampling_client), a run_id, a created_at timestamp, and the base model, size, and adapter config.
To fetch the actual weight files, pass that id to training_client.get_checkpoint_archive_url(checkpoint_id). From a separate Python session where training_client isn’t in scope, use ServiceClient.get_checkpoint_archive_url(trainer_server_id, checkpoint_id) instead.
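If a script has Python available but not the SDK, the listing request above can be built with the standard library alone. The sketch below assumes only what this page states: the endpoint URL, the run_id query parameter, the Api-Key authorization header, and the {"checkpoints": [...]} response shape. The function names and the sample response body are hypothetical, and the request is constructed here but not sent.

```python
import json
import urllib.parse
import urllib.request

API_URL = "https://api.baseten.co/v1/loops/checkpoints"


def build_list_request(run_id, api_key):
    """Build (but do not send) the GET request shown in the curl example."""
    query = urllib.parse.urlencode({"run_id": run_id})
    return urllib.request.Request(
        f"{API_URL}?{query}",
        headers={"Authorization": f"Api-Key {api_key}"},
        method="GET",
    )


def checkpoint_ids(response_body):
    """Pull the id of each checkpoint out of a JSON response body."""
    payload = json.loads(response_body)
    return [ckpt["id"] for ckpt in payload["checkpoints"]]


# Hypothetical response body, trimmed to the fields used here.
sample = '{"checkpoints": [{"id": "step-1", "run_id": "example-run"}]}'
print(checkpoint_ids(sample))  # ['step-1']
```

To send the request, pass the result of build_list_request to urllib.request.urlopen and feed the decoded body to checkpoint_ids.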

Next steps

The Loops concepts page explains the paired-process model in detail: how sessions own trainer and sampling servers, how weight sync works, and how checkpoints land as unzipped folders of paginated presigned URLs rather than single archives. Reading it will make the resource IDs in this quickstart feel less arbitrary.
If you’re migrating from Tinker, the Tinker compatibility page documents what carries over exactly (forward, backward, optim step, sampling, data types) and what behaves differently (checkpoint layout, authentication, cluster routing). The import tinker path used here already covers most cookbook recipes; that page names the three places where behavior has changed.
When you’re ready to call the HTTP API directly (for scripting deployments, fetching checkpoint files programmatically, or integrating Loops into a CI pipeline), the Loops API overview covers each route’s path, request body, response shape, and authentication scope in one place.