Each row below is a Hugging Face repo ID you can pass as `base_model` when starting a Loops run. The table also lists the inference, trainer, and LoRA dtypes Baseten provisions for the model, plus the maximum supported sequence length. Baseten adds rows as new models are validated end to end.
Models
| Model | Max sequence length |
|---|---|
| Qwen/Qwen3.6-35B-A3B | 131,072 |
| Qwen/Qwen3.6-27B | 131,072 |
| Qwen/Qwen3.5-9B | 131,072 |
| Qwen/Qwen3.5-4B | 131,072 |
| Qwen/Qwen3.5-2B | 131,072 |
| Qwen/Qwen3.5-0.8B | 131,072 |
| Qwen/Qwen3.5-122B-A10B | Contact support |
| Qwen/Qwen3.5-397B-A17B | Contact support |
| moonshotai/Kimi-K2.6 | Contact support |
| Qwen/Qwen3-30B-Instruct-2507 | 131,072 |
| deepseek-ai/DeepSeek-V4-Pro | Contact support |
| deepseek-ai/DeepSeek-V4-Flash | Contact support |
| zai-org/GLM-5.1 | Contact support |
| MiniMaxAI/MiniMax-M2.7 | Contact support |
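
Because the repo ID has to match a table row exactly, a small client-side lookup can catch typos before a run starts. The following is a minimal sketch, not part of any Baseten SDK: the names `MAX_SEQ_LEN` and `max_sequence_length` are hypothetical, and the dictionary is a hand-copied snapshot of the table above that goes stale as rows are added.

```python
from typing import Optional

# Hypothetical snapshot of the table above.
# None marks rows whose limit is listed as "Contact support".
MAX_SEQ_LEN: dict[str, Optional[int]] = {
    "Qwen/Qwen3.6-35B-A3B": 131_072,
    "Qwen/Qwen3.6-27B": 131_072,
    "Qwen/Qwen3.5-9B": 131_072,
    "Qwen/Qwen3.5-4B": 131_072,
    "Qwen/Qwen3.5-2B": 131_072,
    "Qwen/Qwen3.5-0.8B": 131_072,
    "Qwen/Qwen3.5-122B-A10B": None,
    "Qwen/Qwen3.5-397B-A17B": None,
    "moonshotai/Kimi-K2.6": None,
    "Qwen/Qwen3-30B-Instruct-2507": 131_072,
    "deepseek-ai/DeepSeek-V4-Pro": None,
    "deepseek-ai/DeepSeek-V4-Flash": None,
    "zai-org/GLM-5.1": None,
    "MiniMaxAI/MiniMax-M2.7": None,
}


def max_sequence_length(base_model: str) -> Optional[int]:
    """Return the listed limit, or None if the row says "Contact support".

    Raises ValueError when the ID is not in the snapshot. The match is
    case-sensitive because the repo ID must be passed verbatim.
    """
    if base_model not in MAX_SEQ_LEN:
        raise ValueError(f"{base_model!r} is not in the supported-models snapshot")
    return MAX_SEQ_LEN[base_model]


limit = max_sequence_length("Qwen/Qwen3.5-9B")
print(limit)
```

A `None` result means the limit is not published in the table, so the value has to come from Baseten support rather than from this lookup.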
Dtypes
The trainer dtype is the precision used for forward, backward, and optimizer steps. The LoRA dtype is the precision of the adapter weights. The inference dtype is the precision the paired sampling server uses to serve checkpoints.
Pass a model to Loops
Pass the table value verbatim as `base_model` through any of the following entry points:
- The Python SDK, via `tinker.ServiceClient.create_lora_training_client(base_model=...)`. See the Loops quickstart.
- The HTTP API, via `POST /v1/loops/runs`.
- The CLI, via `truss loops push <base_model>`, which provisions a session, run, and paired sampler in one call.
Replace `sess_xyz789` with the `session.id` returned by `POST /v1/loops/sessions`:
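
As a rough illustration of the HTTP entry point, the sketch below builds a `POST /v1/loops/runs` request without sending it. Only the endpoint path, the `sess_xyz789` placeholder, and the `base_model` name come from this page. The host `api.example.com`, the `session_id` field name, the overall JSON body shape, and the `Api-Key` authorization scheme are assumptions for illustration; consult the Loops API reference for the actual schema.

```python
import json
import urllib.request

# Hypothetical host; substitute the real Loops API base URL.
API_BASE = "https://api.example.com"


def build_run_request(session_id: str, base_model: str, api_key: str) -> urllib.request.Request:
    """Build, but do not send, a POST /v1/loops/runs request."""
    # Assumed body shape: the field names are illustrative, not confirmed.
    payload = {"session_id": session_id, "base_model": base_model}
    return urllib.request.Request(
        url=f"{API_BASE}/v1/loops/runs",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Api-Key {api_key}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
        method="POST",
    )


# sess_xyz789 stands in for the session.id returned by POST /v1/loops/sessions.
req = build_run_request("sess_xyz789", "Qwen/Qwen3.5-9B", api_key="YOUR_API_KEY")
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

Sending the request would be a matter of passing `req` to `urllib.request.urlopen`, which is left out here so the sketch runs without network access or credentials.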