Fine-tuning overview
Fine-tuning lets you train a model on your data using platform-supported base models. You submit a job (dataset + config), the platform runs training, and you can deploy the resulting model for inference.Supported models
Only models that are marked as fine-tunable in the platform catalog can be used as base models. The dashboard and the models API list which models support fine-tuning. Use one of those model IDs when starting a job; others will be rejected. Models are described generically in the catalog (e.g. by size and capability). Check the dashboard or API for the current list of fine-tunable base models.How it works
- Dataset — You provide training data in a supported format (e.g. JSONL). See Dataset format.
- Submit job — You call the training start endpoint with base model, dataset reference, and hyperparameters (e.g. epochs, batch size, learning rate).
- Execution — The platform runs training on its infrastructure. You do not manage GPUs or nodes.
- Monitor — You poll the job status endpoint or use webhooks to get progress and metrics (e.g. loss, steps).
- Deploy — When the job completes, you can deploy the trained model for inference via the usual inference endpoints.
Training workflow
| Step | Description |
|---|---|
| Prepare data | Format your dataset (e.g. JSONL with messages or prompt/completion pairs). |
| Upload or reference | Upload via the dashboard or provide a URL/path the platform can access. |
| Configure job | Choose base model, training method (e.g. SFT), epochs, batch size, learning rate. |
| Start job | POST to the training start endpoint; you receive a job ID. |
| Monitor | GET job status; optionally register a webhook for training.completed / training.failed. |
| Deploy | Use the trained model ID (or deployment flow) for inference. |
API endpoints (conceptual)
- Start training —
POST /api/fine-tune/training/startor equivalent (see API reference). Body: base model, dataset, hyperparameters, optional webhook URL. - Get job status —
GET /api/training/{jobId}. Response: status, progress, metrics, error if failed. - Cancel —
POST /api/training/{jobId}/cancel(if supported).
Limitations
- Only platform-supported base models can be fine-tuned.
- Dataset must conform to the required format and size limits.
- Training runs on platform infrastructure; you cannot bring your own cluster.
- Cost is based on usage (e.g. tokens or job type); balance must be sufficient.