Fine-tuning workflow

1. Prepare the dataset

  • Format data as required (e.g. JSONL); see Dataset format and the sketch after this list.
  • Upload it via the dashboard, or host it at a URL the platform can access.
  • Note the dataset ID or URL you will pass to the start endpoint.
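
For illustration, here is a minimal sketch of writing a chat-style SFT dataset to JSONL with Python's standard library. The record schema shown (a `messages` list with `role`/`content` fields) is an assumption, not a confirmed format; check Dataset format for the schema your platform actually expects.

```python
import json

# Hypothetical chat-style records; the real schema is defined in Dataset format.
examples = [
    {"messages": [
        {"role": "user", "content": "What is JSONL?"},
        {"role": "assistant", "content": "JSON Lines: one JSON object per line."},
    ]},
    {"messages": [
        {"role": "user", "content": "Name one use of fine-tuning."},
        {"role": "assistant", "content": "Adapting a base model to a domain."},
    ]},
]

# JSONL means one compact JSON object per line, separated by newlines.
with open("train.jsonl", "w", encoding="utf-8") as f:
    for record in examples:
        f.write(json.dumps(record, ensure_ascii=False) + "\n")
```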

2. Configure and start the job

Send a POST request to the training start endpoint with at least:
  • Base model — A supported model ID from the catalog.
  • Dataset — Reference (ID or URL) to your training data.
  • Training method — e.g. SFT (supervised fine-tuning).
  • Hyperparameters — Epochs, batch size, learning rate, etc., as supported.
Optional:
  • Webhook URL — To receive training.completed and training.failed events.
The response includes a job ID; use it to poll status and to cancel the job if needed. A start request might look like the sketch below.
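
A minimal sketch of the start request using Python's `requests`. The endpoint path (`POST /api/training/start`), the JSON field names (`base_model`, `dataset`, `method`, `hyperparameters`, `webhook_url`), and the `jobId` response key are illustrative assumptions, not the platform's confirmed contract; check the API reference for the real schema.

```python
import os

import requests

API_BASE = "https://api.example.com"  # assumption: your platform's base URL
HEADERS = {"Authorization": f"Bearer {os.environ['API_KEY']}"}

# Hypothetical request body; the keys mirror the bullet list above,
# not a confirmed schema.
payload = {
    "base_model": "base-model-id",        # a supported model ID from the catalog
    "dataset": "dataset-id-or-url",       # reference to your training data
    "method": "sft",                      # supervised fine-tuning
    "hyperparameters": {"epochs": 3, "batch_size": 8, "learning_rate": 1e-5},
    "webhook_url": "https://example.com/webhook",  # optional
}

resp = requests.post(f"{API_BASE}/api/training/start",
                     json=payload, headers=HEADERS, timeout=30)
resp.raise_for_status()
job_id = resp.json()["jobId"]             # assumption: response key name
print("Started job:", job_id)
```

Keep the job ID; the monitoring and cancellation calls in the next step use it.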

3. Monitor training

  • Polling — Call GET /api/training/{jobId} (or equivalent) to get the fields below (see the polling sketch after this list):
    • status — e.g. pending, training, completed, failed.
    • progress — Percentage or step count.
    • metrics — e.g. loss, tokens/sec (when available).
    • error — Message if the job failed.
  • Webhooks — If you registered a webhook URL, you receive events when the job completes or fails. The payload typically includes the job ID, status, and basic metrics; see the receiver sketch below.
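
A minimal polling sketch against the `GET /api/training/{jobId}` endpoint named above. The response field names (`status`, `progress`, `metrics`, `error`) follow the bullets in this list but may differ in practice, and the base URL and job ID are placeholders.

```python
import os
import time

import requests

API_BASE = "https://api.example.com"  # assumption: your platform's base URL
HEADERS = {"Authorization": f"Bearer {os.environ['API_KEY']}"}

def wait_for_job(job_id: str, interval_s: float = 15.0) -> dict:
    """Poll the job until it reaches a terminal status, printing progress."""
    while True:
        resp = requests.get(f"{API_BASE}/api/training/{job_id}",
                            headers=HEADERS, timeout=30)
        resp.raise_for_status()
        job = resp.json()
        print(f"status={job.get('status')} progress={job.get('progress')} "
              f"metrics={job.get('metrics')}")
        if job.get("status") in ("completed", "failed"):
            return job
        time.sleep(interval_s)

job = wait_for_job("job-id-from-start")   # hypothetical job ID from step 2
if job.get("status") == "failed":
    print("Training failed:", job.get("error"))
```

Fixed-interval polling keeps the sketch simple; for long jobs, a longer interval or exponential backoff reduces request volume.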
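
If you registered a webhook URL, a small receiver can replace polling. A sketch using Flask, assuming the event names from step 2 (training.completed, training.failed) and a payload shaped like the description above; verify the real payload fields and any signature-verification scheme in the webhook docs.

```python
from flask import Flask, request

app = Flask(__name__)

@app.route("/webhook", methods=["POST"])
def training_event():
    event = request.get_json(force=True)
    # Assumed payload shape: {"event": ..., "jobId": ..., "status": ...,
    # "metrics": {...}, "error": ...}; verify against the webhook docs.
    if event.get("event") == "training.completed":
        print("Job", event.get("jobId"), "completed; metrics:", event.get("metrics"))
    elif event.get("event") == "training.failed":
        print("Job", event.get("jobId"), "failed:", event.get("error"))
    return "", 204  # acknowledge quickly so the platform does not retry

if __name__ == "__main__":
    app.run(port=8000)
```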

4. Deploy the trained model

When the job status is completed:
  • The platform may expose the trained model under a new model ID or artifact.
  • Use the dashboard or the deployment/inference APIs to make the model available for inference.
  • Then call the usual inference endpoints with that model ID, as in the sketch after this list.
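
A sketch of that inference call, assuming an OpenAI-style `POST /v1/chat/completions` endpoint; your platform's actual inference path, request schema, and model-ID format may differ, so treat every name here as a placeholder.

```python
import os

import requests

API_BASE = "https://api.example.com"  # assumption: your platform's base URL
HEADERS = {"Authorization": f"Bearer {os.environ['API_KEY']}"}

# Hypothetical chat-completions-style request; substitute your platform's
# real inference endpoint and request schema.
resp = requests.post(
    f"{API_BASE}/v1/chat/completions",
    json={
        "model": "ft-model-id",  # the new model ID from the completed job
        "messages": [{"role": "user", "content": "Hello from my fine-tuned model."}],
    },
    headers=HEADERS,
    timeout=60,
)
resp.raise_for_status()
print(resp.json())
```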
If the job failed, check the error field and any webhook payload; fix the dataset or config and resubmit.

5. Cost and balance

Training consumes your account balance. Ensure sufficient credits before starting; otherwise the job may not start or may fail. Usage is tracked per job; see Billing & usage.