Fine-Tuning Guide
This step-by-step tutorial walks through configuring and launching a training job in FineTune Studio.
Prerequisites
- LLMTune account and workspace
- At least one dataset (uploaded via Dataset Hub or using playground datasets)
- Understanding of dataset structure (see Datasets)
Step 1: Access FineTune Studio
- Navigate to FineTune Studio from the main navigation or visit /finetune-studio.
- You’ll see the product overview with information about training methods, modalities, and compute options.
Step 2: Choose Base Model
- In FineTune Studio, review the recommended models.
- Each model card shows:
- Provider (IO Intelligence, DeepSeek, Mistral, etc.)
- Context length
- Latency expectations
- Recommended use cases
- Select the foundation model that fits your task. Record the model ID for reference.
Step 3: Select Training Method
FineTune Studio supports 13 training methods:
Core Methods
- SFT (Supervised Fine-Tuning) – Classic instruction tuning with labeled demonstrations
- DPO (Direct Preference Optimization) – Align to preferences without reinforcement
- PPO (Proximal Policy Optimization) – Reinforcement learning with reward-driven updates
- RLAIF (RL with AI Feedback) – Use AI to provide feedback for RL training
- CTO (Controlled Tuning Optimization) – Fine-grained control over model behavior
Specialized Methods
- Code Generation – Train models to generate code from natural language
- Multimodal – Train vision-language models with text and images
- Text-to-Embeddings – Train embedding models for semantic search and RAG
- Audio Understanding – Train models to understand and reason about audio
- Audio-to-Text (ASR) – Train Automatic Speech Recognition models
- Text-to-Audio (TTS) – Generate speech from text (uses Coqui TTS service)
- Video Understanding – Train video-language models (coming soon)
- Reward Modeling – Train reward functions for RLHF pipelines
Step 4: Select Dataset
- Choose from:
- Uploaded datasets from Dataset Hub
- Playground datasets (pre-configured demo datasets for each training method)
- If using playground datasets, they are automatically selected based on your training method.
- For custom datasets, ensure they match the expected format for your chosen method:
- SFT/DPO/PPO: JSONL with `messages` or `conversations` arrays
- Code Generation: Code examples with prompts
- Multimodal: Text + image pairs
- Audio methods: Audio file paths with transcripts or labels
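Before uploading a custom dataset, it can help to sanity-check each JSONL record locally. The sketch below validates the chat-style `messages` shape described above; the exact field names and accepted roles should be confirmed against the format your chosen training method expects.

```python
import json

def validate_sft_record(line: str) -> bool:
    """Check that a JSONL line holds a non-empty `messages` array of
    role/content pairs. Accepted roles here are an assumption."""
    try:
        record = json.loads(line)
    except json.JSONDecodeError:
        return False
    messages = record.get("messages")
    if not isinstance(messages, list) or not messages:
        return False
    return all(
        isinstance(m, dict)
        and m.get("role") in {"system", "user", "assistant"}
        and isinstance(m.get("content"), str)
        for m in messages
    )

sample = json.dumps({
    "messages": [
        {"role": "user", "content": "What is fine-tuning?"},
        {"role": "assistant", "content": "Adapting a pretrained model to a task."},
    ]
})
print(validate_sft_record(sample))  # True
```

Running a check like this over every line of the file before upload catches malformed records early, rather than partway through a paid training run.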
Step 5: Configure Training Parameters
Key settings:
| Setting | Description |
|---|---|
| Training method | SFT, DPO, PPO, RLAIF, CTO, Code, Multimodal, etc. |
| Learning rate | Default varies by method; typically 0.0001 for SFT |
| Batch size | Adaptive based on model size and dataset |
| Epochs | Typically 2–5 for most methods; validate with evaluation |
| Evaluation cadence | Frequency of validation runs during training |
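Taken together, the settings above amount to a job configuration roughly like the following sketch. The key names here are illustrative, not FineTune Studio's actual schema; use the Studio UI (or the API documentation) for the real field names.

```python
# Illustrative job configuration; key names and values are hypothetical,
# not FineTune Studio's actual schema.
job_config = {
    "base_model": "mistral-7b-instruct",  # model ID recorded in Step 2
    "method": "sft",                      # one of the 13 training methods
    "dataset": "my-support-chats.jsonl",  # uploaded via Dataset Hub
    "learning_rate": 1e-4,                # typical default for SFT
    "epochs": 3,                          # 2-5 for most methods
    "eval_every_steps": 500,              # evaluation cadence
}
# Batch size is adaptive (chosen from model size and dataset), so it is
# omitted here rather than hard-coded.
print(job_config["method"])
```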
Step 6: Choose Compute Model
FineTune Studio offers flexible compute options:
Traditional Computing
- Single Instance – One GPU instance, perfect for smaller models and testing
- GPU Cluster – Multiple GPUs working together, faster training for large models
- Best for: Stable workloads, predictable costs, single location
Federated Computing
- Single Instance (Federated) – Privacy-preserving distributed compute
- GPU Cluster (Federated) – Scale out across global nodes
- Best for: Privacy-sensitive data, global scale, lower costs
Step 7: Launch and Monitor
- Review the cost estimate displayed before launching.
- Click Launch Training.
- Monitor progress in real time:
- Loss curves – Track training and validation loss
- Tokens/sec – Monitor training throughput
- Epoch completion – See current epoch and total epochs
- Queue position – If multiple jobs are queued, see your position
- GPU allocation events – Track resource allocation
- You can pause or cancel if necessary (depending on job state).
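If job status is also exposed programmatically, monitoring can follow a simple polling pattern. This is a sketch only: the status dictionary shape, the state names, and the idea of a status-fetching callable are all assumptions, not a documented LLMTune API.

```python
import time

def wait_for_completion(fetch_status, poll_seconds=30, max_polls=1000):
    """Poll a job-status callable until it reports a terminal state.

    fetch_status: zero-argument callable returning a dict such as
    {"state": "running", "epoch": 2} -- this shape is an assumption.
    """
    terminal = {"completed", "failed", "cancelled"}
    for _ in range(max_polls):
        status = fetch_status()
        if status["state"] in terminal:
            return status
        time.sleep(poll_seconds)
    raise TimeoutError("job did not reach a terminal state")

# Usage with a fake status stream standing in for the real API:
states = iter([{"state": "queued"}, {"state": "running"}, {"state": "completed"}])
final = wait_for_completion(lambda: next(states), poll_seconds=0)
print(final["state"])  # completed
```

Injecting the fetcher as a callable keeps the loop testable without network access; in practice it would wrap whatever status endpoint the API documentation describes.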
Step 8: Review Results
When the run completes:
- Inspect evaluation metrics and sample outputs.
- Use the Evaluate feature (LLMTune Evaluate) to test your model:
- Single prompt evaluation – Quick individual tests
- Compare with base model – Side-by-side comparison
- Batch evaluation – Test multiple prompts at once
- Results dashboard – Comprehensive metrics and trends
- Compare with previous runs using the comparison view.
- Archive the run or mark it as ready for deployment.
Supported Modalities
FineTune Studio supports training models for all major modalities:
- Text-to-Text – LLaMA, Mistral, Qwen, DeepSeek
- Image-to-Text – Qwen-VL, LLaVA, Kimi-K2, Pixtral
- Audio Understanding – Qwen2-Audio, MiniCPM-o
- Audio-to-Text (ASR) – Whisper, NeMo, SpeechT5
- Video-to-Text – InternVL, Kosmos-2, Qwen2-VL-Video
- Code – DeepSeek-Coder, StarCoder2, CodeLLaMA
- Multimodal – Qwen-VL, LLaVA, Kimi-K2
- Text-to-Audio (TTS) – XTTS, Bark, Rime
- Text-to-Embeddings – BGE, E5, text-embedding-3-large
Troubleshooting
- Job stuck in pending: Check resource availability or contact support if it exceeds expected queue times. The training queue processes jobs sequentially to conserve GPU resources.
- Spiking loss: Inspect dataset balance; consider re-weighting or cleaning data. Review dataset quality in Dataset Hub.
- Unexpected outputs: Revisit dataset quality and ensure conversation turns are labeled correctly for your training method.
- Compute selection issues: You can switch between Traditional and Federated compute anytime. If one option isn’t available, try the other.
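To make the "spiking loss" check concrete, one simple heuristic is to flag any step whose loss jumps well above the moving average of the preceding steps. The window and threshold below are illustrative defaults, not Studio settings.

```python
def find_loss_spikes(losses, window=5, factor=2.0):
    """Return indices where loss exceeds `factor` times the moving average
    of the previous `window` values. Thresholds are illustrative."""
    spikes = []
    for i in range(window, len(losses)):
        avg = sum(losses[i - window:i]) / window
        if losses[i] > factor * avg:
            spikes.append(i)
    return spikes

# A single bad batch at step 5 stands out against a smoothly falling loss:
losses = [2.1, 1.9, 1.8, 1.7, 1.6, 6.0, 1.5, 1.4]
print(find_loss_spikes(losses))  # [5]
```

If the flagged steps cluster around particular data, that points back to the dataset-balance and cleaning advice above.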
Next Steps
- Learn about Deployment to promote your trained model
- Use Evaluate to measure model quality
- Explore the Playground to test models interactively
- Check the API documentation for programmatic access