Playground

The LLMTune Playground lets you compare models side-by-side, test prompts interactively, and share results. It’s useful for quick model evaluation, prompt engineering, and choosing a base model before fine-tuning.

Features

  • Model selection – Choose one or more models from the catalog, including deployed fine-tunes
  • Multi-modal support – Attach images for vision-capable models (Qwen-VL, LLaVA, etc.)
  • Side-by-side comparison – See responses from multiple models simultaneously with latency data
  • Sampling controls – Adjust sampling parameters (temperature, max tokens) per session
  • Conversation history – Maintain context across multiple turns
  • Export results – Copy conversation logs to share with teammates or re-run through the API
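
The export format and the re-run step are not prescribed on this page, so the following is only a rough sketch: it assumes the exported conversation log is a JSON list of chat messages and that your deployment exposes an OpenAI-compatible chat completions endpoint (the URL, model ID, and LLMTUNE_API_KEY environment variable below are placeholders, not documented values).

```python
import json
import os

import requests

# Hypothetical OpenAI-compatible endpoint; substitute your deployment's real URL.
API_URL = "https://api.example.com/v1/chat/completions"
API_KEY = os.environ["LLMTUNE_API_KEY"]  # placeholder environment variable

# Assumption: the exported playground log is a JSON list of {"role", "content"} messages.
with open("playground_log.json") as f:
    messages = json.load(f)

resp = requests.post(
    API_URL,
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={"model": "deepseek-chat", "messages": messages},  # model ID is illustrative
    timeout=60,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```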

Accessing the Playground

  1. Navigate to Playground from the main dashboard or visit /playground.
  2. You’ll see the playground interface with model selector, message composer, and settings.

Workflow

Step 1: Select Models

  1. Click Select Models to open the model selector.
  2. Choose one or more models to compare:
    • Browse by provider (IO Intelligence, DeepSeek, Mistral, etc.)
    • Filter by capabilities (text, vision, audio, code, etc.)
    • Search by model name
  3. Selected models appear in the header with a count badge.

Step 2: Configure Settings (Optional)

  1. Click the settings icon to adjust:
    • Temperature (0.0-2.0) – Controls randomness (default: 0.7)
    • Max Tokens – Maximum response length (default: 1000)
  2. Settings apply to all models in the comparison.
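
These two settings correspond to the sampling parameters most chat APIs accept. A minimal sketch of the same request made from code, using the playground defaults and assuming an OpenAI-compatible endpoint (the URL, model ID, and API-key variable are placeholders):

```python
import os

import requests

payload = {
    "model": "mistral-small",  # illustrative model ID
    "messages": [{"role": "user", "content": "Summarize LoRA in two sentences."}],
    "temperature": 0.7,   # playground default; accepted range 0.0-2.0
    "max_tokens": 1000,   # playground default cap on response length
}

resp = requests.post(
    "https://api.example.com/v1/chat/completions",  # placeholder URL
    headers={"Authorization": f"Bearer {os.environ['LLMTUNE_API_KEY']}"},
    json=payload,
    timeout=60,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```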

Step 3: Enter Your Prompt

  1. Type your prompt in the message composer.
  2. For vision-capable models, you can attach images (see below).
  3. Press Enter or click Send to submit.

Step 4: Review Responses

  1. Responses appear side-by-side for each selected model.
  2. Each response shows:
    • Model name and provider
    • Generated text (with syntax highlighting for code)
    • Latency (time to first token and total time)
    • Token count
  3. Compare outputs to see which model performs best for your use case.
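
To reproduce the latency numbers outside the playground, the usual approach is to stream the response and timestamp the first chunk. A rough sketch, assuming an endpoint that streams OpenAI-style server-sent events (the URL, model ID, and API-key variable are placeholders):

```python
import json
import os
import time

import requests

API_URL = "https://api.example.com/v1/chat/completions"  # placeholder URL
headers = {"Authorization": f"Bearer {os.environ['LLMTUNE_API_KEY']}"}
payload = {
    "model": "qwen2.5-72b-instruct",  # illustrative model ID
    "messages": [{"role": "user", "content": "Write a haiku about GPUs."}],
    "stream": True,  # assumes the endpoint streams OpenAI-style SSE chunks
}

start = time.perf_counter()
first_token_at = None
pieces = []

with requests.post(API_URL, headers=headers, json=payload, stream=True, timeout=120) as resp:
    resp.raise_for_status()
    for line in resp.iter_lines():
        if not line or not line.startswith(b"data: ") or line == b"data: [DONE]":
            continue
        chunk = json.loads(line[len(b"data: "):])
        if not chunk.get("choices"):
            continue
        text = chunk["choices"][0]["delta"].get("content") or ""
        if text and first_token_at is None:
            first_token_at = time.perf_counter()  # time to first token
        pieces.append(text)

total = time.perf_counter() - start
print("".join(pieces))
if first_token_at is not None:
    print(f"time to first token: {first_token_at - start:.2f}s, total: {total:.2f}s")
```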

Step 5: Continue Conversation

  1. The playground maintains conversation history.
  2. Send follow-up messages to test multi-turn capabilities.
  3. Use Clear to reset the conversation and start fresh.
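
If you move a multi-turn test to the API, the usual pattern is to resend the full message history with each request. A minimal sketch of that pattern, assuming an OpenAI-compatible endpoint (the URL, model ID, and API-key variable are placeholders):

```python
import os

import requests

API_URL = "https://api.example.com/v1/chat/completions"  # placeholder URL
HEADERS = {"Authorization": f"Bearer {os.environ['LLMTUNE_API_KEY']}"}
messages = []  # clearing this list is the code-side equivalent of the Clear button


def send(user_text: str, model: str = "deepseek-chat") -> str:
    """Append the user turn, call the API, store and return the assistant reply."""
    messages.append({"role": "user", "content": user_text})
    resp = requests.post(API_URL, headers=HEADERS,
                         json={"model": model, "messages": messages}, timeout=60)
    resp.raise_for_status()
    reply = resp.json()["choices"][0]["message"]["content"]
    messages.append({"role": "assistant", "content": reply})
    return reply


print(send("Name three sorting algorithms."))
print(send("Which of those is stable?"))  # the follow-up relies on the stored history
```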

Image Attachments

For vision-capable models (Qwen-VL, LLaVA, Kimi-K2, Pixtral, etc.):

Supported Formats

  • PNG, JPEG, WebP, GIF
  • Maximum file size: 20MB per image
  • Maximum attachments: 10 images per message

Adding Images

  1. Drag and drop – Drag image files directly into the composer
  2. Paste from clipboard – Copy an image and paste (Ctrl+V / Cmd+V)
  3. Click to upload – Click the attachment area to browse files

Removing Images

  • Click the × button on any attached image to remove it
  • Clear all attachments by resetting the message composer

Vision Model Detection

The playground automatically detects if a selected model supports vision. If you attach images to a non-vision model, you’ll see a warning message.
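
Attachments in the playground are handled through the UI; if you later re-run an image prompt through the API, a common pattern is to inline the file as a base64 data URL inside a multimodal message. A minimal sketch, assuming an OpenAI-style message format (the URL, model ID, and API-key variable are placeholders), with the 20MB limit from above checked up front:

```python
import base64
import os
import pathlib

import requests

image_path = pathlib.Path("chart.png")  # PNG, JPEG, WebP, or GIF
if image_path.stat().st_size > 20 * 1024 * 1024:
    raise ValueError("images must be 20MB or smaller")

b64 = base64.b64encode(image_path.read_bytes()).decode()
payload = {
    "model": "qwen2.5-vl-72b-instruct",  # illustrative vision-capable model ID
    "messages": [{
        "role": "user",
        "content": [
            {"type": "text", "text": "What does this chart show?"},
            {"type": "image_url", "image_url": {"url": f"data:image/png;base64,{b64}"}},
        ],
    }],
}

resp = requests.post(
    "https://api.example.com/v1/chat/completions",  # placeholder URL
    headers={"Authorization": f"Bearer {os.environ['LLMTUNE_API_KEY']}"},
    json=payload,
    timeout=120,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```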

Use Cases

Model Comparison

Compare different base models side-by-side to find the best fit for your task before fine-tuning.

Prompt Engineering

Test different prompt formats and styles to optimize for your use case.

Quick Evaluation

Quickly test model outputs without setting up API integrations.

Fine-Tuned Model Testing

Test your fine-tuned models alongside base models to measure improvement.

Multimodal Testing

Test vision-language models with images to evaluate visual understanding.

Tips

  1. Start with one model – Test a single model first, then add more for comparison
  2. Use clear prompts – Well-formatted prompts produce better results
  3. Compare systematically – Test the same prompt across multiple models for fair comparison
  4. Check latency – Use latency data to choose models that meet your performance requirements
  5. Export results – Copy conversation logs to document your findings

Limitations

  • Playground sessions are temporary and not saved automatically
  • Rate limits apply based on your plan
  • Large models may have slower response times
  • Some models may not be available in all regions

Next Steps

  • Use the Playground to select models for Fine-Tuning
  • Test your fine-tuned models in the Playground after Deployment
  • Use the Inference API to integrate playground-style testing into your applications (see the sketch after this list)
  • Learn about Evaluate for more comprehensive model testing
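
As a starting point for that kind of integration, here is a rough sketch of playground-style side-by-side testing in code: the same prompt is sent to several models, and wall-clock latency and token usage are printed for each. It assumes an OpenAI-compatible chat completions endpoint (the URL, model IDs, and API-key variable are placeholders).

```python
import os
import time

import requests

API_URL = "https://api.example.com/v1/chat/completions"  # placeholder URL
HEADERS = {"Authorization": f"Bearer {os.environ['LLMTUNE_API_KEY']}"}
prompt = "Explain LoRA fine-tuning in one paragraph."
models = ["deepseek-chat", "mistral-small", "my-org/my-finetune-v1"]  # illustrative IDs

for model in models:
    start = time.perf_counter()
    resp = requests.post(
        API_URL,
        headers=HEADERS,
        json={
            "model": model,
            "messages": [{"role": "user", "content": prompt}],
            "temperature": 0.7,
            "max_tokens": 1000,
        },
        timeout=120,
    )
    resp.raise_for_status()
    body = resp.json()
    elapsed = time.perf_counter() - start
    tokens = body.get("usage", {}).get("total_tokens", "?")
    print(f"--- {model} ({elapsed:.2f}s, {tokens} tokens) ---")
    print(body["choices"][0]["message"]["content"])
```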