# Pricing & Billing
This guide explains LLMTune’s pricing model, billing process, and how to optimize your spending.
## Table of Contents

- Pricing Overview
- Cost Comparison
- Inference Pricing
- Fine-Tuning Pricing
- Training Queue
## Pricing Overview
LLMTune offers simple, transparent pricing:
- Pay-as-you-go: Only pay for what you use
- No hidden fees: Clear pricing for all services
- Volume discounts: Automatic discounts for high usage
- Predictable costs: Set budget limits and alerts
### Pricing Components
- Inference: Charged per input/output token
- Fine-tuning: Charged per GPU hour
- Storage: Free for datasets and model artifacts
- API requests: Included in token pricing
## Cost Comparison

| Service | LLMTune | OpenAI | HuggingFace |
|---|---|---|---|
| GPT-4 class inference | /1M tokens | /1M tokens | /1M tokens |
| GPT-3.5 class inference | .50/1M tokens | /1M tokens | .80/1M tokens |
| Fine-tuning | /GPU hour | Custom | /GPU hour |
LLMTune offers 30-50% cost savings compared to major providers.
## Inference Pricing
Inference is charged per token processed (both input and output).
### Token Pricing

| Model Tier | Price per 1M Tokens |
|---|---|
| Small (7B parameters) | .50 |
| Medium (13B-34B parameters) | .00 |
| Large (70B+ parameters) | $10.00 |
### Billing Model
- Input tokens: Charged at full rate
- Output tokens: Charged at full rate
- Minimum charge: 1 token per request
- Rounding: Tokens are counted exactly (no rounding up)
### Example Calculations
Example 1: Simple request
- Input: 100 tokens
- Output: 200 tokens
- Total: 300 tokens
- Cost (70B model): 300 / 1,000,000 * $10.00 = $0.003
Example 2: Batch processing
- 10 requests, 1,000 tokens each
- Total: 10,000 tokens
- Cost (70B model): 10,000 / 1,000,000 * $10.00 = $0.10
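The billing model above can be sketched in a few lines. This is an illustrative helper (the name `inference_cost` and the default rate are not from an official SDK); the $10.00/1M-token large-tier rate is the one implied by the worked examples, and the minimum charge of 1 token is applied per request:

```python
def inference_cost(input_tokens: int, output_tokens: int,
                   price_per_million: float = 10.00) -> float:
    """Cost in dollars for one request: input and output tokens are both
    charged at the full per-million-token rate, counted exactly."""
    total_tokens = max(input_tokens + output_tokens, 1)  # minimum charge: 1 token
    return total_tokens * price_per_million / 1_000_000

# Example 1: 100 input + 200 output tokens on a 70B-class model
print(inference_cost(100, 200))   # 0.003
# Example 2: 10 requests of 1,000 tokens each, batched
print(inference_cost(0, 10_000))  # 0.1
```

Because input and output tokens are priced identically here, only the total matters; a tier with asymmetric input/output pricing would need two rate parameters.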
### Specific Model Pricing

| Model | Price per 1M Tokens | Notes |
|---|---|---|
| Llama 3.3 70B | $10.00 | Premium performance |
| Mistral 7B | .50 | Fast, cost-effective |
| Qwen2.5 72B | .00 | Excellent value |
| DeepSeek R1 | .00 | Strong reasoning |
### Volume Discounts
Automatic discounts apply at these monthly thresholds:
| Monthly Usage | Discount |
|---|---|
| 1M+ tokens | 5% |
| 10M+ tokens | 10% |
| 100M+ tokens | 20% |
| 1B+ tokens | 30% |
Discounts are applied automatically at the end of each billing cycle.
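The tier lookup above can be sketched as follows. One assumption to flag: the guide does not say whether the discount is flat (one rate applied to the whole month's usage) or marginal per tier, so this sketch assumes a flat rate, and the name `volume_discount` is illustrative:

```python
# Monthly volume tiers from the table above, checked highest first.
DISCOUNT_TIERS = [
    (1_000_000_000, 0.30),  # 1B+ tokens
    (100_000_000, 0.20),    # 100M+ tokens
    (10_000_000, 0.10),     # 10M+ tokens
    (1_000_000, 0.05),      # 1M+ tokens
]

def volume_discount(monthly_tokens: int) -> float:
    """Return the flat discount rate for a month's total token usage."""
    for threshold, rate in DISCOUNT_TIERS:
        if monthly_tokens >= threshold:
            return rate
    return 0.0

print(volume_discount(500_000))     # 0.0  (below the first threshold)
print(volume_discount(15_000_000))  # 0.1
```

Since discounts are applied at the end of the billing cycle, this lookup would run once against the month's total, not per request.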
## Fine-Tuning Pricing
Fine-tuning is charged per GPU hour, with multipliers based on the training method.
### GPU Hour Pricing

| Compute Type | Price per GPU Hour |
|---|---|
| Traditional - Single Instance | $2.00 |
| Traditional - GPU Cluster | $2.50 |
| Federated - Single Instance | .50 |
| Federated - GPU Cluster | .00 |
### Training Method Multipliers
Different training methods have different compute requirements:
| Method | GPU Hour Multiplier | Notes |
|---|---|---|
| SFT | 1× | Baseline |
| DPO | 1.5× | Requires reward model |
| PPO | 2× | Most compute-intensive |
| RLAIF | 1.8× | AI feedback loop |
| CTO | 1.2× | Controlled tuning |
| LoRA | 0.5× | Parameter-efficient |
| QLoRA | 0.3× | Most efficient |
### Cost Estimation
Before launching training, LLMTune provides:
- GPU hour estimate: Based on model size and dataset
- Cost estimate: Based on compute type and training method
- Time estimate: Based on current queue and compute availability
You can adjust parameters to see cost impacts before launching.
### Example Calculations
Example 1: SFT with LoRA
- Model: Llama 3.3 70B
- Dataset: 100K examples
- Method: SFT with LoRA (0.5× multiplier)
- Compute: Traditional Single Instance ($2.00/hour)
- Estimated GPU hours: 2 hours
- Cost: 2 * $2.00 * 0.5 = $2.00
Example 2: PPO full fine-tune
- Model: Mistral 7B
- Dataset: 50K examples
- Method: PPO (2× multiplier)
- Compute: Traditional GPU Cluster ($2.50/hour)
- Estimated GPU hours: 1 hour
- Cost: 1 * $2.50 * 2 = $5.00
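Both examples follow the same formula: estimated GPU hours × hourly rate × method multiplier. A minimal sketch of that calculation, using the multiplier table above; the function name `finetune_cost` is illustrative, and the hourly rates passed in are the ones from the two worked examples:

```python
# GPU-hour multipliers from the training-method table above.
METHOD_MULTIPLIER = {
    "SFT": 1.0, "DPO": 1.5, "PPO": 2.0,
    "RLAIF": 1.8, "CTO": 1.2, "LoRA": 0.5, "QLoRA": 0.3,
}

def finetune_cost(gpu_hours: float, hourly_rate: float, method: str) -> float:
    """Estimated training cost in dollars: hours x rate x method multiplier."""
    return gpu_hours * hourly_rate * METHOD_MULTIPLIER[method]

# Example 1: SFT with LoRA, single instance at $2.00/hour, 2 GPU hours
print(finetune_cost(2, 2.00, "LoRA"))  # 2.0
# Example 2: PPO, GPU cluster at $2.50/hour, 1 GPU hour
print(finetune_cost(1, 2.50, "PPO"))   # 5.0
```

Note that the multiplier scales cost without changing the billed GPU hours, which is why a cheap hourly rate can still produce a large bill for compute-intensive methods like PPO.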
## Training Queue
Jobs are processed sequentially to conserve GPU resources:
- Queue position shown before launch
- Estimated wait time provided
- No charge for time spent in queue
- Billing starts when GPU allocation begins