Agent overview

The Agent is a coding assistant exposed via an OpenAI-compatible API. It supports tools (e.g. read file, write file, list directory, run terminal, search and replace). The API returns tool calls; your client (e.g. IDE or CLI) executes them and sends the results back. Execution is client-side—the platform does not run code on your machine.

Capabilities

  • Chat — Multi-turn conversation with a supported model.
  • Tools — The model can request tool use. Supported tools typically include:
    • read_file — Read file contents (path parameter).
    • write_file — Write contents to a file.
    • list_directory — List directory contents.
    • run_terminal — Run a shell command.
    • search_replace — Search and replace in a file.
Tool names and parameters follow the schema you send in the request; the response includes tool_calls with the function name and the arguments encoded as a JSON string.
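As a minimal sketch, here is an OpenAI-style definition for the read_file tool above, plus parsing of a tool call as it might appear in a response. The exact response shape shown (id, type, function.name, function.arguments) follows the OpenAI tool-calling convention and is an assumption; consult the live API for the authoritative fields.

```python
import json

# OpenAI-style tool definition for read_file (path parameter).
tools = [
    {
        "type": "function",
        "function": {
            "name": "read_file",
            "description": "Read file contents",
            "parameters": {
                "type": "object",
                "properties": {
                    "path": {"type": "string", "description": "Path of the file to read"},
                },
                "required": ["path"],
            },
        },
    }
]

# A tool call as it might appear in a response; `arguments` is a JSON string.
tool_call = {
    "id": "call_0",
    "type": "function",
    "function": {"name": "read_file", "arguments": "{\"path\": \"README.md\"}"},
}

# The client must decode the arguments before executing the tool.
args = json.loads(tool_call["function"]["arguments"])
print(tool_call["function"]["name"], args["path"])
```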

How it works

  1. You send a request with messages and optionally a tools array (OpenAI-style tool definitions).
  2. The model may respond with text and/or tool_calls.
  3. Your client executes each tool call (e.g. read a file, run a command).
  4. You append the tool results as new messages and call the API again.
  5. Repeat until the model returns a final answer without tool calls.
The Agent does not execute tools on the server. It only returns tool calls; execution and file access happen in your environment.
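The loop above can be sketched as follows. `call_api` and `run_tool` are hypothetical placeholders: `call_api` stands in for a POST to the chat completions endpoint (here stubbed to return one tool call, then a final answer), and `run_tool` is the client-side dispatcher you would implement yourself.

```python
import json

def run_tool(name, args):
    # Hypothetical client-side dispatcher. A real client would read files,
    # run commands, etc.; here only read_file is stubbed.
    if name == "read_file":
        return f"<contents of {args['path']}>"
    return f"unsupported tool: {name}"

def call_api(messages):
    # Stand-in for POST /chat/completions: the first turn requests a tool,
    # the second turn returns a final answer without tool calls.
    if not any(m["role"] == "tool" for m in messages):
        return {"role": "assistant", "content": None, "tool_calls": [
            {"id": "call_0", "type": "function",
             "function": {"name": "read_file",
                          "arguments": json.dumps({"path": "README.md"})}}]}
    return {"role": "assistant", "content": "Done."}

messages = [{"role": "user", "content": "Summarize README.md"}]
while True:
    reply = call_api(messages)
    messages.append(reply)
    if not reply.get("tool_calls"):
        break  # final answer: no tool calls, loop ends
    for tc in reply["tool_calls"]:
        # Execute each tool call locally and append the result as a
        # tool message before calling the API again.
        result = run_tool(tc["function"]["name"],
                          json.loads(tc["function"]["arguments"]))
        messages.append({"role": "tool", "tool_call_id": tc["id"],
                         "content": result})
print(messages[-1]["content"])
```

The key design point is that each tool result is appended as a message with role "tool" and the matching tool_call_id, so the model can correlate results with its requests on the next call.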

Base URL

Use the production API base:
https://api.llmtune.io/api/agent/v1
List the models available to the Agent:
GET https://api.llmtune.io/api/agent/v1/models
Send a chat completion request:
POST https://api.llmtune.io/api/agent/v1/chat/completions
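A minimal sketch of building a chat request against this base URL with Python's standard library. The model ID and message content are placeholders; the request is constructed but not sent, so substitute a real key and uncomment the final line to actually call the API.

```python
import json
import urllib.request

BASE_URL = "https://api.llmtune.io/api/agent/v1"

payload = {
    "model": "some-model-id",  # placeholder: use an ID returned by GET /models
    "messages": [{"role": "user", "content": "List the files in src/"}],
}
req = urllib.request.Request(
    f"{BASE_URL}/chat/completions",
    data=json.dumps(payload).encode(),
    headers={
        "Authorization": "Bearer sk_live_YOUR_KEY",
        "Content-Type": "application/json",
    },
    method="POST",
)
# resp = urllib.request.urlopen(req)  # uncomment with a real key to send
print(req.full_url)
```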

Authentication

Same as the rest of the API: include your API key in the request:
Authorization: Bearer sk_live_YOUR_KEY
Balance is checked and usage is deducted per request the same way as for inference.

Model support

Only models that are enabled for inference and listed for the Agent can be used. Call GET .../api/agent/v1/models to see which models are available. Use one of those model IDs in the model field of chat requests.
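To guard against unsupported model IDs, a client can check the requested model against the models endpoint before sending a chat request. The response body below is a hypothetical example assuming the OpenAI-style list shape (`object`/`data`/`id`); check the live endpoint for the actual fields.

```python
import json

# Hypothetical body from GET /models, assuming the OpenAI-style list shape.
body = json.loads(
    '{"object": "list", "data": [{"id": "model-a"}, {"id": "model-b"}]}'
)
available = {m["id"] for m in body["data"]}

requested = "model-b"
if requested not in available:
    # Fail fast client-side instead of getting a 400/404 from the API.
    raise ValueError(f"model {requested!r} is not enabled for the Agent")
print(requested)
```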

Limitations

  • Client-side execution — You must implement tool execution in your client. The API does not run commands or read your files.
  • Model list — Only models returned by the Agent models endpoint are supported; others may return 400 or 404.
  • Rate limits and billing — Same rate limits and balance rules as inference; 402 when balance is insufficient.
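The error cases above can be mapped to client-side handling as a sketch. The 400/404/402 codes come from this section; treating 429 as the rate-limit status is an assumption based on common HTTP practice, not something this document specifies.

```python
def check_response(status: int) -> None:
    # Sketch: map the Agent API error cases to client-side actions.
    if status == 402:
        raise RuntimeError("insufficient balance; top up before retrying")
    if status in (400, 404):
        raise ValueError("model not supported by the Agent; call /models")
    if status == 429:  # assumed rate-limit code, not confirmed by the docs
        raise RuntimeError("rate limited; back off and retry")

# Exercise the handler with a success code and two error codes.
errors = []
for code in (200, 402, 404):
    try:
        check_response(code)
    except Exception:
        errors.append(code)
print(errors)
```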
For request/response examples, see Agent API examples.