← Stu Mason
live · open source
MCP server engineering

The Coolify API speaks in megabytes.
Agents think in tokens.

coolify-mcp is the layer that translates between them, and it does it with 90 to 99% less noise. Here is exactly how.

raw · list_services · 367 KB
{
  "id": 142,
  "uuid": "g4sk4ckcw080osckos48sswo",
  "name": "stuartmason.co.uk",
  "description": "Marketing site + blog",
  "fqdn": "https://stuartmason.co.uk,https://www.stuartmason.co.uk",
  "config_hash": "a1b2c3d4e5f6...",
  "git_repository": "StuMason/stuartmason",
  "git_branch": "main",
  "git_commit_sha": "HEAD",
  "build_pack": "nixpacks",
  "static_image": "nginx:alpine",
  "install_command": "npm ci",
  "build_command": "npm run build",
  "start_command": null,
  "ports_exposes": "3000",
  "ports_mappings": null,
  "base_directory": "/",
  "publish_directory": "/",
  "health_check_enabled": true,
  "health_check_path": "/up",
  "health_check_port": null,
  "health_check_host": null,
  "health_check_method": "GET",
  "health_check_return_code": 200,
  "health_check_scheme": "http",
  "health_check_response_text": null,
  "health_check_interval": 30,
  "health_check_timeout": 30,
  "limits_memory": "0",
  "limits_memory_swap": "0",
  "limits_cpus": "0",
  "dockerfile": "FROM php:8.4-fpm\nRUN docker-php-ext-install ... (47KB) ...",
  "docker_compose_raw": "services:\n  app:\n    build: . ... (3KB) ...",
  "server": { "id": 1, "name": "apps", "ip": "138.199.216.202", "... 38 more fields": "..." },
  "destination": { "id": 1, "network": "coolify", "... 12 more fields": "..." },
  "... 53 more fields": "..."
}
99.7% smaller
signal · 1.2 KB
{
  "uuid": "g4sk4ckcw080osckos48sswo",
  "name": "stuartmason.co.uk",
  "status": "running:healthy",
  "fqdn": "https://stuartmason.co.uk",
  "git": "StuMason/stuartmason@main",
  "_actions": { "restart": "control(restart)", "logs": "application_logs(uuid)" }
}
7.9k npm installs / mo
445 GitHub stars
62 forks
42 tools
98% test coverage
Watch me walk through it

A two-minute walkthrough of the token-collapse problem and how coolify-mcp solves it, in the actual codebase.

The hard part

Two problems, solved together.

Wiring an API to an agent is easy. Keeping it inside a finite, expensive context window is the part that takes judgement.

Context budgeting

A single app listing is 91 fields, with a 47KB Dockerfile and a 3KB compose string buried inside. Sent raw, it floods the model. A two-tier projection layer returns a 5-field summary from list calls and full detail only on request. A 1MB deployment list becomes 4KB.
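A minimal sketch of the two-tier idea, assuming a trimmed version of the raw payload shown earlier. The names `RawApplication`, `AppSummary`, `summariseApp` and `project` are hypothetical and are not taken from the coolify-mcp source; the `status` field is assumed to sit among the raw fields not shown.

```typescript
// Hypothetical subset of the raw application payload. Field names follow the
// raw example above; the real object carries roughly 91 fields.
interface RawApplication {
  uuid: string;
  name: string;
  status?: string;
  fqdn?: string | null;
  git_repository?: string | null;
  git_branch?: string | null;
  dockerfile?: string | null;
  docker_compose_raw?: string | null;
  [extra: string]: unknown;
}

// Tier 1: the compact shape a list call could return.
interface AppSummary {
  uuid: string;
  name: string;
  status: string;
  fqdn: string | null;
  git: string | null;
}

function summariseApp(app: RawApplication): AppSummary {
  // The raw fqdn can be a comma-separated list; keep only the first entry.
  const firstFqdn = app.fqdn ? (app.fqdn.split(",")[0]?.trim() ?? null) : null;
  const git = app.git_repository
    ? `${app.git_repository}@${app.git_branch ?? "HEAD"}`
    : null;
  return {
    uuid: app.uuid,
    name: app.name,
    status: app.status ?? "unknown",
    fqdn: firstFqdn,
    git,
  };
}

// Tier 2 is the untouched object, returned only when detail is requested.
function project(
  apps: RawApplication[],
  detail: boolean,
): Array<RawApplication | AppSummary> {
  return detail ? apps : apps.map(summariseApp);
}
```

The heavy strings (`dockerfile`, `docker_compose_raw`) never reach the summary because the projection copies fields in by name rather than deleting fields out, so a new large field upstream cannot leak into list responses by accident.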

Tool-list cost

Every tool's schema ships to the model on every turn. Sixty granular CRUD tools cost roughly 43,000 tokens before the user says a word. Consolidating to 42 action-parameter tools cut that to 6,600. An 85% saving, every turn.
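The arithmetic behind those figures, as a short sketch. The token counts are the rounded numbers quoted above; `savingPercent` and `savedOverSession` are illustrative helpers, not part of the project.

```typescript
// Rounded figures quoted in the text above.
const GRANULAR_TOOL_TOKENS = 43000; // ~60 granular CRUD tools
const CONSOLIDATED_TOOL_TOKENS = 6600; // 42 action-parameter tools

function savingPercent(before: number, after: number): number {
  return ((before - after) / before) * 100;
}

// 43,000 - 6,600 = 36,400 tokens no longer sent on each turn.
const savedPerTurn = GRANULAR_TOOL_TOKENS - CONSOLIDATED_TOOL_TOKENS;

// 36,400 / 43,000 is about 84.7%, which rounds to the quoted 85%.
const saving = savingPercent(GRANULAR_TOOL_TOKENS, CONSOLIDATED_TOOL_TOKENS);

// The tool list is resent every turn, so the saving scales with session length.
function savedOverSession(turns: number): number {
  return savedPerTurn * turns;
}
```

Over a 20-turn session, `savedOverSession(20)` comes to 728,000 input tokens of tool schema that are not sent.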

Measured, not claimed

Real payloads, real reductions.

Three of the heaviest endpoints, before and after the projection layer.

list_services: 367 KB → 1.2 KB (99.7% smaller)
list_applications: 170 KB → 4.4 KB (97.4% smaller)
deployments_for_app: 1.0 MB → 4.0 KB (99.6% smaller)
{
  "uuid": "g4sk4ckcw080osckos48sswo",
  "name": "stuartmason.co.uk",
  "status": "running:healthy",
  "fqdn": "https://stuartmason.co.uk",
  "git": "StuMason/stuartmason@main",
  "_actions": { "restart": "control(restart)", "logs": "application_logs(uuid)" }
}

What the model receives from a list call. Five fields chosen from ninety-one, plus an _actions block of next-step hints.
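One way the `_actions` hints in a payload like this could be attached is sketched below. The `restart` and `logs` hint strings mirror the example; `withActions`, the `control(start)` hint and the rule that picks hints from `status` are assumptions for illustration, not confirmed coolify-mcp behaviour.

```typescript
interface Summary {
  uuid: string;
  name: string;
  status: string;
}

type WithActions<T> = T & { _actions: Record<string, string> };

function withActions<T extends Summary>(item: T): WithActions<T> {
  // Logs are useful in any state, so that hint is always present.
  const actions: Record<string, string> = {
    logs: "application_logs(uuid)",
  };
  // Assumption: only suggest moves that make sense for the current state.
  if (item.status.startsWith("running")) {
    actions["restart"] = "control(restart)";
  } else {
    actions["start"] = "control(start)"; // hypothetical hint string
  }
  return { ...item, _actions: actions };
}
```

The hints cost a few dozen bytes per item and save the model a round trip spent working out which tool applies next.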

The signal path

One question, eight moves.

  1. Client

    Claude or Cursor spawns the server locally. Your Coolify token is passed as an env var and never leaves your machine.

  2. Handshake

    One tools/list goes to the model: 42 consolidated descriptors at ~6.6k tokens, not 60+ granular ones at ~43k.

  3. Intent

    You ask in plain English. The model picks a tool and emits one structured call.

  4. Route

    Args are validated with Zod and routed by action enum. This layer holds no HTTP knowledge.

  5. Fan-out

    Composite tools (diagnose_app, find_issues) fire 4 to 8 Coolify calls in parallel via Promise.allSettled.

  6. Project

    Responses pass a projection layer: lists collapse to 5 fields, logs paginate and truncate, secrets mask to ***.

  7. Hint

    Each payload carries HATEOAS _actions and pagination cursors, so the model knows the next valid move for free.

  8. Answer

    A 90 to 99% smaller payload reaches the model. Docs questions resolve against a local BM25 index, no extra network call.
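The fan-out step can be sketched as follows. This is a minimal illustration of the `Promise.allSettled` pattern, not the actual diagnose_app implementation; `Fetcher`, `DiagnosisReport` and `fanOut` are hypothetical names, and each fetcher stands in for one Coolify API call.

```typescript
// Hypothetical stand-in for a single Coolify API call.
type Fetcher = () => Promise<unknown>;

interface DiagnosisReport {
  results: Record<string, unknown>;
  errors: Record<string, string>;
}

async function fanOut(calls: Record<string, Fetcher>): Promise<DiagnosisReport> {
  const entries = Object.entries(calls);

  // Start every call at once. allSettled never rejects, so one failing
  // endpoint cannot discard the results of the others.
  const settled = await Promise.allSettled(entries.map(([, call]) => call()));

  const report: DiagnosisReport = { results: {}, errors: {} };
  entries.forEach(([name], i) => {
    const outcome = settled[i];
    if (!outcome) return;
    if (outcome.status === "fulfilled") {
      report.results[name] = outcome.value;
    } else {
      report.errors[name] =
        outcome.reason instanceof Error
          ? outcome.reason.message
          : String(outcome.reason);
    }
  });
  return report;
}
```

Using `Promise.all` here would reject on the first failure, which is the wrong behaviour for a diagnostic tool: a failing endpoint is often the finding worth reporting.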

The move that paid for itself

Six tools became one.

mcp-server.ts
// Six CRUD tools became one. Their schemas no longer
// ship to the model on every turn.
server.tool("application", {
  action: z.enum([
    "list", "get", "create",
    "start", "stop", "restart",
  ]),
  uuid: z.string().optional(),
  // ...
}, async ({ action, uuid }) => {
  switch (action) {
    case "list":    return summarise(await client.apps());
    case "restart": return client.restart(resolve(uuid));
    // ...
  }
});
~36k

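tokens saved on every single turn by applying this pattern across the server: 60+ granular tools consolidated into 42 action-parameter tools, taking the tool list from about 43k tokens to about 6.6k. Across a long agent session, that is the difference between a fast, accurate assistant and one that forgets what it was doing.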

Built to be trusted
362 tests, ~98% coverage

TypeScript strict, Zod-validated at every boundary.

Official MCP Registry

Published as io.github.StuMason/coolify, plus a security assessment badge.

Zero runtime dependencies

Beyond the MCP SDK, Zod and a local search index. No daemon, no database.

Secrets masked by default

Env values return as *** unless reveal:true is set explicitly.
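A sketch of mask-by-default, assuming a simple key/value shape for environment variables. `EnvVar`, `EnvOptions` and `maskEnv` are hypothetical names; only the `***` mask and the `reveal` flag come from the description above.

```typescript
interface EnvVar {
  key: string;
  value: string;
}

interface EnvOptions {
  reveal?: boolean;
}

const MASK = "***";

function maskEnv(vars: EnvVar[], options: EnvOptions = {}): EnvVar[] {
  // Only an explicit reveal: true exposes values. A missing or false flag
  // masks them, so the safe behaviour is the one you get by doing nothing.
  if (options.reveal === true) {
    return vars.map((v) => ({ ...v }));
  }
  return vars.map((v) => ({ ...v, value: MASK }));
}
```

Keys stay visible in both modes, so the model can still tell that `DATABASE_URL` is set without ever seeing its value.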

What this means for your bench

Your clients are asking for AI. This is the person who ships it, badged as yours.

coolify-mcp is one of several production AI systems I have built and shipped. If your agency has AI work it can't staff, that is exactly where I slot in.