live · open source

MCP server engineering

The Coolify API speaks in megabytes.
Agents think in tokens.

coolify-mcp is the layer that translates between them, and it does it with 90 to 99% less noise. Here is exactly how.

raw · list_services · 367 KB

{
  "id": 142,
  "uuid": "g4sk4ckcw080osckos48sswo",
  "name": "stuartmason.co.uk",
  "description": "Marketing site + blog",
  "fqdn": "https://stuartmason.co.uk,https://www.stuartmason.co.uk",
  "config_hash": "a1b2c3d4e5f6...",
  "git_repository": "StuMason/stuartmason",
  "git_branch": "main",
  "git_commit_sha": "HEAD",
  "build_pack": "nixpacks",
  "static_image": "nginx:alpine",
  "install_command": "npm ci",
  "build_command": "npm run build",
  "start_command": null,
  "ports_exposes": "3000",
  "ports_mappings": null,
  "base_directory": "/",
  "publish_directory": "/",
  "health_check_enabled": true,
  "health_check_path": "/up",
  "health_check_port": null,
  "health_check_host": null,
  "health_check_method": "GET",
  "health_check_return_code": 200,
  "health_check_scheme": "http",
  "health_check_response_text": null,
  "health_check_interval": 30,
  "health_check_timeout": 30,
  "limits_memory": "0",
  "limits_memory_swap": "0",
  "limits_cpus": "0",
  "dockerfile": "FROM php:8.4-fpm\nRUN docker-php-ext-install ... (47KB) ...",
  "docker_compose_raw": "services:\n  app:\n    build: . ... (3KB) ...",
  "server": { "id": 1, "name": "apps", "ip": "138.199.216.202", "... 38 more fields": "..." },
  "destination": { "id": 1, "network": "coolify", "... 12 more fields": "..." },
  "... 53 more fields": "..."
}

↓

99.7% smaller

signal · 1.2 KB

{
  "uuid": "g4sk4ckcw080osckos48sswo",
  "name": "stuartmason.co.uk",
  "status": "running:healthy",
  "fqdn": "https://stuartmason.co.uk",
  "git": "StuMason/stuartmason@main",
  "_actions": { "restart": "control(restart)", "logs": "application_logs(uuid)" }
}

7.9k

npm installs / mo

445

GitHub stars

forks

42

tools

98%

test coverage

Watch me walk through it

A two-minute walkthrough of the token-collapse problem and how coolify-mcp solves it, in the actual codebase.

The hard part

Two problems, solved together.

Wiring an API to an agent is easy. Keeping it inside a finite, expensive context window is the part that takes judgement.

Context budgeting

A single app listing is 91 fields, with a 47 KB Dockerfile and a 3 KB compose string buried inside. Sent raw, it floods the model. A two-tier projection layer returns a 5-field summary from list calls and full detail only on request. A 1 MB deployment list becomes 4 KB.
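A minimal sketch of the two-tier idea, assuming a raw record shaped like the payload shown earlier. The `RawApp` type, the `summariseApp` and `projectApp` names, and the choice to keep only the first FQDN are illustrative assumptions, not the project's actual code.

```typescript
// Hypothetical subset of the raw record; the real payload has about 91 fields.
interface RawApp {
  uuid: string;
  name: string;
  status: string;
  fqdn: string | null; // comma-separated list of URLs
  git_repository: string | null;
  git_branch: string | null;
  [extra: string]: unknown; // dockerfile, docker_compose_raw, server, ...
}

interface AppSummary {
  uuid: string;
  name: string;
  status: string;
  fqdn: string | null;
  git: string | null;
}

// Tier 1: the compact summary returned by list calls.
function summariseApp(raw: RawApp): AppSummary {
  return {
    uuid: raw.uuid,
    name: raw.name,
    status: raw.status,
    // Keep only the first URL; the full list stays available on request.
    fqdn: raw.fqdn ? (raw.fqdn.split(",")[0] ?? null) : null,
    git: raw.git_repository
      ? `${raw.git_repository}@${raw.git_branch ?? "HEAD"}`
      : null,
  };
}

// Tier 2: full detail, returned only when the caller asks for it.
function projectApp(raw: RawApp, detail: boolean): RawApp | AppSummary {
  return detail ? raw : summariseApp(raw);
}
```

The heavy fields (the Dockerfile, the compose string, the nested server object) never reach the model unless `detail` is set, which is where most of the reduction would come from.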

Tool-list cost

Every tool's schema ships to the model on every turn. Sixty granular CRUD tools cost roughly 43,000 tokens before the user says a word. Consolidating to 42 action-parameter tools cut that to 6,600. An 85% saving, every turn.
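One way to reason about that per-turn cost is to serialise the tool descriptors and estimate tokens from their length. The sketch below assumes roughly four characters per token, which is only a rule of thumb; real tokenizers differ, and `estimateToolListTokens` and `savingPercent` are hypothetical helper names.

```typescript
// Minimal shape of a tool descriptor as listed to the model.
interface ToolDescriptor {
  name: string;
  description: string;
  inputSchema: object;
}

// Rough heuristic (an assumption): about 4 characters per token.
// Treat the result as an order-of-magnitude estimate only.
const CHARS_PER_TOKEN = 4;

function estimateToolListTokens(tools: ToolDescriptor[]): number {
  const serialised = JSON.stringify(tools);
  return Math.ceil(serialised.length / CHARS_PER_TOKEN);
}

// Percentage saved when the tool list shrinks from `before` to `after` tokens.
function savingPercent(before: number, after: number): number {
  return ((before - after) / before) * 100;
}

// The figures quoted above: roughly 43,000 tokens down to 6,600.
console.log(savingPercent(43_000, 6_600).toFixed(1)); // "84.7"
```

That 84.7% is the figure rounded to 85% above.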

Measured, not claimed

Real payloads, real reductions.

Three of the heaviest endpoints, before and after the projection layer.

list_services: 367 KB → 1.2 KB (99.7% smaller)

list_applications: 170 KB → 4.4 KB (97.4% smaller)

deployments_for_app: 1.0 MB → 4.0 KB (99.6% smaller)

{
  "uuid": "g4sk4ckcw080osckos48sswo",
  "name": "stuartmason.co.uk",
  "status": "running:healthy",
  "fqdn": "https://stuartmason.co.uk",
  "git": "StuMason/stuartmason@main",
  "_actions": { "restart": "control(restart)", "logs": "application_logs(uuid)" }
}

What the model receives from a list call. Five fields chosen from ninety-one, plus _actions hints for the next move.
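The `_actions` object in the payload above, together with a pagination cursor, could be attached along these lines. This is a sketch under assumptions: `withActions`, `paginate`, the `next_cursor` field name and the offset-style cursor are all hypothetical, and only the two hint strings are taken from the payload shown.

```typescript
interface Page<T> {
  items: T[];
  next_cursor: string | null; // null when there are no more pages
}

// Attach next-step hints to one item, in the style of the payload above.
function withActions<T extends { uuid: string }>(
  item: T,
): T & { _actions: Record<string, string> } {
  return {
    ...item,
    _actions: {
      restart: "control(restart)",
      logs: "application_logs(uuid)",
    },
  };
}

// Offset-based cursor (an assumption): the cursor is the index of the next item.
function paginate<T>(all: T[], cursor: string | null, pageSize: number): Page<T> {
  const parsed = cursor ? Number.parseInt(cursor, 10) : 0;
  const start = Number.isNaN(parsed) ? 0 : parsed; // fall back to the first page
  const end = start + pageSize;
  return {
    items: all.slice(start, end),
    next_cursor: end < all.length ? String(end) : null,
  };
}
```

Because the hints travel inside the payload, the model does not need a second call just to learn which tool applies next.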

The signal path

One question, eight moves.

01 Client
Claude or Cursor spawns the server locally. Your Coolify token is passed as an env var and never leaves your machine.
02 Handshake
One tools/list goes to the model: 42 consolidated descriptors at ~6.6k tokens, not 60+ granular ones at ~43k.
03 Intent
You ask in plain English. The model picks a tool and emits one structured call.
04 Route
Args are validated with Zod and routed by action enum. This layer holds no HTTP knowledge.
05 Fan-out
Composite tools (diagnose_app, find_issues) fire 4 to 8 Coolify calls in parallel via Promise.allSettled.
06 Project
Responses pass a projection layer: lists collapse to 5 fields, logs paginate and truncate, secrets mask to ***.
07 Hint
Each payload carries HATEOAS _actions and pagination cursors, so the model knows the next valid move for free.
08 Answer
A 90 to 99% smaller payload reaches the model. Docs questions resolve against a local BM25 index, no extra network call.
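The fan-out in step 05 can be sketched as follows. `Promise.allSettled` is the only real API here; the `fanOut` helper, the `Diagnosis` shape and the stand-in lookups are hypothetical, and a real composite tool would call the Coolify API instead.

```typescript
// Hypothetical result shape for a composite diagnostic tool.
interface Diagnosis {
  results: Record<string, unknown>;
  errors: Record<string, string>;
}

// Run several independent lookups in parallel. A failing call is
// recorded under `errors` instead of rejecting the whole diagnosis.
async function fanOut(
  calls: Record<string, () => Promise<unknown>>,
): Promise<Diagnosis> {
  const entries = Object.entries(calls);
  const settled = await Promise.allSettled(entries.map(([, call]) => call()));

  const diagnosis: Diagnosis = { results: {}, errors: {} };
  entries.forEach(([name], i) => {
    const outcome = settled[i];
    if (!outcome) return;
    if (outcome.status === "fulfilled") {
      diagnosis.results[name] = outcome.value;
    } else {
      diagnosis.errors[name] = String(outcome.reason);
    }
  });
  return diagnosis;
}

// Usage with stand-in lookups: one succeeds, one fails.
fanOut({
  status: async () => "running:healthy",
  logs: async () => {
    throw new Error("timeout");
  },
}).then((diagnosis) => console.log(diagnosis));
```

Unlike `Promise.all`, `Promise.allSettled` never short-circuits on the first rejection, so a single slow or failing endpoint still leaves the model with a partial answer.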

The move that paid for itself

Six tools became one.

mcp-server.ts

// Six CRUD tools became one. Their schemas no longer
// ship to the model on every turn.
server.tool("application", {
  action: z.enum([
    "list", "get", "create",
    "start", "stop", "restart",
  ]),
  uuid: z.string().optional(),
  // ...
}, async ({ action, uuid }) => {
  switch (action) {
    case "list":    return summarise(await client.apps());
    case "restart": return client.restart(resolve(uuid));
    // ...
  }
});

~36k

tokens saved on every single turn by consolidating 60 granular CRUD tools into 42 action-parameter tools like this one (roughly 43,000 tokens down to 6,600). Across a long agent session, that is the difference between a fast, accurate assistant and one that forgets what it was doing.
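The routing pattern can also be sketched without the SDK. The version below swaps Zod for a hand-written check so that it runs with no dependencies. The action list mirrors the enum in the excerpt above, while `parseArgs`, `routeApplication`, the returned strings and the rule about which actions need a uuid are assumptions for illustration.

```typescript
// The same six actions as the enum in the excerpt above.
const ACTIONS = ["list", "get", "create", "start", "stop", "restart"] as const;
type Action = (typeof ACTIONS)[number];

interface ApplicationArgs {
  action: Action;
  uuid: string | undefined;
}

type UntrustedArgs = { action?: unknown; uuid?: unknown };

// Narrow untrusted input to ApplicationArgs, or throw a readable error.
// (The project uses Zod for this step; a manual check stands in here.)
function parseArgs(input: UntrustedArgs): ApplicationArgs {
  const action = ACTIONS.find((a) => a === input.action);
  if (!action) {
    throw new Error(`action must be one of: ${ACTIONS.join(", ")}`);
  }
  const uuid = typeof input.uuid === "string" ? input.uuid : undefined;
  if (input.uuid !== undefined && uuid === undefined) {
    throw new Error("uuid must be a string");
  }
  return { action, uuid };
}

// Assumption: every action except list and create targets one application.
function routeApplication(input: UntrustedArgs): string {
  const { action, uuid } = parseArgs(input);
  switch (action) {
    case "list":
    case "create":
      return `${action} applications`;
    default:
      if (!uuid) throw new Error(`${action} requires a uuid`);
      return `${action} ${uuid}`;
  }
}
```

One schema with an `action` discriminator replaces six separate schemas, which is why the tool list shrinks while the set of operations stays the same.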

Built to be trusted

362 tests, ~98% coverage

TypeScript strict, Zod-validated at every boundary.

Official MCP Registry

Published as io.github.StuMason/coolify, plus a security assessment badge.

Zero runtime dependencies

None beyond the MCP SDK, Zod and a local search index. No daemon, no database.

Secrets masked by default

Env values return as *** unless reveal:true is set explicitly.
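A minimal sketch of mask-by-default, assuming env vars arrive as key/value pairs. The `EnvVar` shape and the `maskEnv` name are hypothetical; only the `***` placeholder and the explicit `reveal: true` opt-in come from the description above.

```typescript
interface EnvVar {
  key: string;
  value: string;
}

// Mask every value unless the caller explicitly opts in with reveal: true.
// Keys stay visible so the model can still reason about which vars exist.
function maskEnv(vars: EnvVar[], options: { reveal?: boolean } = {}): EnvVar[] {
  if (options.reveal === true) return vars;
  return vars.map((v) => ({ key: v.key, value: "***" }));
}

// Usage: the default call never exposes the secret.
console.log(maskEnv([{ key: "DB_PASSWORD", value: "hunter2" }]));
```

Checking for strictly `true` means a missing or malformed flag falls back to masking, so the safe behaviour is also the default one.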

More proof

Different problem, same engineering.

RAG over private data
RAG that cites or refuses
Hybrid retrieval, grounded citations, and a guard that refuses instead of guessing.
See the teardown →

Agentic commerce
A shop agents buy from
MCP discovery, ACP checkout and x402 on-chain settlement. A live demo.
See the teardown →

Selected work: case studies →
Reviews: real, named →

What this means for your bench

Your clients are asking for AI. This is the person who ships it, badged as yours.

coolify-mcp is one of several production AI systems I have built and shipped. If your agency has AI work it can't staff, that is exactly where I slot in.

Book a call · Read the source · npm
