CLI to MCP: Wrap Any Command-Line Tool as a Server

Last Updated: 2026-05-26

I spent a Saturday converting a small CLI I'd written years ago into an MCP server. By Sunday afternoon Claude Code was calling it like any native tool. Three hours of work to skip the shell-out tax forever.

This guide is the version I wish I'd had when I started — what MCP actually is, the smallest TypeScript and Python boilerplate that works, a real jq wrap example, and the pitfalls that ate my morning.

TL;DR

MCP (Model Context Protocol) is Anthropic's open standard for letting AI agents talk to external tools over a structured JSON-RPC channel. Spec docs live at modelcontextprotocol.io.
Converting a CLI to an MCP server means your agent sends a tools/call request for jq_query with arguments like { "filter": ".foo" } instead of shelling out blind. It gets typed inputs, structured errors, and discoverability.
Minimum viable server in TypeScript or Python is around 40-60 lines. You can wrap an existing command-line tool in an afternoon.
The big wins: schema-validated arguments, no quoting hell, better error visibility for the model, and the agent can list available tools without you teaching it.
The big traps: stdio buffering, JSON-RPC framing, and forgetting that stdout is reserved for protocol messages. Print to it and you'll corrupt the stream; logs belong on stderr.

What MCP Is and Why Convert CLI Tools

MCP is a JSON-RPC 2.0 protocol that runs over stdio (or over HTTP for remote servers). An agent like Claude Code or Cursor speaks to an MCP server using tools/list and tools/call requests. The server replies with structured JSON. That's it.

You can read the official MCP spec for the wire format, but the short version: the agent asks "what tools do you have," gets a JSON schema for each one, and then makes typed calls. Compare that to giving an agent raw shell access where it has to guess flags, escape strings, and parse free-form stdout.
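
To make that exchange concrete, here is a rough sketch of the two requests as JSON-RPC 2.0 envelopes, built with nothing but the standard library. The make_request helper, the id values, and the repo path are placeholders I made up for illustration; a real client sends these for you after an initialize handshake.

```python
import json
from typing import Optional


def make_request(request_id: int, method: str, params: Optional[dict] = None) -> dict:
    """Build a JSON-RPC 2.0 request envelope (hypothetical helper)."""
    message = {"jsonrpc": "2.0", "id": request_id, "method": method}
    if params is not None:
        message["params"] = params
    return message


# Discovery: ask the server which tools it exposes.
list_request = make_request(1, "tools/list")

# Invocation: call one tool by name, with arguments shaped by its inputSchema.
call_request = make_request(
    2,
    "tools/call",
    {"name": "git_status", "arguments": {"cwd": "/path/to/repo"}},
)

# Each message is serialized as JSON before it goes over the transport.
print(json.dumps(list_request))
print(json.dumps(call_request))
```

The server answers each request with a response carrying the same id, which is how the client pairs replies with calls.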

A CLI is procedural. You pass argv, you read stdout, you check the exit code. That's fine for humans. For agents it's lossy. (If you want background on how agent tool-calling has evolved from raw function calls to typed protocols like MCP, my function calling primer walks through the lineage.) The agent has to know your flag conventions, your output format, and your error idioms. Multiply that across ten tools and you're burning context window on shell quirks.

Wrapping the same CLI as an MCP server fixes all of that. The agent gets a typed contract.

Minimum Viable MCP Server in TypeScript

Here's the smallest TypeScript MCP server I could write that does something useful. It exposes a single tool that runs git status --porcelain and returns the output as text.

import { Server } from "@modelcontextprotocol/sdk/server/index.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import {
  CallToolRequestSchema,
  ListToolsRequestSchema,
} from "@modelcontextprotocol/sdk/types.js";
import { execFile } from "node:child_process";
import { promisify } from "node:util";

const execFileAsync = promisify(execFile);

const server = new Server(
  { name: "git-mcp", version: "0.1.0" },
  { capabilities: { tools: {} } },
);

server.setRequestHandler(ListToolsRequestSchema, async () => ({
  tools: [
    {
      name: "git_status",
      description: "Return the working tree status in porcelain format.",
      inputSchema: {
        type: "object",
        properties: {
          cwd: { type: "string", description: "Repo directory" },
        },
        required: ["cwd"],
      },
    },
  ],
}));

server.setRequestHandler(CallToolRequestSchema, async (req) => {
  if (req.params.name !== "git_status") {
    throw new Error(`Unknown tool: ${req.params.name}`);
  }
  const cwd = String(req.params.arguments?.cwd ?? "");
  const { stdout } = await execFileAsync("git", ["status", "--porcelain"], { cwd });
  return {
    content: [{ type: "text", text: stdout || "(clean working tree)" }],
  };
});

const transport = new StdioServerTransport();
await server.connect(transport);

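Save it as server.ts, install @modelcontextprotocol/sdk, compile with tsc, and you have a working server. The top-level await on the last line needs an ES module build (for example "type": "module" in package.json plus a matching module setting in tsconfig.json). Point Claude Code's config at the resulting node dist/server.js and it'll show up as a callable tool.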

Note the execFile over exec — never use exec for an MCP tool wrapper. It spawns a shell and you've now reintroduced quoting bugs. execFile takes argv as an array, which is what you want.

Minimum Viable MCP Server in Python

Same idea in Python using the official SDK.

import asyncio
import subprocess
from mcp.server import Server
from mcp.server.stdio import stdio_server
from mcp.types import Tool, TextContent

app = Server("git-mcp")

@app.list_tools()
async def list_tools() -> list[Tool]:
    return [
        Tool(
            name="git_status",
            description="Return git working tree status in porcelain format.",
            inputSchema={
                "type": "object",
                "properties": {
                    "cwd": {"type": "string", "description": "Repo directory"},
                },
                "required": ["cwd"],
            },
        )
    ]

@app.call_tool()
async def call_tool(name: str, arguments: dict) -> list[TextContent]:
    if name != "git_status":
        raise ValueError(f"Unknown tool: {name}")
    cwd = arguments.get("cwd", ".")
    result = subprocess.run(
        ["git", "status", "--porcelain"],
        cwd=cwd,
        capture_output=True,
        text=True,
        check=False,
    )
    if result.returncode != 0:
        # Raise so the SDK reports the call as failed rather than returning plain text.
        raise RuntimeError(f"git error: {result.stderr.strip()}")
    return [TextContent(type="text", text=result.stdout or "(clean working tree)")]

async def main():
    async with stdio_server() as (read, write):
        await app.run(read, write, app.create_initialization_options())

if __name__ == "__main__":
    asyncio.run(main())

Install with pip install mcp. Run with python server.py. Same protocol, different language. I tend to reach for Python when the CLI I'm wrapping lives in a toolchain I already script in Python (think kubectl, aws, gh) and TypeScript when it's a Node-native tool.

Wrapping a Real CLI: The jq Example

I'll walk through the actual server I built last weekend for jq. The goal: let the agent query JSON files without me re-explaining jq filter syntax every time.

import asyncio
import subprocess
from mcp.server import Server
from mcp.server.stdio import stdio_server
from mcp.types import Tool, TextContent

app = Server("jq-mcp")

@app.list_tools()
async def list_tools() -> list[Tool]:
    return [
        Tool(
            name="jq_query",
            description=(
                "Run a jq filter against JSON input. Returns parsed result. "
                "Use this when you need to extract or transform data from a JSON "
                "file or string. Filter syntax follows standard jq."
            ),
            inputSchema={
                "type": "object",
                "properties": {
                    "filter": {
                        "type": "string",
                        "description": "jq filter expression, e.g. '.users[].email'",
                    },
                    "input_json": {
                        "type": "string",
                        "description": "JSON string to query. Use this OR file_path.",
                    },
                    "file_path": {
                        "type": "string",
                        "description": "Path to JSON file. Use this OR input_json.",
                    },
                    "raw_output": {
                        "type": "boolean",
                        "description": "If true, output raw strings without quotes.",
                        "default": False,
                    },
                },
                "required": ["filter"],
            },
        )
    ]

@app.call_tool()
async def call_tool(name: str, arguments: dict) -> list[TextContent]:
    if name != "jq_query":
        raise ValueError(f"Unknown tool: {name}")

    filter_expr = arguments["filter"]
    raw_output = arguments.get("raw_output", False)
    cmd = ["jq"]
    if raw_output:
        cmd.append("-r")
    cmd.append(filter_expr)

    stdin_data = None
    has_file = "file_path" in arguments
    has_json = "input_json" in arguments
    if has_file == has_json:
        # Zero or both inputs: the schema cannot express "exactly one", so check here.
        raise ValueError("provide exactly one of input_json or file_path")
    if has_file:
        cmd.append(arguments["file_path"])
    else:
        stdin_data = arguments["input_json"]

    try:
        result = subprocess.run(
            cmd, input=stdin_data, capture_output=True, text=True,
            timeout=10, check=False,
        )
    except subprocess.TimeoutExpired:
        raise RuntimeError("jq timed out after 10s") from None

    if result.returncode != 0:
        # Raising lets the SDK flag the result as an error instead of plain text.
        raise RuntimeError(f"jq error: {result.stderr.strip()}")
    return [TextContent(type="text", text=result.stdout.rstrip())]

async def main():
    async with stdio_server() as (read, write):
        await app.run(read, write, app.create_initialization_options())

if __name__ == "__main__":
    asyncio.run(main())

Once this is hooked into Claude Code, I ask things like "pull every email from users.json where role is admin" and the agent crafts the filter, calls jq_query, and shows me results. No more pasting jq man pages into context.

A few things worth pointing out in this snippet. I added a 10-second timeout because a runaway jq filter on a large file will hang the whole MCP session otherwise. I gave the tool a description that nudges the model toward correct usage. I made filter required and documented the two inputs as mutually exclusive: file or string, not both. The schema as written can't enforce "exactly one", so that rule lives in the descriptions and needs a matching check in the handler. That kind of schema design saves you from edge cases the model would otherwise hit.
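
The "file or string, not both" rule is easy to state and easy to get subtly wrong, so here is a minimal sketch of how I'd isolate it in a helper. The name pick_input and the ("file", ...) / ("json", ...) return shape are my own invention for illustration, not part of the SDK.

```python
def pick_input(arguments: dict) -> tuple[str, str]:
    """Return ("file", path) or ("json", text).

    Raises ValueError when neither or both inputs are supplied, so the
    caller never has to guess which one wins.
    """
    has_file = arguments.get("file_path") is not None
    has_json = arguments.get("input_json") is not None
    if has_file == has_json:
        raise ValueError("provide exactly one of input_json or file_path")
    if has_file:
        return ("file", arguments["file_path"])
    return ("json", arguments["input_json"])
```

A handler can call this first and branch on the tag, which keeps the validation in one place and makes it trivial to unit test without starting a server.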

Testing With the MCP Inspector

The MCP Inspector is a web UI that connects to your server and lets you fire requests by hand. This is how I caught half the bugs in my jq wrapper before letting Claude near it.

Install and run:

npx @modelcontextprotocol/inspector python server.py

It prints the URL to open when it starts (http://localhost:6274 by default in recent releases; older ones used http://localhost:5173). You'll see a left panel listing your tools (this is the tools/list response). Click one, fill in arguments in the form, hit Call. The response panel shows what Claude would see.

What I check every time:

Does tools/list return all my tools with full schemas?
Does an obviously-bad input return a useful error string, not a crash?
Does a timeout case actually time out, or does the server hang?
Does a successful call return text content, not raw bytes?

If all four pass in the Inspector, the integration with Claude Code or Cursor almost always works on the first try.
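
For the timeout item on that checklist, it helps if the subprocess call is something you can exercise on its own. Below is one possible sketch using asyncio instead of the blocking subprocess.run, so a slow command does not stall the server's event loop while it waits. The function name run_with_timeout and its return shape are assumptions of mine, not an SDK API.

```python
import asyncio
import sys
from typing import Optional


async def run_with_timeout(
    argv: list[str], timeout: float, stdin_text: Optional[str] = None
) -> tuple[int, str, str]:
    """Run argv without a shell; return (exit code, stdout, stderr).

    Kills the child and raises TimeoutError if it runs longer than timeout.
    """
    proc = await asyncio.create_subprocess_exec(
        *argv,
        stdin=asyncio.subprocess.PIPE if stdin_text is not None else asyncio.subprocess.DEVNULL,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.PIPE,
    )
    data = stdin_text.encode() if stdin_text is not None else None
    try:
        out, err = await asyncio.wait_for(proc.communicate(data), timeout)
    except asyncio.TimeoutError:
        proc.kill()        # do not leave the runaway process behind
        await proc.wait()  # reap it so it does not linger as a zombie
        raise TimeoutError(f"{argv[0]} timed out after {timeout}s") from None
    return proc.returncode, out.decode(), err.decode()


# Example: a command that finishes quickly.
code, out, err = asyncio.run(
    run_with_timeout([sys.executable, "-c", "print('ok')"], timeout=10)
)
```

To try the failure path by hand, point it at a command that sleeps longer than the timeout and confirm you get a TimeoutError rather than a hang.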

Common Pitfalls That Burned Me

Stdio buffering. This one cost me an hour. Python's sys.stdout is line-buffered by default in a terminal but block-buffered when stdout is a pipe — which it is when an MCP client launches your server. If you print() anything that isn't a properly framed JSON-RPC message, you'll either corrupt the stream or deadlock waiting for a flush. Solution: never write to stdout yourself. Let the SDK handle the protocol. If you need to log, write to stderr (print(..., file=sys.stderr)), and even then keep it minimal — some clients capture stderr and surface it as a tool error.
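
One way to make "never write to stdout yourself" hard to violate is to route everything through a logger that is pinned to stderr. This is only a sketch; make_stderr_logger and the logger name are hypothetical, and the optional stream parameter exists purely so the behaviour can be checked in a test.

```python
import logging
import sys


def make_stderr_logger(name: str, stream=None) -> logging.Logger:
    """Return a logger whose only handler writes to stderr (or a given stream)."""
    logger = logging.getLogger(name)
    handler = logging.StreamHandler(stream if stream is not None else sys.stderr)
    handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    # Do not hand records to root handlers that something else may have
    # configured to write elsewhere.
    logger.propagate = False
    return logger


log = make_stderr_logger("jq-mcp")
log.info("server starting")  # lands on stderr; stdout stays protocol-only
```

Call the factory once per logger name, since each call attaches another handler.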

JSON-RPC framing. The SDK abstracts this, but it's worth knowing: on the stdio transport, each message is a single line of JSON terminated by a newline, and a message must not contain embedded newlines. If you accidentally write a stray \n or any other text to stdout, the client tries to parse it as a message and fails. Same root cause as the stdio buffering issue, different symptom.
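
Here is a small sketch of what newline-delimited framing looks like on the stdio transport, and why a stray print breaks it. The helper names encode_message and decode_stream are mine; the SDK has its own reader and writer, so treat this as an illustration of the idea rather than its implementation.

```python
import json


def encode_message(message: dict) -> bytes:
    """Serialize one message as a single line of JSON plus a trailing newline.

    json.dumps escapes newlines inside strings, so the payload itself
    never contains a raw line break.
    """
    return (json.dumps(message, separators=(",", ":")) + "\n").encode("utf-8")


def decode_stream(data: bytes) -> list[dict]:
    """Parse a buffer of newline-delimited JSON messages."""
    return [json.loads(line) for line in data.splitlines() if line.strip()]


response = {"jsonrpc": "2.0", "id": 1, "result": {"text": "line one\nline two"}}
wire = encode_message(response)

# A stray debug print on stdout ends up in the same byte stream.
corrupted = wire + b"debug: got here\n"
```

Decoding wire gives the message back; decoding corrupted raises json.JSONDecodeError on the debug line, which is roughly what a client experiences when a server prints to stdout.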

Error visibility. Returning { "error": "something" } as ordinary text content is wrong. The agent reads it as a string and may not understand that the call failed. Better: signal the failure so the response is marked as an error. The Python SDK does this automatically when you raise from @app.call_tool(). In TypeScript with the low-level Server used above, a throw comes back to the client as a JSON-RPC error; to report a tool-level failure instead, return the content with isError: true.
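
For reference, the difference on the wire is small: per the MCP spec, a failed tool call is a normal result whose isError field is true. The sketch below builds both shapes as plain dicts. The tool_result helper and the sample strings are made up for illustration; the SDKs construct these payloads for you.

```python
def tool_result(text: str, is_error: bool = False) -> dict:
    """Build the result payload of a tools/call response (hypothetical helper)."""
    result: dict = {"content": [{"type": "text", "text": text}]}
    if is_error:
        # isError tells the client the tool ran but failed, so the model
        # does not mistake the message for real output.
        result["isError"] = True
    return result


ok = tool_result('"alice@example.com"')
failed = tool_result("jq error: (stderr text here)", is_error=True)
```

Both carry text content, but only the second one tells the client, and therefore the model, that the call failed.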

Shell injection. I almost made this mistake on the git server. If you interpolate an argument such as cwd into a command string and hand it to exec instead of execFile, an attacker (or a confused agent) can inject && rm -rf into it. Always use array-form argv: execFile in Node, subprocess.run([...]) in Python, where shell=False is the default.
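
A quick way to convince yourself that array-form argv is safe is to pass a hostile-looking string through it and see that it arrives untouched. In this sketch the child is just the Python interpreter echoing its first argument; echo_via_argv and the hostile string are illustrative names, nothing more.

```python
import subprocess
import sys

# Child process: print the first argument exactly as received.
ECHO_ARG = [sys.executable, "-c", "import sys; print(sys.argv[1])"]


def echo_via_argv(untrusted: str) -> str:
    """Pass untrusted text as a single argv entry; no shell ever parses it."""
    result = subprocess.run(
        ECHO_ARG + [untrusted],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.rstrip("\n")


hostile = "repo && echo INJECTED"
received = echo_via_argv(hostile)
```

The child sees the whole string, ampersands included, as one argument. Nothing after the && runs, because there is no shell in the path to interpret it.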

Tool description hygiene. The description field is what the agent uses to decide whether to call your tool. "Run jq" is useless. "Run a jq filter against JSON input. Use this when you need to extract or transform data from JSON" is the version the model actually picks up correctly. Treat descriptions like prompt engineering, because that's exactly what they are.

Comparing Approaches: Shell Out vs Wrapper vs Native vs Existing MCP

If you're deciding whether to bother converting your CLI, here's the trade-off matrix I'd consult.

Approach	Setup complexity	Agent UX	Error visibility	Security
Raw shell-out (agent runs CLI directly)	Zero	Bad — agent guesses flags, parses stdout	Buried in stderr	Risky — shell injection if not careful
Lightweight CLI wrapper script	Low — bash or python wrapper	Okay — predictable args, same parse problem	Still text-based	Better if you validate input
Native MCP server (custom)	Medium — afternoon of work	Best — typed schema, listable tools	Structured errors	Strong if you stick to execFile/argv
Existing community MCP server	Zero if one exists	Best	Depends on author	Audit the source

My rule: if the CLI is something I touch daily and the agent calls it more than a few times per session, build the MCP server. If it's a once-a-month tool, shell-out is fine. (For context on what an agent actually does with these calls under the hood, see my AI agent architecture overview.)

Where This Fits in the Broader Agent Stack

MCP is one piece of how I run a small AI dev setup. The other pieces matter too. If you're new to Claude Code and trying to figure out which agentic editor to commit to, my Claude Code vs Cursor comparison lays out the tradeoffs I hit.

For the underlying model layer, you can mix and match. I wrote up the DeepSeek vs GPT comparison when I was deciding which reasoning backend to use for my own agents. And if you're trying to keep AI dev costs sane, my DeepClaude review covers a setup that runs Claude Code workflows at roughly 1/17th of the cost.

Wrapping CLIs as MCP servers compounds those savings. A typed tool call is shorter than the agent-explores-the-shell dance — fewer tokens spent per invocation, faster results, less hallucinated syntax.

What I'd Do Differently Next Time

I built three MCP servers in the past month. Looking back, the order I should have done them in was: simplest first, most-used second, most-complex last.

I started with a complex one (a wrapper around a custom internal CLI with sixteen subcommands) and burned three hours getting the schema right. The right move would have been a one-tool server like the git example above, just to internalize the protocol. Then build the big one.

Also: I underestimated how much of the work is description writing. The code is short. The descriptions that make the agent actually use the tool correctly are the slow part. Budget time for that.

FAQ

What's the difference between a CLI and an MCP server?

A CLI is a program you run from the shell with argv and read from stdout. An MCP server is a long-running process that speaks JSON-RPC over stdio (or HTTP) and exposes typed tools to an AI agent. You can wrap a CLI inside an MCP server — that's literally what this guide is about. The CLI does the work; MCP gives the agent a typed contract to call it.

Why not just let the agent shell out directly?

You can. Claude Code and similar agents do it constantly. The problem is the agent has to guess flag syntax, escape quotes, parse free-form output, and deal with stderr noise. An MCP wrapper gives it a typed schema, structured errors, and a description that tells it when to call. Less context burned on shell quirks, more on actually solving your problem.

Can I use existing MCP servers instead of building my own?

Yes, and you should check first. There's a growing list of community servers for common tools — filesystem, git, search engines, databases. The official MCP servers repo is the place to start. If something close to what you need already exists, use it and contribute back. Build your own when the tool is custom or the existing server is missing the features you actually need.

Do I need to use TypeScript or Python?

No. The protocol is language-agnostic. This guide uses the official TypeScript and Python SDKs, and official SDKs exist for several other languages as well, but you can implement the JSON-RPC handshake in anything that can read stdin and write stdout. I'd use Go or Rust if I needed to ship a server as a single binary. For most cases the SDKs save enough boilerplate that the language choice comes down to whatever stack you're already in.

How do I debug an MCP server when something goes wrong?

The MCP Inspector first. Then mcp-cli or running the server with verbose logging to stderr. If the issue is at the JSON-RPC layer, you can tee stdin/stdout to a file by wrapping the server in a small relay script — useful for catching framing bugs. Most issues I hit are either stdio buffering or a schema mismatch the Inspector catches in five seconds.
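
The relay script mentioned above can be very small. Here is a sketch of one, assuming a line-based stdio transport; the relay function, the >> and << prefixes, and the mcp-traffic.log filename are all my own choices. You would point the client at the relay and pass the real server command as its arguments.

```python
import subprocess
import sys
import threading
from typing import BinaryIO


def relay(cmd: list[str], src: BinaryIO, dst: BinaryIO, log: BinaryIO) -> int:
    """Run cmd, copying src -> child stdin and child stdout -> dst, logging both."""
    child = subprocess.Popen(cmd, stdin=subprocess.PIPE, stdout=subprocess.PIPE)
    lock = threading.Lock()

    def record(prefix: bytes, line: bytes) -> None:
        with lock:  # both directions share one log file
            log.write(prefix + line)
            log.flush()

    def pump_in() -> None:
        try:
            for line in iter(src.readline, b""):
                record(b">> ", line)   # client -> server
                child.stdin.write(line)
                child.stdin.flush()
            child.stdin.close()        # propagate EOF so the server can exit
        except (BrokenPipeError, ValueError):
            pass                       # server already went away

    threading.Thread(target=pump_in, daemon=True).start()
    for line in iter(child.stdout.readline, b""):
        record(b"<< ", line)           # server -> client
        dst.write(line)
        dst.flush()
    return child.wait()


# Usage (hypothetical): python relay.py python server.py
if __name__ == "__main__" and len(sys.argv) > 1:
    with open("mcp-traffic.log", "ab") as log_file:
        sys.exit(relay(sys.argv[1:], sys.stdin.buffer, sys.stdout.buffer, log_file))
```

The server's stderr is left alone, so its own logs still reach the client as before, and the log file ends up with one prefixed line per message in each direction.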

Can MCP servers run remotely?

Yes. The spec defines an HTTP-based transport for that (Streamable HTTP in current revisions, which replaced the older HTTP+SSE transport). Most local dev servers I write use stdio because the client (Claude Code, Cursor) spawns the server as a subprocess. But for shared team tools, remote MCP makes sense. The protocol is the same, only the transport differs.

Is there a security model for MCP servers?

The current model is "you launched it, you trust it." Servers run with the same permissions as your editor. Treat an MCP server like installing a CLI tool — read the code, prefer well-known sources, and don't give an experimental server access to things you wouldn't give a random shell script. For team use, I keep MCP servers in their own vetted repo and review PRs the same way I review CI changes.

How do I know if a CLI is worth converting?

Three questions. Do I run this tool more than once a week? Does the agent currently get its flag syntax wrong? Would typed inputs make my workflow noticeably faster? If two of three are yes, build the MCP server. If only one is yes, you're better off teaching the agent a snippet about the CLI and saving the afternoon.

About the Author

Jim Liu is a Sydney-based indie developer who builds small SaaS tools solo and reviews AI developer tooling on OpenAIToolsHub. He's been shipping side projects since 2023 and uses MCP-wrapped tools in his daily Claude Code workflow.

Will update this if anything in the SDK API changes meaningfully.