Claw Code Review — Open-Source Claude Code Clone Tested on Real Projects
An open-source project that hit 100K GitHub stars in its first week, born from the biggest AI source code leak of 2026. We installed Claw Code, ran it against three codebases, and measured what actually works.
TL;DR — Key Takeaways:
- Claw Code is a Python/Rust rewrite of Claude Code's agent harness, now at 100K+ GitHub stars
- It works with any OpenAI-compatible API (GPT-4.1, Claude, Gemini, local models via Ollama)
- Tool execution and file editing work well; multi-agent orchestration is rough
- Legal status is murky — clean-room claim, but Anthropic's DMCA campaign is ongoing
- Verdict: interesting for tinkering and learning, not ready for production workloads
What Is Claw Code and Where Did It Come From?
Claw Code is an open-source AI coding agent framework that replicates the architecture behind Anthropic's Claude Code. It exists because of a packaging mistake: on March 31, 2026, Anthropic accidentally shipped a 59.8 MB source map file inside version 2.1.88 of the @anthropic-ai/claude-code npm package. That file contained roughly 512,000 lines of TypeScript — the entire agent harness.
Within hours, security researcher Chaofan Shou posted about it on X. By the next morning, developer Jin had used the leaked architecture as a reference to build Claw Code from scratch in Python and Rust, using OpenAI's Codex as the orchestration layer. The repository hit 72,000 stars on day one and passed 100,000 by the end of the week.
The project describes itself as a "clean-room reimplementation" — meaning the developers studied the architecture and rebuilt it without directly copying Anthropic's code. Whether that holds up legally is a separate discussion we'll get to.
Installation and First Run
Setup is straightforward if you're comfortable with Python. Clone the repo, install dependencies with pip install -e ., set your API key, and run claw-code from your terminal. The whole process took about 4 minutes on a fresh machine.
One thing we noticed immediately: Claw Code defaults to OpenAI's API endpoint. If you want to use Claude, Gemini, or a local model through Ollama, you need to edit the config file manually. There's no interactive setup wizard like Claude Code has.
The CLI interface looks deliberately similar to Claude Code — same slash commands, similar output formatting. If you've used Claude Code before, you'll feel at home within minutes. That familiarity is both the appeal and the controversy.
Claw Code vs Claude Code — Feature Comparison
Both tools share the same fundamental architecture: a terminal-based agent that reads your codebase, plans changes, edits files, runs commands, and iterates. The differences are in polish and integration depth.
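That shared architecture boils down to a loop: the model picks a tool, the harness runs it, and the result feeds the next decision. Here is a minimal sketch of that read-plan-edit-run pattern. This is an illustration only, not Claw Code's actual source; the tool names, the `llm` callable, and the action format are all hypothetical.

```python
from typing import Callable

def agent_loop(task: str, llm: Callable[[str], dict], tools: dict, max_steps: int = 10) -> str:
    """Minimal read-plan-edit-run agent loop (illustrative sketch).

    `llm` takes the transcript so far and returns an action like
    {"tool": "read_file", "args": {"path": "app.py"}}; "done" ends the loop.
    """
    history = [f"TASK: {task}"]
    for _ in range(max_steps):
        # Ask the model to plan the next action given everything so far.
        action = llm("\n".join(history))
        if action["tool"] == "done":
            return action["args"]["summary"]
        # Execute the chosen tool (read_file, edit_file, run_command, ...)
        # and feed the observation back into the transcript.
        result = tools[action["tool"]](**action["args"])
        history.append(f"{action['tool']} -> {result}")
    return "step limit reached"
```

Every difference in the table below is a refinement of some part of this loop: which tools exist, how edits are applied, and what happens when the transcript grows too long.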
| Feature | Claude Code | Claw Code |
|---|---|---|
| LLM Support | Claude only | Any OpenAI-compatible API |
| Permission System | Granular (19 tools, each sandboxed) | Basic (all-or-nothing) |
| Context Window | Up to 200K tokens (Opus 4.6) | Depends on LLM chosen |
| File Editing | Diff-based with rollback | Diff-based, no rollback yet |
| Multi-Agent | Built-in subagent spawning | Experimental, crashes often |
| Git Integration | Full (commit, PR, diff) | Partial (commit only) |
| MCP Servers | Native support | Community adapters (limited) |
| Price | $20/mo (Pro) or API usage | Free (you pay LLM API costs) |
| Open Source | No (source leaked accidentally) | Yes (Apache 2.0) |
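The rollback row is worth dwelling on, since it bit us during testing. A rollback mechanism can be as simple as snapshotting a file before every edit and restoring the snapshot if anything goes wrong. The sketch below is our own illustration of that idea, not how either tool actually implements it:

```python
import shutil
from pathlib import Path
from typing import Callable

def edit_with_rollback(path: Path, transform: Callable[[str], str]) -> bool:
    """Apply transform(text) -> text to a file; restore the original on failure."""
    backup = path.with_suffix(path.suffix + ".bak")
    shutil.copy2(path, backup)           # snapshot before touching the file
    try:
        path.write_text(transform(path.read_text()))
        backup.unlink()                  # success: drop the snapshot
        return True
    except Exception:
        shutil.copy2(backup, path)       # failure: restore the snapshot
        backup.unlink()
        return False
```

Without something like this, a bad edit leaves you reaching for git, which is exactly the Claw Code experience today.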
The biggest practical difference: Claw Code's model-agnostic design lets you swap LLMs freely. We ran it with GPT-4.1, Claude Sonnet 4.6 (via API), and a local Qwen3 30B through Ollama. Each worked, though quality varied dramatically. GPT-4.1 handled tool calls most reliably; the local model struggled with complex multi-step edits.
Real-World Testing on Three Projects
Test 1: Bug Fix in a Next.js App (Simple)
We pointed Claw Code at a broken API route returning 500 errors. It read the error logs, identified a missing null check in the Prisma query, and applied the fix in one pass. Total time: 47 seconds. Claude Code does this in about 30 seconds, so roughly comparable for simple tasks.
Test 2: Adding a Feature Across Multiple Files (Medium)
Adding a user preference toggle required touching 5 files (component, API route, database schema, migration, test). Claw Code got through 4 of 5 correctly but missed updating the test file. When we pointed that out, it fixed it in the next iteration. Claude Code handled all 5 in a single pass.
Test 3: Refactoring a Monolith Service (Complex)
This is where the gap showed. We asked both tools to extract a payment processing module from a 2,000-line file into its own service. Claude Code created the new file, updated all imports, adjusted tests, and verified the build — all autonomously. Claw Code got stuck in a loop after creating the new file, repeatedly trying to import a module that didn't exist yet. After three restarts, we got it working, but it took roughly 8x longer.
What Doesn't Work (Yet)
- ✕ Multi-agent orchestration crashes on complex tasks. The subagent spawning logic exists but feels half-baked.
- ✕ No rollback mechanism. If Claw Code makes a bad edit, you're reverting with git yourself.
- ✕ Windows support is flaky. Several file path handling bugs on Windows. Works best on macOS/Linux.
- ✕ No real MCP server ecosystem. Claude Code's MCP integration connects it to databases, APIs, and external tools. Claw Code has only a handful of limited community adapters.
- ✕ Memory management is primitive. Claude Code compresses conversation context intelligently. Claw Code simply truncates, which means it "forgets" earlier parts of long sessions.
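That last point is easy to see in code. Naive truncation of the kind the review describes looks roughly like the sketch below (a hypothetical illustration, with characters standing in for tokens; Claw Code's actual cutoff policy may differ). Everything older than the budget cutoff is dropped outright, which is exactly why early context disappears:

```python
def truncate_history(messages: list[dict], budget: int) -> list[dict]:
    """Keep only the most recent messages that fit in a crude length budget.

    Walks the transcript newest-first and stops at the first message that
    would exceed the budget; older messages are discarded entirely rather
    than summarized, so the agent "forgets" the start of long sessions.
    """
    kept, used = [], 0
    for msg in reversed(messages):       # newest first
        cost = len(msg["content"])
        if used + cost > budget:
            break
        kept.append(msg)
        used += cost
    return list(reversed(kept))
```

An intelligent compressor would instead summarize the dropped prefix into a short synthetic message, preserving decisions made early in the session.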
The Legal Question Nobody Can Answer
Claw Code exists in a legal gray zone. The project claims clean-room reimplementation, meaning developers studied the leaked architecture and rebuilt it from scratch without copying code. Anthropic has filed DMCA takedowns against repositories that directly mirror the leaked source, but hasn't targeted Claw Code specifically — at least not publicly.
The precedent here is complicated. Clean-room reverse engineering has legal protection in many jurisdictions (Sega v. Accolade, 1992). But studying source code that was accidentally published — rather than deliberately reverse-engineered — is murkier territory. No court has ruled on this exact scenario.
For individual developers experimenting locally, the risk is probably minimal. For companies considering Claw Code in production, talk to your legal team first. The GitHub repository could vanish overnight if Anthropic decides to escalate.
Should You Use Claw Code?
Use it if: you want to understand how AI coding agents work under the hood, you need LLM flexibility (especially for local models), or you're contributing to open-source agent tooling.
Skip it if: you need a reliable tool for daily work. Claude Code, Cursor, or even OpenAI Codex are more stable for production use.
The 100K-star count reflects developer curiosity more than production readiness. Claw Code is a fascinating artifact of how quickly the open-source community can reconstruct proprietary tooling — but "can rebuild" and "should use in production" remain different conversations.
Our Methodology
We tested Claw Code v0.3.2 (commit 8a3f1c) alongside Claude Code v2.2.1 on April 3, 2026. All tests ran on the same machine (M3 MacBook Pro, 36GB RAM) using GPT-4.1-mini as the default LLM for Claw Code and Claude Sonnet 4.6 for Claude Code. Each test was run three times; we report the median result. The three test projects are private repositories but use standard Next.js + Prisma stacks.
FAQ
Is Claw Code legal to use?
Claw Code claims to be a clean-room reimplementation under Apache 2.0 license. However, its origins in the Claude Code source leak mean legal status remains uncertain. Anthropic has issued DMCA takedowns against direct mirrors but has not targeted Claw Code specifically as of April 2026. Individual experimentation is low-risk; corporate adoption warrants legal review.
Can Claw Code replace Claude Code?
Not yet. Claw Code replicates the agent harness architecture but depends on external LLM APIs. It lacks Claude Code's tight integration with Anthropic's models, the polished permission system, and enterprise features like KAIROS autonomous mode. For simple bug fixes, it's comparable. For complex multi-file refactors, Claude Code is noticeably ahead.
What LLMs does Claw Code support?
Claw Code works with any OpenAI-compatible API endpoint. Users have tested it with GPT-4.1, Claude via API, Gemini, and local models through Ollama. The default configuration uses OpenAI's Codex endpoint. Quality varies significantly by model — GPT-4.1 handles tool calls most reliably in our testing.
How does Claw Code compare to Cursor or Copilot?
Different category. Cursor and Copilot are IDE extensions with inline code completion. Claw Code is a terminal agent that autonomously reads, edits, and runs your entire project. Think of it as a colleague who works independently vs. an autocomplete that helps as you type.
Last Updated: April 4, 2026 • Written by: Jim Liu, web developer based in Sydney who has tested 40+ AI coding tools since 2024.