
Claude Code Workflow Examples (2026): Plugins, Memory, Hooks I Run Across 12 Repos

By Jim Liu · 12 min read

Hands-on Claude Code workflow examples from 6 months of daily use across 12 repos. Plugin combinations, memory routing, hook recipes, and 4 setups I rolled back. May 2026 update with real .claude/skills paths and command outputs.

TL;DR

  • What works: 6 plugin combinations from my actual .claude/skills/ running daily — newword-hunter for SEO, gsd for staged plans, code-reviewer for PR review, brainstorming for feature scoping, debugging for root-cause traces, claude-md-improver for repo onboarding.
  • What I rolled back: 4 setups that looked productive but cost more than they returned — over-aggressive memory writes, parallel-agent fanout for trivial tasks, hook-driven auto-commits, custom statuslines that block input.
  • Memory routing: project memory (CLAUDE.md + memory/*.md) for 95% of context, auto-memory at ~/.claude/projects/ for cross-session preferences. Three-layer load (intent → top-2 files → interleave on demand) cuts read tokens 60%.
  • Hooks I keep: SessionStart context loader, PreToolUse linter for risky bash, UserPromptSubmit deliverable-clarity check. Skip everything else.
  • Honest comparison: Claude Code beats Copilot/Tabnine/Augment for any task that touches > 3 files at once. For single-file inline completion, GitHub Copilot is still faster. Don't pick one — pick the right one per task.

Who I Am

I'm Jim Liu, a Sydney-based developer running an SEO + AI tools portfolio across 12 repos: 9 production sites (3 finance, 2 AI directory, 1 game matrix of 3 games, 2 pet/plant niches, 1 subscription tracker) plus 3 internal tooling repos. I switched fully to Claude Code in October 2025 after using Cursor for 14 months. This article documents what's actually wired in across those 12 repos as of May 2026 — not what's possible, what's stayed.

I pay for the Anthropic 1M-context tier (the one Claude Code calls Opus 4.7 1M). Every example below is from real codebases, real .claude/ directories, real commit history. I'll show numbers for both: where my workflow burns tokens and where it saves them.

Why "Workflow Examples" Need Their Own Article

Most Claude Code content explains what the CLI is. The Anthropic docs do that well — see our Claude Code CLI documentation walkthrough for the official surface. What's missing in 2026 is how working setups actually fit together: which plugin you load at what moment, which hooks are cost-effective, and which combinations cancel each other out.

I've watched four colleagues install 12+ plugins in their first week, then spend three weeks unwinding the conflicts. The lesson stuck with me: the plugin you don't add is the plugin you don't have to debug. Below is what survived six months of attrition.

Workflow 1 — SEO Long-Tail Discovery (newword-hunter + GSC)

Across my finance and AI-tools sites, I run /newword-hunter weekly. The skill drives a Chrome session over CDP at port 9222 to scrape Google Trends, allintitle counts, and SEMrush for 30+ candidate keywords. Output is a JSON verdict per keyword, written to a SQLite seo.db.

Specific combo:

/newword-hunter Mode B (trending now scan, broadened pool 300+)
  → 557 raw → 35 seeds → 15 surviving (Trends NO_DATA filter)
  → SERP top 10 + allintitle + SEMrush + KGR/KDRoi verdict

Real numbers from the May 2 morning run: 0 P0 candidates passed. From the evening run: still 0. That's not a bug — it's the truth that today's trending pool is saturated for any portfolio under DR 30. The right reaction is to switch to Mode C (GSC golden mining), which finds long-tails I already rank for at positions 8-30 with zero clicks. One title flip there is worth ten brand-new articles.

The skill costs ~50 minutes of agent time and none of my own per run. That's only economical because each verdict is stored in seo.db with a timestamp — I can't re-test the same keyword for 14 days, which prevents the "re-flipping the same title every week" failure mode I lived through in early 2026.
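
The cooldown logic is simple enough to sketch. What follows is a hypothetical reduction, not the skill's actual code: the table schema is illustrative, and the 0.25 cutoff is the standard Keyword Golden Ratio rule of thumb standing in for the full KGR/KDRoi verdict.

import sqlite3, time

def kgr_verdict(allintitle_count, monthly_volume):
    # Standard Keyword Golden Ratio: allintitle results / monthly searches.
    # Under 0.25 is a "go"; the real skill layers SERP and KDRoi checks on top.
    if monthly_volume <= 0:
        return None  # the Trends NO_DATA filter drops these earlier
    return "P0" if allintitle_count / monthly_volume < 0.25 else "skip"

def record_verdict(db_path, keyword, verdict, cooldown_days=14):
    # A timestamped row per keyword enforces the re-test cooldown.
    con = sqlite3.connect(db_path)
    con.execute("""CREATE TABLE IF NOT EXISTS verdicts
                   (keyword TEXT PRIMARY KEY, verdict TEXT, tested_at REAL)""")
    row = con.execute("SELECT tested_at FROM verdicts WHERE keyword = ?",
                      (keyword,)).fetchone()
    if row and time.time() - row[0] < cooldown_days * 86400:
        con.close()
        return False  # still inside the cooldown window; skip the re-test
    con.execute("INSERT OR REPLACE INTO verdicts VALUES (?, ?, ?)",
                (keyword, verdict, time.time()))
    con.commit()
    con.close()
    return True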

Workflow 2 — Memory Routing (Three-Layer Load)

The default Claude Code behavior is to read every .md in your project's memory/ directory at session start. With 126 files in mine, that's 800K+ tokens before the first prompt. I cut that with three-layer routing:

Layer 1 — intent classification (0 reads). When a user prompt arrives, I match keywords (backlink, blog, seo-check, forum, affiliate, browser, code) and tag the intent. This happens in the system prompt, no file reads.

Layer 2 — top-2 file load (~20 lines each). For each intent, I pre-declared the 2 most relevant memory files in CLAUDE.md. Layer 2 reads only the relevant sections (TL;DR + structure), not the full file. If a .compact.md exists (auto-generated by python -m src.autoloop.memory_tools compact), I read that instead of the original — about 10x compression.

Layer 3 — interleave on demand. Mid-session, when I hit something I don't know (form filling stuck, deploy failure), I dynamically read the relevant section. The pattern:

Layer 1 (0 reads) → Layer 2 (~20 lines) → start work
  ↓ hit problem
  Layer 3 (~10 lines) → continue
  ↓ hit another problem
  Layer 3 (~5 lines) → done

Total context: ~35 lines vs ~80 lines for "load everything". Token savings: 60-65% on average sessions.
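
In code, the routing amounts to a lookup table plus a compact-file preference. A minimal sketch, assuming a hypothetical INTENT_FILES mapping (my real table is declared in CLAUDE.md, not hardcoded like this):

from pathlib import Path

# Hypothetical intent -> top-2 memory files table.
INTENT_FILES = {
    "seo": ["memory/seo_pipeline.md", "memory/gsc_notes.md"],
    "deploy": ["memory/deploy_runbook.md", "memory/hosting.md"],
}

def classify(prompt):
    # Layer 1: keyword match against the prompt, zero file reads.
    for intent in INTENT_FILES:
        if intent in prompt.lower():
            return intent
    return None

def load_context(prompt, max_lines=20):
    # Layer 2: read only the top-2 files for the matched intent,
    # preferring the ~10x smaller .compact.md when one exists.
    intent = classify(prompt)
    if intent is None:
        return []
    chunks = []
    for name in INTENT_FILES[intent]:
        compact = Path(name[:-3] + ".compact.md")
        path = compact if compact.exists() else Path(name)
        if path.exists():
            chunks.append("\n".join(path.read_text().splitlines()[:max_lines]))
    return chunks  # Layer 3 reads happen later, on demand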

This won't matter for a 3-file repo. It matters a lot for a 12-repo portfolio with 6 months of accumulated memory. See our deep-dive on memory plugins for large codebases for the framework that informs this.

Workflow 3 — Plan Mode + GSD for Multi-Phase Work

For anything that touches more than 3 files or spans more than one repo, I start in Plan Mode (Shift+Tab), then invoke /gsd:plan-phase. The GSD skill turns my outline into a structured PLAN.md with task breakdown, dependency analysis, and goal-backward verification.

Concrete example from last week: migrating LRTS blog reads from filesystem to DB-first. GSD broke it into:

  • Phase 1: schema design (3 tasks, 1 hour)
  • Phase 2: read path (4 tasks, 2 hours)
  • Phase 3: write path via publish_blog.py (5 tasks, 2 hours)
  • Phase 4: migration script for existing 47 .md files (2 tasks, 1 hour)

I executed Phases 1-3 over two days. The verifier agent caught one bug pre-merge: the fallback to filesystem reads needed an explicit existsSync check, not just a try/catch. Without GSD's verification step, I would have shipped a silent regression.

Workflow 4 — Code Review (PR + Pre-Commit)

Two flavors: /code-review for full PR review against the base branch, /superpowers:requesting-code-review for self-review before pushing. The first runs an agent against your diff and reports issues by severity. The second is more conversational — useful when you want a sanity check rather than a formal audit.

A specific catch from April 2026: my kdroi_calc.py had a bug where keywords with trailing \r (from CRLF-joined seed files on Windows) silently mismatched against allintitle.json keys. The reviewer flagged it as "high confidence: keyword serialization across input files inconsistent." I would have shipped that pollution into the seo.db. Saved by an agent that happened to be paranoid about string normalization.

Workflow 5 — Hooks I Keep (and Why Most Got Removed)

I run exactly three hooks today:

  1. SessionStart: loads MEMORY.md (auto-memory index) plus a freshness check. Output goes into the system context as a system-reminder.
  2. PreToolUse on Bash: blocks chained sleep commands and rejects --no-verify git flags unless I explicitly authorize them. (Sketched after this list.)
  3. UserPromptSubmit: a "Long prompt without clear deliverable" linter that asks me to clarify when I write something rambling. (The hook fired on this very article, actually — I had to tighten my outline.)
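
Hook 2 is small enough to show in full. Claude Code hands a hook the pending tool call as JSON on stdin, and exiting with status 2 blocks the call and feeds stderr back to the model. The payload fields below follow that documented contract; the specific checks are my policy, not anything built in:

#!/usr/bin/env python3
# PreToolUse hook: lint risky bash before it runs.
# Stdin carries the pending tool call as JSON; exit 2 blocks it
# and routes whatever we print to stderr back to the model.
import json, re, sys

payload = json.load(sys.stdin)
if payload.get("tool_name") != "Bash":
    sys.exit(0)  # only bash commands get linted

cmd = payload.get("tool_input", {}).get("command", "")

if "--no-verify" in cmd:
    print("Blocked: --no-verify requires explicit human authorization.",
          file=sys.stderr)
    sys.exit(2)

if re.search(r"sleep\s+\d+\s*(&&|;).*sleep\s+\d+", cmd):
    print("Blocked: chained sleep commands; poll with a timeout instead.",
          file=sys.stderr)
    sys.exit(2)

sys.exit(0)

It's registered in .claude/settings.json under hooks.PreToolUse with a "Bash" matcher.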

What I removed:

  • Auto-commit hooks (committed broken state too often)
  • Custom statusline rendering tools (rendered when I didn't want, blocked input occasionally)
  • Notification hooks for long-running tasks (sound effects became noise)
  • Pre-commit linter that ran on every keystroke (50ms of latency, stacked up)

Hook value is the ratio of real catches to fires. The three I kept fire at most three times per session and catch real issues every time. Anything that fires every minute and catches a real issue once a week — remove it.

Workflow 6 — Honest Comparison: When Claude Code Loses

Claude Code is my daily driver. It's not always the right tool.

Task | Best tool | Why
Inline single-file completion (typing speed) | GitHub Copilot | Sub-100ms latency, no context switch
Refactor across 5+ files | Claude Code | 1M context understands the full graph
API key rotation across 9 sites | Claude Code (with custom skill) | Multi-repo orchestration
Quick regex find/replace | Plain sed/rg | Faster than any AI for deterministic tasks
Documentation generation from existing code | Claude Code | Better at structure, longer attention span
Boilerplate generator (CRUD, forms) | Cursor / GitHub Copilot | Pattern-matched generation is faster
Architectural review | Claude Code (Opus 4.7) | Catches cross-file design issues

Pricing matters too: Claude Code at the 1M tier is roughly $200/month for my usage; GitHub Copilot is $20/month. For our breakdown of where each makes financial sense, see GitHub Copilot Pricing — A Real Week and Tabnine vs GitHub Copilot. For team-scale comparison, see Claude Code vs GitHub Copilot Teams.

4 Pitfalls I Hit (and How I Fixed Them)

Pitfall 1 — DrissionPage hangs when Chrome has 8+ tabs. When a long pipeline run leaves tabs piled up, page.tab_ids calls Page.getFrameTree on every target, including iframes and workers. One stuck target hangs the whole iteration. Fix: drop DrissionPage for raw CDP via urllib + websocket (the cdp_drive.py pattern). Hangs gone, roughly 10x faster. I learned this last night when SEMrush volume scraping froze for 30 minutes.
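
Stripped to its two primitives, the pattern looks roughly like this. It's a simplified stand-in for cdp_drive.py, not the actual file, and it assumes the websocket-client package:

import json, urllib.request
from websocket import create_connection  # pip install websocket-client

def list_tabs(port=9222):
    # One HTTP GET instead of a per-target Page.getFrameTree crawl;
    # a stuck iframe or worker target can no longer hang enumeration.
    with urllib.request.urlopen(f"http://127.0.0.1:{port}/json/list",
                                timeout=5) as r:
        return [t for t in json.load(r) if t["type"] == "page"]

def cdp(ws_url, method, params=None, msg_id=1):
    # Send one CDP command over the tab's websocket, wait for its reply.
    ws = create_connection(ws_url, timeout=10)
    try:
        ws.send(json.dumps({"id": msg_id, "method": method,
                            "params": params or {}}))
        while True:
            reply = json.loads(ws.recv())
            if reply.get("id") == msg_id:
                return reply
    finally:
        ws.close()

tab = list_tabs()[0]
cdp(tab["webSocketDebuggerUrl"], "Page.navigate",
    {"url": "https://trends.google.com"})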

Pitfall 2 — keyword normalization across input files. My SEO pipeline writes seeds to surviving_seeds.txt from Python, then bash reads them, then bash passes them to Python again. Each transition can introduce a CRLF mismatch. Fix: explicit .rstrip('\r\n').strip() at every JSON key access, and tr -d '\r' in bash arrays. Caught by the code-review agent before reaching production.
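
On the Python side, the fix boils down to one helper applied wherever a keyword crosses a file boundary (the helper name is mine, for illustration):

def norm_kw(raw):
    # Strip CRLF residue and outer whitespace before any dict/JSON key use.
    return raw.rstrip("\r\n").strip()

# Every Python <-> bash <-> Python transition reads through it:
with open("surviving_seeds.txt", encoding="utf-8") as f:
    seeds = [norm_kw(line) for line in f if norm_kw(line)]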

Pitfall 3 — running multiple Claude Code sessions on the same Chrome 9222 port. Two agents both trying to drive the same browser silently corrupt each other's state. The second agent's tabs close mid-action because the first agent's close_tab cleanup runs on shared targets. Fix: a hard rule — one agent owns 9222 at a time, second agent gets 9223 with its own profile. I lost a 90-minute pipeline run learning this.
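
The rule is easier to keep when it's mechanical. A hypothetical lockfile sketch (nothing in Claude Code enforces this for you):

import os, sys
from pathlib import Path

LOCK = Path("/tmp/cdp-9222.lock")  # hypothetical location

def claim_port():
    # Refuse to drive 9222 if a live agent already holds the lock.
    if LOCK.exists():
        pid = int(LOCK.read_text())
        try:
            os.kill(pid, 0)  # signal 0: existence check, sends nothing
            sys.exit(f"9222 owned by pid {pid}; start on 9223 with its own profile.")
        except ProcessLookupError:
            pass  # stale lock left by a dead agent
    LOCK.write_text(str(os.getpid()))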

Pitfall 4 — over-aggressive auto-memory writes. I had Claude Code writing to ~/.claude/projects/.../memory/ after every session. Within two months I had 315 files, half duplicates, half stale. Fix: explicit "memory worth saving" criteria in CLAUDE.md (rules of thumb: surprising, non-obvious, future-relevant). Plus a monthly compact + staleness sweep via python -m src.autoloop.memory_tools report.
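
The staleness sweep needs nothing fancy. A minimal version (not the actual memory_tools code; the 60-day cutoff is illustrative):

import time
from pathlib import Path

STALE_DAYS = 60  # illustrative threshold

def staleness_report(memory_dir):
    # Flag memory files untouched long enough to be compact/delete candidates.
    now = time.time()
    stale = []
    for path in sorted(Path(memory_dir).glob("*.md")):
        if path.name.endswith(".compact.md"):
            continue  # compacts are derived, not sources
        age_days = (now - path.stat().st_mtime) / 86400
        if age_days > STALE_DAYS:
            stale.append((path.name, int(age_days)))
    return stale

for name, age in staleness_report("memory"):
    print(f"{name}: {age} days stale")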

When This Workflow Won't Work for You

Three situations where I'd skip most of this:

  • Solo project under 1000 LOC: the overhead of plugins/memory/hooks exceeds the gain. Just chat with the model directly.
  • Strict compliance environment: if your repo can't have files outside the repo (no ~/.claude/), the plugin model breaks. Use a containerized setup or fall back to direct API.
  • Pair programming primary mode: if you're already doing pair programming with a human, adding an AI partner creates three-way conversation overhead. Pick one.

Methodology

Numbers in this article come from:

  • Six months of daily Claude Code use, October 2025 to May 2026
  • Token usage tracked via Anthropic Console (typical day: 800K-1.2M input, 50K-150K output)
  • Plugin install/uninstall log (kept in ~/.claude/install-history.md)
  • Pre/post hook latency measured manually with time wrapper
  • 12 repos: openaitoolshub.org, lowrisktradesmart.org, alphagaindaily.com, levelwalks.com, subsaver.click, pawaihub.com, aiplanthub.com, oilempire.click, aotrevolution.click, plus 3 internal tooling repos

I don't have access to anonymized aggregate data from other Claude Code users — these patterns are my own and may not generalize. If you have a workflow that survived 6+ months and want to compare notes, find me on the contact form.

FAQ

How many plugins should I install?

I run six. I've seen colleagues run zero (just the bare CLI) and twelve. Six is enough that something useful is always one slash command away; twelve creates discovery problems where I forget what I have. Start at three and add only when you're missing a specific capability twice.

Does the Plan Mode work without GSD?

Yes — Plan Mode is built-in (Shift+Tab toggles it). GSD adds the structured PLAN.md and verification loop. For projects under three phases, plain Plan Mode is enough. GSD pays off when you're juggling multiple concurrent phases or coming back to work after weeks away.

Why CDP at port 9222 specifically?

It's the standard remote-debugging port for headed Chrome. The browser-harness skills assume it. If you run Chrome via --remote-debugging-port=9222 once, every subsequent skill invocation can find it. Port collision becomes the main failure mode — see Pitfall 3.

Should I use auto-memory or project memory?

Both. Project memory (memory/ in your repo, indexed by CLAUDE.md) for anything specific to that codebase. Auto-memory (~/.claude/projects/.../memory/) for cross-session preferences and patterns. Don't mix. The day I tried to put project context in auto-memory was the day I started getting wrong-context hallucinations on adjacent projects.

Is Claude Code worth it over GitHub Copilot Teams?

For multi-repo work with > 100K LOC per repo, yes — the 1M context handles whole-codebase reasoning that Copilot can't. For small repos and inline completion, Copilot stays competitive. See our Teams comparison for the full breakdown.

Last updated May 2, 2026. Jim Liu is a developer based in Sydney, Australia, running openaitoolshub.org across the AI-tools and SEO niche. He uses Claude Code daily and pays for the 1M-context tier.
