MCP Agent Toolkit | Shubham Prajapati

The problem: three failure modes in every multi-agent system

After 18 months building production agent pipelines, the same three failures appear in almost every system:

Agents can't share state without coupling. Agent A finishes its subtask. Agent B needs the output. Without a shared store, you pass data through the orchestrator, hard-code direct calls between agents, or serialize state to a file and hope nothing races. All three approaches break as soon as you add a third agent or retry a failed step.
The same error fires again and again. An agent hits a JSONDecodeError. You spend an hour tracing it, add json_repair(), and move on. Three runs later, a different agent hits the same error. You trace it again. SCAR failure memory means the fix is written down once and found instantly next time.
Identical LLM requests hit the API twice. Evaluation loops, retry logic, and parallelized agents repeatedly issue the same prompt to the same model. Keying a cache on a SHA-256 hash of the request means the second identical call is served locally at no API cost.

Tool group 1: Blackboard shared state

The blackboard pattern decouples agents from each other. Agents write artifacts to a shared SQLite store keyed by run_id + agent + key. Any downstream agent reads by that same triple: it needs the writer's name and the artifact key, but it never calls the upstream agent directly and does not depend on when or how the artifact was produced.

// Researcher agent writes its findings
blackboard_write({
  run_id: "run-abc123",
  agent: "researcher",
  key: "research_brief",
  value: { topic: "...", sources: [...], summary: "..." }
})

// Coder agent reads the brief from the store, with no direct call to the researcher
blackboard_read({
  run_id: "run-abc123",
  agent: "researcher",
  key: "research_brief"
})

All three tools (blackboard_write, blackboard_read, blackboard_list) operate on an artifacts table in node:sqlite. Reads return the latest value for a given key, which makes retries safe: a failed agent can re-run and overwrite its artifact without corrupting downstream reads. blackboard_list lets an orchestrator inspect what any agent has produced before deciding which downstream agents to activate.
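The latest-value-wins behavior described above can be illustrated with a minimal sketch. It uses an in-memory Map as a stand-in for the SQLite artifacts table, and the camelCase function names and return shapes are assumptions made for illustration, not the toolkit's actual implementation.

```javascript
// In-memory stand-in for the artifacts table (assumption: the real tools persist to SQLite).
// Rows are addressed by the run_id + agent + key triple described above.
const artifacts = new Map();

const compositeKey = ({ run_id, agent, key }) => JSON.stringify([run_id, agent, key]);

function blackboardWrite({ run_id, agent, key, value }) {
  // Writing the same triple again replaces the earlier value (latest wins).
  artifacts.set(compositeKey({ run_id, agent, key }), { run_id, agent, key, value });
  return { ok: true }; // hypothetical return shape
}

function blackboardRead({ run_id, agent, key }) {
  const row = artifacts.get(compositeKey({ run_id, agent, key }));
  return row ? { found: true, value: row.value } : { found: false };
}

function blackboardList({ run_id }) {
  // Everything produced so far in one run, for an orchestrator to inspect.
  return [...artifacts.values()]
    .filter((row) => row.run_id === run_id)
    .map(({ agent, key }) => ({ agent, key }));
}

// A failed first attempt followed by a retry: readers only ever see the latest value.
const target = { run_id: "run-abc123", agent: "researcher", key: "research_brief" };
blackboardWrite({ ...target, value: { summary: "partial" } });
blackboardWrite({ ...target, value: { summary: "complete" } });

console.log(blackboardRead(target));
console.log(blackboardList({ run_id: "run-abc123" }));
```

Because a retry overwrites the same triple instead of appending a second row, a downstream reader never has to decide between a stale and a fresh artifact.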

Tool group 2: SCAR failure memory

SCAR stands for Situation → Cause → Action → Resolution. The two tools, scar_lookup and scar_record, implement a simple hash-addressed failure index:

// Before retrying a failed agent, check SCAR first
const known = scar_lookup({
  agent: "coder",
  error_type: "JSONDecodeError",
  context: failedOutputSnippet.slice(0, 200)
})
// { found: true, resolution: "wrap output in json_repair() before parsing" }

// After finding a fix, record it
scar_record({
  agent: "coder",
  error_type: "JSONDecodeError",
  context: failedOutputSnippet.slice(0, 200),
  resolution: "wrap output in json_repair() before parsing"
})

The hash key is SHA-256(agent + "::" + error_type + "::" + context.slice(0, 200)), so the same failure signature always resolves to the same DB row. The context field is optional: omit it to match the error class broadly across all contexts.
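As a rough sketch of how that key derivation could look in Node.js, the example below hashes the signature with node:crypto and stores resolutions in an in-memory Map. The Map, the camelCase function names, and the treatment of a missing context as an empty string are assumptions for illustration; the toolkit itself persists to SQLite.

```javascript
const { createHash } = require("node:crypto");

// Key derivation as described above:
// SHA-256(agent + "::" + error_type + "::" + first 200 characters of context).
function scarKey({ agent, error_type, context = "" }) {
  const signature = `${agent}::${error_type}::${context.slice(0, 200)}`;
  return createHash("sha256").update(signature).digest("hex");
}

// In-memory stand-in for the scars table (assumption for this sketch).
const scars = new Map();

function scarRecord({ agent, error_type, context, resolution }) {
  scars.set(scarKey({ agent, error_type, context }), resolution);
}

function scarLookup({ agent, error_type, context }) {
  const resolution = scars.get(scarKey({ agent, error_type, context }));
  return resolution === undefined ? { found: false } : { found: true, resolution };
}

const failure = {
  agent: "coder",
  error_type: "JSONDecodeError",
  context: "Expecting ',' delimiter: line 3 column 5",
};

const before = scarLookup(failure); // nothing recorded yet
scarRecord({ ...failure, resolution: "wrap output in json_repair() before parsing" });
const after = scarLookup(failure); // same signature, same key

console.log(before, after);
```

In this sketch a lookup without context only matches a record that was also stored without context, because both hash the same empty suffix. Whether the toolkit broadens the match in some other way is not something this example can show.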

Tool group 3: LLM response cache

The cache key is SHA-256(JSON.stringify({ messages, model })). Identical requests (same messages array and same model string) hit the cache instead of the API. This is useful in three specific scenarios: evaluation loops that re-score the same answers, parallel agents running overlapping subtasks, and development iteration where prompts stabilize before the system does.

// Check cache before calling LLM
const hit = cache_get({ messages, model: "claude-opus-4-8-20260528" })
if (hit.hit) return hit.response   // free

// After calling LLM, store response
cache_set({
  messages,
  model: "claude-opus-4-8-20260528",
  response: llmResponse,
  provider: "anthropic"
})
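To make the exact-match behavior concrete, here is a small sketch of the key computation using node:crypto. The cacheKey helper and the "example-model" string are placeholders introduced for this example, not names from the toolkit.

```javascript
const { createHash } = require("node:crypto");

// Key derivation as described above: SHA-256 over JSON.stringify({ messages, model }).
function cacheKey({ messages, model }) {
  return createHash("sha256").update(JSON.stringify({ messages, model })).digest("hex");
}

const model = "example-model"; // placeholder model string

const first = cacheKey({ messages: [{ role: "user", content: "Summarize the brief." }], model });
const repeat = cacheKey({ messages: [{ role: "user", content: "Summarize the brief." }], model });
// Same data, but the message object's properties are in a different order.
const reordered = cacheKey({ messages: [{ content: "Summarize the brief.", role: "user" }], model });
// One character of wording changed.
const reworded = cacheKey({ messages: [{ role: "user", content: "Summarise the brief." }], model });

console.log(first === repeat);    // true: identical request, identical key
console.log(first === reordered); // false: JSON.stringify preserves property order
console.log(first === reworded);  // false: any wording change produces a new key
```

One consequence worth noting: since JSON.stringify serializes properties in insertion order, callers that build message objects in different property orders will miss the cache even though the requests are logically the same. Constructing messages in one place avoids that.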

Why node:sqlite, not a separate DB process

Node.js ships node:sqlite from v22.5.0 onward: a synchronous SQLite binding with no native compilation, no Docker dependency, and no external process. The module is still marked experimental in the 22.x line, and early 22.x releases require the --experimental-sqlite flag. The toolkit creates a single data/toolkit.db file on first run. The three tables (artifacts, scars, llm_cache) are created with CREATE TABLE IF NOT EXISTS at startup. This means the server works identically in development and CI without any setup beyond npm install && npm start.
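The startup step can be sketched roughly as follows. The table names come from the description above, but every column name, the openToolkitDb helper, and the use of an in-memory database are assumptions for illustration; the actual schema may differ. The module is feature-detected so the sketch degrades cleanly on runtimes where node:sqlite is unavailable.

```javascript
// Feature-detect node:sqlite; it is missing on older Node.js releases.
let DatabaseSync = null;
try {
  ({ DatabaseSync } = require("node:sqlite"));
} catch {
  // Module unavailable on this runtime; openToolkitDb() will return null.
}

// Hypothetical schema: the column names are assumptions, not the toolkit's real layout.
const SCHEMA = `
  CREATE TABLE IF NOT EXISTS artifacts (
    run_id       TEXT NOT NULL,
    agent        TEXT NOT NULL,
    artifact_key TEXT NOT NULL,
    value        TEXT NOT NULL,
    PRIMARY KEY (run_id, agent, artifact_key)
  );
  CREATE TABLE IF NOT EXISTS scars (
    hash       TEXT PRIMARY KEY,
    resolution TEXT NOT NULL
  );
  CREATE TABLE IF NOT EXISTS llm_cache (
    hash     TEXT PRIMARY KEY,
    response TEXT NOT NULL
  );
`;

function openToolkitDb(path) {
  if (!DatabaseSync) return null;
  const db = new DatabaseSync(path);
  db.exec(SCHEMA); // safe to run on every startup: existing tables are left alone
  return db;
}

const db = openToolkitDb(":memory:");
if (db) {
  db.exec(SCHEMA); // a second run is a no-op, which is what makes startup idempotent
  const tables = db
    .prepare("SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")
    .all()
    .map((row) => row.name);
  console.log(tables); // lists the three table names
  db.close();
}
```

Passing a file path such as data/toolkit.db instead of ":memory:" would give the persistent single-file setup described above.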

How to install

git clone https://github.com/shubham0086/mcp-agent-toolkit
cd mcp-agent-toolkit
npm install
npm start         # server runs on stdio, ready for clients

Wire into Claude Desktop

Add this to claude_desktop_config.json (~/Library/Application Support/Claude/ on macOS, %APPDATA%\Claude\ on Windows) and restart Claude Desktop. After the restart, the seven tools appear in the tool picker.

{
  "mcpServers": {
    "agent-toolkit": {
      "command": "node",
      "args": ["/absolute/path/to/mcp-agent-toolkit/src/server.js"]
    }
  }
}

Run the tests

npm test

# 13 tests:
# tests/blackboard.test.js  — write/read isolation per run_id, list returns correct keys
# tests/scars.test.js       — lookup miss, record, lookup hit, hash consistency
# tests/cache.test.js       — miss on first call, hit on identical request, hash collision resistance

Where this fits

The blackboard, SCAR, and cache patterns originated in AgentKernel (equilibrium) — a six-engine runtime for production multi-agent systems. The standalone Agent-Scars and Agent-Recall repos implement the same patterns without the MCP layer. This toolkit is the MCP-native version: same patterns, any client, no integration code required.

Honest framing

The cache is hash-exact, not semantic. Two prompts with minor wording differences that would produce identical outputs get cached separately. For semantic deduplication, you'd need an embedding similarity check before the hash lookup; that's a reasonable extension but not in this version. The SCAR lookup is exact in the same way: the error signature must match on the first 200 characters of context, so errors with highly variable context strings (stack traces, dynamic data) may miss a recorded resolution even when the underlying cause is the same.