Naveo

B6 MCP-DEBUG

DIAGNOSE · 1 CAUSE

Look at the spec. What's the most likely reason the agent never picks this tool?

TOOL SPEC

{
  "name": "manage_user_account",
  "description": "Operations on ship user accounts. Useful for support cases: read state, update fields, suspend, delete.",
  "input_schema": {
    "type": "object",
    "properties": {
      "user_id": { "type": "string" },
      "action": {
        "type": "string",
        "enum": ["read", "update", "suspend", "delete"]
      },
      "fields": {
        "type": "object",
        "additionalProperties": true
      }
    },
    "required": ["user_id", "action"]
  }
}

POSSIBLE CAUSES · 5

The smallest tool that does its job

Having one big tool with an action enum feels efficient. One tool, a thousand use cases. Forge tried it. Hex took four minutes to show them why not.

The problem isn't aesthetic; it's structural. When you bundle four power levels into one tool, the model (or an attacker via prompt injection) picks which one to exercise at call time. Your authorization layer now has to:

1. Identify which action the model is requesting.
2. Decide if the session's user is authorized for that action.
3. Hope the model doesn't confuse similar actions.

Every one of those three steps can fail. And "can fail" is what Hex hunts to earn Atlas's signature.
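
As a rough illustration, the runtime check that the bundled tool forces on you might look like the sketch below. The role names and the ALLOWED_ACTIONS table are hypothetical, invented for this example; only the four action values come from the spec.

```python
# Hypothetical permission table (not from the lesson): role -> allowed actions.
ALLOWED_ACTIONS = {
    "support_l1": {"read", "update"},
    "moderator": {"read", "update", "suspend"},
    "admin": {"read", "update", "suspend", "delete"},
}

# The enum values from the manage_user_account spec.
KNOWN_ACTIONS = {"read", "update", "suspend", "delete"}


def authorize_bundled_call(role: str, tool_input: dict) -> bool:
    """Runtime check for one manage_user_account call."""
    # Step 1: identify the action. The value comes from model output,
    # so treat it as untrusted: it may be missing or malformed.
    action = tool_input.get("action")
    if not isinstance(action, str) or action not in KNOWN_ACTIONS:
        return False
    # Step 3 has no code at all: nothing here can detect that the model
    # meant "suspend" but emitted "delete".
    # Step 2: decide whether this session's role may perform the action.
    return action in ALLOWED_ACTIONS.get(role, set())
```

Every call passes through this function at runtime, and a mistake in the table or in the parsing quietly widens what the model can do.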

Least capability: one tool, one capability

The correct pattern:

read_user(user_id)            // read-only
update_user_fields(user_id, fields)  // controlled write
suspend_user(user_id)         // moderation action
delete_user(user_id)          // destructive, with dry-run

Four tools, four power levels. When you build a level-1 support assistant, you wire read_user and update_user_fields and nothing else. The model literally cannot attempt delete_user: the tool doesn't exist in its set.

Authorization is no longer a question about data ("is the action the model picked allowed?"); it's a question about wiring ("is this tool in the assistant's set?"). You do the wiring in code, at deploy time, under code review, which makes it far more auditable than the model's runtime decision.
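
A minimal sketch of that wiring, assuming a plain dictionary as the tool set and a hypothetical dispatch helper; the function bodies are stand-ins, and the dry_run parameter is only one guess at what "with dry-run" could mean:

```python
from typing import Callable

# Hypothetical stand-ins for the four single-capability tools.
def read_user(user_id: str) -> dict:
    return {"user_id": user_id, "status": "active"}

def update_user_fields(user_id: str, fields: dict) -> dict:
    return {"user_id": user_id, "updated": sorted(fields)}

def suspend_user(user_id: str) -> dict:
    return {"user_id": user_id, "status": "suspended"}

def delete_user(user_id: str, dry_run: bool = True) -> dict:
    # Assumed dry-run semantics: nothing is deleted unless dry_run is False.
    return {"user_id": user_id, "deleted": not dry_run}

# The wiring: a reviewable, deploy-time list of what this assistant can call.
LEVEL1_SUPPORT_TOOLS: dict[str, Callable[..., dict]] = {
    "read_user": read_user,
    "update_user_fields": update_user_fields,
}

def dispatch(tools: dict[str, Callable[..., dict]], name: str, arguments: dict) -> dict:
    """Run a tool call only if the tool is wired into this assistant's set."""
    if name not in tools:
        raise KeyError(f"unknown tool: {name}")
    return tools[name](**arguments)
```

With this set, dispatch(LEVEL1_SUPPORT_TOOLS, "delete_user", {"user_id": "u1"}) raises KeyError no matter what the model or an injected prompt asks for, and the diff that would change that is a one-line addition a reviewer can see.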

When grouping is OK

Grouping is fine when two operations are isomorphic in power and in error surface. list_users(filter) and count_users(filter) are basically the same operation with a different output format, so grouping them as query_users(filter, format) can make sense.

When they differ in power (read vs delete) or error surface (read my account vs read any account), don't group.
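
To make the acceptable case concrete, here is one possible shape for query_users. The in-memory USERS list and the equality-based filter are assumptions for illustration; the parameter names follow the lesson even though they shadow Python builtins.

```python
from typing import Union

# Hypothetical in-memory data, for illustration only.
USERS = [
    {"id": "u1", "plan": "free"},
    {"id": "u2", "plan": "pro"},
    {"id": "u3", "plan": "free"},
]

def query_users(filter: dict, format: str = "list") -> Union[list, int]:
    """Read-only query; `format` changes only the shape of the output."""
    # Assumed filter semantics: every key in `filter` must match exactly.
    matches = [u for u in USERS if all(u.get(k) == v for k, v in filter.items())]
    if format == "list":
        return matches
    if format == "count":
        return len(matches)
    raise ValueError(f"unsupported format: {format}")
```

Both branches read the same rows with the same power, so the format argument cannot escalate anything; that is what separates this grouping from an action enum that mixes read and delete.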

On the right: a tool with five supposed problems. Four are true-but-not-fatal. One is the one that would blow up the ship. Find it.