Having one big tool with an action enum feels efficient. One tool, a thousand use cases. Forge tried it. Hex took four minutes to show them why not.
The problem isn't aesthetic. it's structural. When you bundle four power levels into one tool, the model (or an attacker via injection) picks which one to exercise at call time. Your authorization layer now has to:
action the model is requesting.Every one of those three steps can fail. And "can fail" is what Hex hunts to earn Atlas's signature.
The correct pattern:
read_user(user_id) // read-only
update_user_fields(user_id, fields) // controlled write
suspend_user(user_id) // moderation action
delete_user(user_id) // destructive, with dry-runFour tools, four power levels. When you build a level-1 support assistant, you wire read_user and update_user_fields and nothing else. The model literally cannot attempt delete_user. the tool doesn't exist in its set.
Authorization is no longer a question about data ("is the
actionthe model picked allowed?"). it's a question about wiring ("is this tool in the assistant's set?"). The wiring you do, in code, on deploy, under code review. far more auditable than the model's runtime decision.
When two operations are isomorphic in power and in error surface. list_users(filter) and count_users(filter) are basically the same thing with different output format. grouping them as query_users(filter, format) can make sense.
When they differ in power (read vs delete) or error surface (read my account vs read any account), don't group.
On the right: a tool with five supposed problems. Four are true-but-not-fatal. One is the one that would blow the ship. Find it.