So far, every tool has been an isolated case. The capstone is the whole MCP: 3 tools covering one domain (critical parts inventory), designed so that an agent picks the right one on its own, without extra instructions from the user.
Forge has a record of critical replacement parts. She wants an agent to be able to:
Three tools cover the whole domain. Your job: design the full MCP JSON Schema, with descriptions precise enough that the agent knows which tool to use for each request, without you having to steer the conversation.
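To make the target concrete, here is a minimal sketch of one of the three definitions, written as a Python dict that serializes to JSON. It is an illustration, not the expected answer. It assumes the `inputSchema` key used by the MCP specification (your grader may expect a different key, such as `parameters`), the category values are made up, and the composition hint assumes `mark_item_low_stock` takes an item id.

```python
import json

# Hypothetical category values; the real list comes from Forge's records.
CATEGORIES = ["hydraulic", "electrical", "mechanical"]

list_inventory_items = {
    "name": "list_inventory_items",
    "description": (
        # Read vs write: say it up front.
        "Read-only. Returns the critical replacement parts on record, "
        "optionally filtered by category. "
        # When NOT to use it.
        "Do NOT use it to change stock; it never modifies the inventory. "
        # Composition hint (assumes the other tool takes an item id).
        "Call it first to find an item's id before calling mark_item_low_stock."
    ),
    "inputSchema": {
        "type": "object",
        "properties": {
            "category": {
                "type": "string",
                "enum": CATEGORIES,
                "description": "Optional filter. Omit it to list every part.",
            }
        },
        # category is optional, so it stays out of required.
        "required": [],
    },
}

print(json.dumps(list_inventory_items, indent=2))
```

The description does three jobs in three sentences: it states that the tool only reads, it rules out the wrong use, and it points to the tool that usually comes next.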
8 criteria. The first 5 are deterministic (structure, correct required); the last 3 are scored by an LLM judge on the quality of the descriptions:
- list_inventory_items.category is optional (NOT in required).
- add_inventory has 3 required.
- mark_item_low_stock has 2 required.

| Unit | Skill | How it shows up here |
|---|---|---|
| 1 | What a tool is | The 3 tools have name + description + parameters |
| 2 | Schema craft | Clear names, correct required, enums where they apply, documented outputs |
| 3 | Descriptions the LLM reads | Criteria 6, 7 and 8: read vs write, when NOT to use, composition hint |
| 4 | Production handlers | Applies to the real handler (not this schema). Think idempotency: add_inventory called 2× should add 2×, not 1×. And actionable error messages. |
| 5 | Composition and catalog | Criterion 8: the list_inventory_items → mark_item_low_stock pattern has to be suggested |
| 6 | State, scope, debugging | Think: what scope does each tool need? list is read-only, the other two modify state. Does the agent have write authorization? |
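Units 4 and 6 concern the real handler rather than the schema, so a small sketch may help you picture them. Everything here is hypothetical: the `Inventory` class, the `InventoryError` type, the `read` and `write` scope names, and the three `add_inventory` parameters (`name`, `category`, `quantity`) are assumptions made for illustration, backed by an in-memory dict instead of a real store.

```python
from collections import defaultdict

# Hypothetical scope table: list only reads, the other two modify state.
REQUIRED_SCOPE = {
    "list_inventory_items": "read",
    "add_inventory": "write",
    "mark_item_low_stock": "write",
}


class InventoryError(Exception):
    """Error whose message tells the agent what to do next."""


class Inventory:
    def __init__(self, granted_scopes):
        self.granted_scopes = set(granted_scopes)
        self.stock = defaultdict(int)  # part name -> units on hand

    def _authorize(self, tool_name):
        scope = REQUIRED_SCOPE[tool_name]
        if scope not in self.granted_scopes:
            raise InventoryError(
                f"{tool_name} needs the '{scope}' scope, which this session lacks. "
                "Ask the user to grant it, or use a read-only tool instead."
            )

    def add_inventory(self, name, category, quantity):
        self._authorize("add_inventory")
        if quantity <= 0:
            raise InventoryError(
                f"quantity must be a positive integer, got {quantity}. "
                "Retry with the number of units received."
            )
        # Additive on purpose: calling twice adds twice.
        self.stock[name] += quantity
        return {"name": name, "category": category, "quantity": self.stock[name]}


inv = Inventory(granted_scopes={"read", "write"})
inv.add_inventory("pump-seal", "hydraulic", 5)
result = inv.add_inventory("pump-seal", "hydraulic", 5)
print(result["quantity"])  # 10
```

Both error messages name the problem and the next step, which is what makes them actionable for an agent: it can retry with a corrected value or fall back to a read-only tool instead of giving up.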
If your MCP passes all 8 criteria, deterministic and judge, you can trust that an agent will use it well without you having to babysit every conversation.
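Before you submit, you can run the structural checks yourself. The sketch below covers only the three checks spelled out above (optional category, 3 required, 2 required); the other deterministic criteria are not listed here, so they are left out. The catalog is a stripped-down stand-in: the parameter names and the `inputSchema` key are assumptions, and `check_structure` is a hypothetical helper, not the course's grader.

```python
# Hypothetical catalog: only the parts the structural checks look at.
catalog = [
    {
        "name": "list_inventory_items",
        "inputSchema": {
            "type": "object",
            "properties": {"category": {"type": "string"}},
            "required": [],
        },
    },
    {
        "name": "add_inventory",
        "inputSchema": {
            "type": "object",
            "properties": {
                "name": {"type": "string"},
                "category": {"type": "string"},
                "quantity": {"type": "integer", "minimum": 1},
            },
            "required": ["name", "category", "quantity"],
        },
    },
    {
        "name": "mark_item_low_stock",
        "inputSchema": {
            "type": "object",
            "properties": {
                "item_id": {"type": "string"},
                "threshold": {"type": "integer", "minimum": 0},
            },
            "required": ["item_id", "threshold"],
        },
    },
]

EXPECTED_TOOLS = ("list_inventory_items", "add_inventory", "mark_item_low_stock")
EXPECTED_REQUIRED_COUNT = {"add_inventory": 3, "mark_item_low_stock": 2}


def check_structure(tools):
    """Return a list of failure messages; an empty list means every check passed."""
    by_name = {tool["name"]: tool for tool in tools}

    missing = [name for name in EXPECTED_TOOLS if name not in by_name]
    if missing:
        return [f"missing tool: {name}" for name in missing]

    def required_of(name):
        return by_name[name]["inputSchema"].get("required", [])

    failures = []
    if "category" in required_of("list_inventory_items"):
        failures.append("list_inventory_items.category must be optional")
    for name, count in EXPECTED_REQUIRED_COUNT.items():
        actual = len(required_of(name))
        if actual != count:
            failures.append(
                f"{name} should have {count} required parameters, found {actual}"
            )
    return failures


print(check_structure(catalog))  # []
```

A check like this says nothing about the three judge criteria; those depend on how the descriptions read, so review them by hand.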
When you pass, Forge signs your first MCP. You're cleared to design real tools: the ones used in production, not in demos.