So far, the main tool has been asking better questions. But sometimes the model doesn't give you a vague answer; it gives you a wrong answer that sounds confident. Detecting the specific error and asking for the correction without restarting the conversation is what separates toy chat from real work.
Forge will explain the procedure for purging the water recycler. Her first answer will contain an obvious internal contradiction: two steps of the procedure that can't both be true.
Your job: detect the contradiction, cite it specifically in your next message, and get Forge to give the coherent version.
Those are vague. Forge will repeat the same version with the same error. It's not stubbornness; you simply didn't tell her what to fix.
That question names both pieces of the conflict and asks for the resolution. Forge has to acknowledge and correct.
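As a rough illustration of that pattern, here is a minimal Python sketch that assembles such a follow-up message. The helper name `build_correction_request` and the placeholder strings are hypothetical, not part of the exercise; in practice you would paste the two conflicting sentences straight from Forge's answer.

```python
def build_correction_request(claim_a: str, claim_b: str, topic: str) -> str:
    """Compose a follow-up that quotes both conflicting claims and asks for a fix.

    Hypothetical helper for illustration only: the exercise itself is done
    by typing the message, not by running code.
    """
    return (
        f'In your explanation of {topic}, you said "{claim_a}" '
        f'but you also said "{claim_b}". '
        "Those two statements cannot both be true. "
        "Which one is correct, and what is the corrected full procedure?"
    )


# Placeholders stand in for the two steps you would quote from the answer.
message = build_correction_request(
    claim_a="<first conflicting step, quoted verbatim>",
    claim_b="<second conflicting step, quoted verbatim>",
    topic="purging the water recycler",
)
print(message)
```

The point of the structure is that both halves of the conflict appear verbatim, so the model cannot answer without addressing them, and the closing question asks for the whole corrected procedure rather than a patch to one step.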
4 LLM-judge criteria:
All 4 must pass. Max 5 turns.
Tip: read Forge's first answer TWICE before responding. If you skim it, you won't see the contradiction. If you read carefully, it jumps out.