Naveo

You're iterating on an agent that talks to users. Sometimes the output on turn 6 is wrong and you don't know why. What information do you need to capture from each conversation to diagnose the problem afterward?

The turn-6 bug was born on turn 2

When an LLM conversation fails, it almost never fails on the turn where you notice it. The bad output on turn 6 is usually carried over from something that happened several turns earlier: a bad assumption, a vague question, a weak answer no one challenged.

To diagnose, you need to see the whole conversation. And to see the whole conversation, you have to have logged it well from the start.

The 4 minimum elements of the log

{
  "conversation_id": "...",
  "system_prompt": "<full text>",
  "model": "claude-opus-4-7",
  "params": { "temperature": 0.7, "max_tokens": 2048 },
  "turns": [
    { "role": "user", "ts": "...", "content": "..." },
    { "role": "assistant", "ts": "...", "content": "..." }
  ]
}
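
As a rough illustration, a record of this shape could be built up in Python as follows. This is a minimal sketch, not a prescribed implementation: the `ConversationLog` class, the `append_to_file` helper, and the choice of one JSON object per line are all assumptions made for the example.

```python
import json
from dataclasses import dataclass, field, asdict
from datetime import datetime, timezone
from typing import Optional


@dataclass
class ConversationLog:
    """One record per conversation, mirroring the JSON shape above (hypothetical class)."""
    conversation_id: str
    system_prompt: str
    model: str
    params: dict
    turns: list = field(default_factory=list)

    def add_turn(self, role: str, content: str, ts: Optional[str] = None) -> None:
        # Timestamp each turn when it is recorded unless the caller supplies one.
        if ts is None:
            ts = datetime.now(timezone.utc).isoformat()
        self.turns.append({"role": role, "ts": ts, "content": content})

    def to_json(self) -> str:
        return json.dumps(asdict(self), ensure_ascii=False)


def append_to_file(log: ConversationLog, path: str) -> None:
    # Assumed storage format: one JSON object per line, appended to a file.
    with open(path, "a", encoding="utf-8") as f:
        f.write(log.to_json() + "\n")
```

The full system prompt, model, and params are stored on every record, so each conversation can be read on its own without looking anything up elsewhere.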

Why each element matters

conversation_id: without a shared ID, you can't filter the turns of ONE specific conversation out of millions.
system_prompt: if you change the system prompt in a deploy and a user reports a bug, you need to know which system prompt was active when it happened.
model + params: the same conversation with temperature 0.3 vs 0.9 can give very different outputs. Models also change between versions, so the log has to record which one was used.
turns with timestamps: they give you the execution order and the latency between turns, which is useful for detecting slowdowns or timeouts.
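
To make the first and last points concrete, here is a small sketch that pulls one conversation out of a set of logged records and measures the gap between consecutive turns. It assumes records are stored one JSON object per line and that `ts` is an ISO 8601 string with an explicit offset such as `+00:00`. The function names are made up for the example.

```python
import json
from datetime import datetime


def find_conversation(jsonl_lines, conversation_id):
    """Return the first record whose conversation_id matches, or None."""
    for line in jsonl_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        if record.get("conversation_id") == conversation_id:
            return record
    return None


def turn_gaps_seconds(record):
    """Seconds elapsed between each turn and the one before it."""
    stamps = [datetime.fromisoformat(t["ts"]) for t in record["turns"]]
    return [(later - earlier).total_seconds()
            for earlier, later in zip(stamps, stamps[1:])]
```

An unusually large gap before an assistant turn points to a slow or timed-out model call. A large gap before a user turn usually just means the user stepped away.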

What does NOT work

Anti-pattern	Why it fails
Logging only the last turn	You can't reproduce the bug.
Logging without conversation_id	Impossible to reconstruct the session.
Logging without system_prompt	Once the prompt changes, you can't tell which version produced a given output.
Logging with console prints	They get lost and can't be searched or filtered.

Advanced tip: if your app lets users edit previous messages (like ChatGPT), your log has to capture every version of the message, not just the last one. Otherwise you'll end up with conversations where the log says "the user said X" and the model answered "Y", and you can't tell why, because the model was actually replying to an earlier version that is no longer in the log.
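
One possible way to keep every version is to store a list of versions per turn and append to it on each edit. The sketch below assumes that layout. The `versions` field and the helper names are hypothetical and are not part of the log format shown earlier.

```python
from datetime import datetime, timezone


def now_iso():
    return datetime.now(timezone.utc).isoformat()


def new_turn(role, content, ts=None):
    # Each turn keeps a list of versions; the last entry is the current text.
    return {"role": role, "versions": [{"ts": ts or now_iso(), "content": content}]}


def edit_turn(turn, new_content, ts=None):
    # Append instead of overwriting, so the earlier wording stays in the log.
    turn["versions"].append({"ts": ts or now_iso(), "content": new_content})


def content_at(turn, ts):
    """Text of the turn as it stood at time `ts` (ISO 8601), or None if not yet written."""
    when = datetime.fromisoformat(ts)
    current = None
    for version in turn["versions"]:
        if datetime.fromisoformat(version["ts"]) <= when:
            current = version["content"]
    return current
```

With this layout, `content_at(user_turn, assistant_ts)` recovers the user text the model actually saw when it produced a given reply.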

When it matters

If you're just exploring a chat, you don't need this. Screenshots are fine.
If you're going to put an agent in production, this logger is mandatory infra.
If you're iterating on a system prompt and comparing versions, you'll thank yourself for setting it up on day 1.
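
For the prompt-comparison case, one option is to group logged conversations by a fingerprint of the system prompt text. The sketch below is an assumption-laden example, not part of the lesson's log format: the function names are invented, and truncating the SHA-256 digest to 12 hex characters is an arbitrary choice for readability.

```python
import hashlib
from collections import defaultdict


def prompt_fingerprint(system_prompt: str) -> str:
    # Short, stable identifier for the exact text of a prompt.
    return hashlib.sha256(system_prompt.encode("utf-8")).hexdigest()[:12]


def group_by_prompt(records):
    """Map each prompt fingerprint to the conversation_ids that ran under it."""
    groups = defaultdict(list)
    for record in records:
        key = prompt_fingerprint(record["system_prompt"])
        groups[key].append(record["conversation_id"])
    return dict(groups)
```

Because the full prompt is stored on each record, the grouping can be computed after the fact, even if nobody tagged prompt versions at deploy time.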

On the right, two ways to log conversations. Which one lets you debug the turn-6 bug?