Results
Learn how to read test results, inspect conversations, and understand assertion outcomes.
Understanding Test Results

The test result dialog
Click any test result to open the detail dialog. It has two main sections:
- Left side - Test results. Shows which assertions passed and which failed, along with the test instruction and score.
- Right side - The conversation transcript. Shows the back-and-forth between the AI tester (simulating a user) and your agent.

Reading the conversation
The conversation transcript shows every message exchanged during the test.
- Voxli messages appear on one side - these are what the simulated user said to your agent.
- Agent messages appear on the other side - these are your agent’s responses.
- Tool calls appear inline in the conversation. If your agent called a tool or API during the conversation, you see it here.
Hover over a tool call to see the arguments that were passed and the return value. This helps you verify that your agent is calling the right tools with the right data.
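As a mental model for what to look for when you hover over a tool call, the sketch below represents a tool call as a name, its arguments, and its return value, then compares it against what you expected. The `ToolCall` class, the `find_mismatches` helper, and the `lookup_order` tool name are hypothetical illustrations, not part of Voxli's API or export format.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class ToolCall:
    """Hypothetical record of one tool call shown in the transcript."""
    name: str
    arguments: dict[str, Any]
    return_value: Any


def find_mismatches(call: ToolCall, expected_name: str,
                    expected_args: dict[str, Any]) -> list[str]:
    """Return a list of human-readable differences (empty if none)."""
    problems = []
    if call.name != expected_name:
        problems.append(f"expected tool {expected_name!r}, got {call.name!r}")
    for key, want in expected_args.items():
        got = call.arguments.get(key)
        if got != want:
            problems.append(f"argument {key!r}: expected {want!r}, got {got!r}")
    return problems


# Example: the agent looked up the wrong order ID.
call = ToolCall("lookup_order", {"order_id": "A-100"}, {"status": "shipped"})
print(find_mismatches(call, "lookup_order", {"order_id": "A-101"}))
```

An empty list means the tool name and every expected argument matched.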

To register tool calls from your agent, see Tools and Events in the developer docs.

Assertion results
Each assertion shows:
- Criteria - the check that was evaluated.
- Pass or fail - indicated by a checkmark or cross.
- Severity - blocker, medium, or low, shown as a colored indicator.
- Explanation - a description of why the assertion passed or failed.
Click a failed assertion to see the explanation. The conversation on the right filters down to show only the messages relevant to that failure, making it easy to find the problem.

The score at the top summarizes all assertion outcomes as a weighted percentage. See Assertions for the full scoring formula.
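To illustrate what a weighted percentage means in general, here is a minimal sketch. The severity weights below are assumptions chosen for the example and are not Voxli's actual values; the real formula is the one documented under Assertions. The `AssertionResult` class and `weighted_score` function are likewise hypothetical names.

```python
from dataclasses import dataclass

# ASSUMED weights for illustration only; not Voxli's real values.
ASSUMED_WEIGHTS = {"blocker": 3.0, "medium": 2.0, "low": 1.0}


@dataclass(frozen=True)
class AssertionResult:
    """Hypothetical shape of one assertion outcome."""
    criteria: str
    passed: bool
    severity: str  # "blocker", "medium", or "low"
    explanation: str = ""


def weighted_score(results, weights=ASSUMED_WEIGHTS):
    """Percentage of total severity weight carried by passing assertions."""
    total = sum(weights[r.severity] for r in results)
    if total == 0:
        return 0.0  # no assertions: nothing to score
    earned = sum(weights[r.severity] for r in results if r.passed)
    return 100.0 * earned / total


results = [
    AssertionResult("Greets the user", True, "low"),
    AssertionResult("Confirms the order ID", False, "medium"),
    AssertionResult("Never reveals another customer's data", True, "blocker"),
]
print(round(weighted_score(results), 1))  # 66.7 under the assumed weights
```

Under these assumed weights, a failed blocker lowers the score more than a failed low-severity assertion, which is the general idea behind weighting by severity.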
Results are frozen in time
Test results capture a snapshot of the test at the time it ran. Even if you later change the test instruction or assertions, the result retains the original state. You can always go back and see exactly what was tested and how the agent responded.

Navigating between results
Use the navigation arrows in the result dialog to move between test results within the same run. Each result is independent - one test passing or failing has no effect on the others.