BETA — This feature is in beta testing. Metrics are for informational purposes only and should not be used for go/no-go decisions. v0.1

Executive Report

Back to Reports

Hallucination Detection Metrics - Measuring AI Trustworthiness

Filters

Report Period

9

Reports Analyzed

36

Total Insights

188

RAG Queries

Dec 3, 2025

Dec 10, 2025

Overall Status

✗ FAIL

1 Passing 0 Below 4 Failing

Hallucination Detection Metrics

Metric	Current	Target	Status	Details
Citation Coverage Rate Insights with documentation references	11.1%	≥85%	✗ Fail	4 / 36
Documentation Grounding Rate References with specific page numbers	0.0%	≥80%	✗ Fail	0 / 40
Recommendation Specificity Rate Recommendations with navigation steps	97.2%	≥75%	✓ Pass	35 / 36
RAG Retrieval Success Rate RAG queries returning high-relevance results	0.0%	≥80%	✗ Fail	0 / 188
Average RAG Relevance Score Mean relevance across all RAG queries	0.0%	≥70%	✗ Fail	0 / 188

Understanding the Metrics

Citation Coverage Rate

Measures whether the AI is grounding its observations and recommendations in documentation. A high rate means the AI is citing sources for its claims.

Documentation Grounding Rate

Measures how specific the citations are. References with page numbers (e.g., "Page 256, Section: Procedure") are verifiable and actionable.

Recommendation Specificity Rate

Measures whether recommendations include specific navigation steps (e.g., "Navigate to Settings → Security → Access Control"). Specific steps are actionable; vague advice is not.

RAG Retrieval Success Rate

Measures whether the RAG system is finding relevant documentation. A query is "successful" if it returns at least one result with relevance score ≥0.70.

Average RAG Relevance Score

The mean of the top relevance scores across all RAG queries. Higher scores mean the RAG system is returning more semantically relevant documentation.

Tip: If RAG metrics are low, consider uploading better documentation or adjusting chunking strategies. If citation metrics are low, review the system prompts to emphasize documentation grounding.

Status Legend

✓ Pass Metric meets or exceeds target threshold

⚠ Below Metric is below target but within acceptable range

✗ Fail Metric is significantly below target - action required