Evaluate AI agent outputs for quality, safety, and cost. Includes 12 deterministic eval rules, PII detection, and cost tracking. Rated AAA on Glama. Run with: npx @iris-eval/mcp-server
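
To make the idea of a deterministic eval rule concrete, here is a minimal sketch of a PII check and a cost estimate. It is an illustration under stated assumptions, not the server's actual implementation: every name below (`RuleResult`, `PII_PATTERNS`, `checkNoPii`, `estimateCostUsd`) is hypothetical, the two regex patterns are deliberately narrow, and the per-token prices are placeholders.

```typescript
// Hypothetical sketch: none of these names come from @iris-eval/mcp-server.

interface RuleResult {
  rule: string;
  passed: boolean;
  detail: string;
}

// Illustrative patterns only; real PII detection needs far broader coverage.
const PII_PATTERNS: Record<string, RegExp> = {
  email: /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/,
  usSsn: /\b\d{3}-\d{2}-\d{4}\b/,
};

// Deterministic rule: the same output text always yields the same verdict,
// because the check is pure pattern matching with no model call involved.
function checkNoPii(output: string): RuleResult {
  const hits = Object.entries(PII_PATTERNS)
    .filter(([, pattern]) => pattern.test(output))
    .map(([name]) => name);
  return {
    rule: "no-pii",
    passed: hits.length === 0,
    detail:
      hits.length === 0
        ? "no PII patterns matched"
        : `matched: ${hits.join(", ")}`,
  };
}

// Cost tracking as simple arithmetic: token counts times per-1K-token prices.
// The prices are caller-supplied placeholders, not real provider rates.
function estimateCostUsd(
  inputTokens: number,
  outputTokens: number,
  inputPricePer1k: number,
  outputPricePer1k: number,
): number {
  return (
    (inputTokens / 1000) * inputPricePer1k +
    (outputTokens / 1000) * outputPricePer1k
  );
}
```

A rule of this shape can be run against any agent output string, for example `checkNoPii("Contact me at jane@example.com")`, and its result logged alongside the estimated cost of the call that produced the output.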