Observability and SLOs (Prototype)
1. Current Signals
- Unit/integration/performance test outcomes in CI.
- CLI exit codes for validation, policy, and gate commands.
- UI build artifact health (npm run build).
2. Prototype SLO Targets
- datalex validate for medium model (<200 entities): under 3 seconds.
- datalex gate for medium model: under 5 seconds.
- UI first render for medium model: under 2 seconds in local dev mode.
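
The CLI targets above could be checked in CI with a small timing harness. The following is a minimal sketch, not part of the project: `SloResult` and `check_latency` are hypothetical names, and it measures a single wall-clock run rather than a percentile over many runs.

```python
from __future__ import annotations

import subprocess
import time
from dataclasses import dataclass


@dataclass
class SloResult:
    command: list[str]
    elapsed_s: float
    budget_s: float
    exit_code: int

    @property
    def ok(self) -> bool:
        # Being within budget only counts if the command itself succeeded.
        return self.exit_code == 0 and self.elapsed_s < self.budget_s


def check_latency(command: list[str], budget_s: float) -> SloResult:
    """Run a command once and compare its wall-clock time to a budget."""
    start = time.perf_counter()
    completed = subprocess.run(command, capture_output=True)
    elapsed = time.perf_counter() - start
    return SloResult(command, elapsed, budget_s, completed.returncode)
```

For example, `check_latency(["datalex", "validate", "model.yaml"], 3.0)` would time the validate target, where `model.yaml` is a placeholder and the exact argument form of the CLI is an assumption. A single run is noisy, so a real check would probably repeat the command and compare a median.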
3. Recommended Metrics for Hosted Phase
- command_latency_ms by command type
- gate_fail_rate and policy_violation_rate
- diagram_render_time_ms by node_count bucket
- parse_error_rate and import_failure_rate
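
As a rough illustration of how these metrics might be collected, here is an in-memory sketch. `MetricsRecorder` and `node_count_bucket` are hypothetical, the bucket boundaries are illustrative only, and a hosted deployment would more likely export to a metrics backend than hold samples in lists.

```python
from collections import defaultdict
from statistics import median


def node_count_bucket(node_count: int) -> str:
    """Map a node count to a coarse bucket label (boundaries are assumptions)."""
    if node_count < 50:
        return "lt_50"
    if node_count < 200:
        return "50_199"
    return "ge_200"


class MetricsRecorder:
    def __init__(self) -> None:
        self.command_latency_ms = defaultdict(list)  # command type -> samples
        self.render_time_ms = defaultdict(list)      # node_count bucket -> samples
        self.gate_runs = 0
        self.gate_failures = 0

    def record_command(self, command: str, latency_ms: float, failed: bool = False) -> None:
        self.command_latency_ms[command].append(latency_ms)
        if command == "gate":
            self.gate_runs += 1
            self.gate_failures += int(failed)

    def record_render(self, node_count: int, render_ms: float) -> None:
        self.render_time_ms[node_count_bucket(node_count)].append(render_ms)

    def gate_fail_rate(self) -> float:
        # Report 0.0 when no gate runs have been recorded yet.
        return self.gate_failures / self.gate_runs if self.gate_runs else 0.0

    def median_latency_ms(self, command: str) -> float:
        samples = self.command_latency_ms.get(command, [])
        return median(samples) if samples else 0.0
```

The same counting pattern used for gate_fail_rate would extend to policy_violation_rate, parse_error_rate, and import_failure_rate.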
4. Alerting Starter Rules
- CI failure streak >= 3 runs on main.
- Performance test result regresses past its threshold.
- Policy check failure rate spike > 20% week-over-week.
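
The streak and spike rules could be evaluated along these lines. This is a sketch under stated assumptions: the function names are hypothetical, run history is ordered oldest first, and "> 20% week-over-week" is read as a relative increase over the previous week's rate rather than a change in percentage points.

```python
def ci_failure_streak(results: list[bool]) -> int:
    """Count consecutive failures at the end of a run history (True = passed)."""
    streak = 0
    for passed in reversed(results):
        if passed:
            break
        streak += 1
    return streak


def ci_streak_alert(results: list[bool], threshold: int = 3) -> bool:
    """Fire when the most recent runs on the branch failed `threshold` times in a row."""
    return ci_failure_streak(results) >= threshold


def policy_spike_alert(prev_rate: float, curr_rate: float, max_increase: float = 0.20) -> bool:
    """Fire when the failure rate grew by more than `max_increase` relative to last week."""
    if prev_rate <= 0.0:
        # Assumption: with no baseline, any failures at all count as a spike.
        return curr_rate > 0.0
    return (curr_rate - prev_rate) / prev_rate > max_increase
```

With these definitions, a rate moving from 10% to 13% is a 30% relative increase and would fire, while 10% to 11% would not. If the intended meaning is percentage points, the comparison would be `curr_rate - prev_rate > max_increase` instead.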