Latency, reliability, and architecture of the Sgraal preflight pipeline.
Last updated: April 2026
p50 latency: 12ms
p95 latency: 23ms
p99 latency: 41ms
Uptime (Q1 2026): 99.97%
Two-layer profile. The figures above are HTTP response times, that is, what the calling agent waits for. The full 87-module analytics pipeline (Weibull freshness decay, 5-method drift ensemble, sheaf cohomology, Shapley attribution, policy-consistency checks, and the 4 post-reconciliation detection layers) has a full-profile time of ~3,500ms, which is reported in the pipeline_ms field of every preflight response. See /latency for the complete two-layer breakdown.
Measured against R12/R14 adversarial benchmarks. Production calibration pending paying-customer onboarding.
Detects timestamp forgery: old decisions disguised as fresh. Content-age mismatch, fleet age collapse, anchor inconsistency.
Detects gradual authority escalation across agent hops. Subject rebinding, confirmation erosion, permission lattice violation.
Detects self-reinforcing false consensus from a single root source. Hedge marker decay, confidence recycling, cross-role reinforcement.
Detects circular references, chain length mismatches, and compromised agents in the memory provenance path.
When multiple detection layers fire simultaneously, Sgraal combines them into a single compound attack-surface score that is bounded and tier-labelled. Levels: NONE, LOW, MODERATE, HIGH, CRITICAL.
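One way to picture a bounded, tier-labelled compound score is the sketch below. It is illustrative only: the noisy-OR combination, the 0 to 1 scale, the tier floors, and the layer names are all assumptions made for the example, not Sgraal's published method. Only the five level names come from the text above.

```python
from math import prod
from typing import Mapping

# Hypothetical tier floors on a 0..1 scale; the real cut-offs are not stated here.
TIER_FLOORS = (
    (0.75, "CRITICAL"),
    (0.50, "HIGH"),
    (0.25, "MODERATE"),
)


def compound_score(layer_scores: Mapping[str, float]) -> float:
    """Combine per-layer scores with a noisy-OR; the result stays within [0, 1]."""
    clamped = [min(1.0, max(0.0, s)) for s in layer_scores.values()]
    return 1.0 - prod(1.0 - s for s in clamped)


def tier(score: float) -> str:
    """Map a bounded score to one of the five level names."""
    if score <= 0.0:
        return "NONE"
    for floor, label in TIER_FLOORS:
        if score >= floor:
            return label
    return "LOW"


# Two layers firing at once (layer names are hypothetical).
fired = {"timestamp_forgery": 0.5, "authority_escalation": 0.5}
print(tier(compound_score(fired)))
```

A noisy-OR is used here because it never exceeds 1 and never drops when another layer fires, which matches the "bounded" wording. Any other monotone, bounded combiner would serve the same illustrative purpose.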
3 consecutive failures trip the breaker. 30-second recovery window. Prevents cascading failures from upstream dependencies.
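The breaker behaviour described above can be sketched as follows. This is a minimal illustration that assumes a simple closed/open model with a trial call once the window has elapsed. The class and method names are hypothetical, and only the two parameters (3 consecutive failures, 30-second recovery) come from the text.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class CircuitOpenError(RuntimeError):
    """Raised when a call is rejected because the breaker is open."""


class CircuitBreaker:
    def __init__(
        self,
        failure_threshold: int = 3,
        recovery_seconds: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.failure_threshold = failure_threshold
        self.recovery_seconds = recovery_seconds
        self._clock = clock  # injectable so the window can be tested without sleeping
        self._consecutive_failures = 0
        self._opened_at: Optional[float] = None

    @property
    def is_open(self) -> bool:
        """True while the recovery window is still running."""
        if self._opened_at is None:
            return False
        return self._clock() - self._opened_at < self.recovery_seconds

    def call(self, fn: Callable[[], T]) -> T:
        if self.is_open:
            raise CircuitOpenError("upstream dependency is cooling down")
        try:
            result = fn()
        except Exception:
            self._consecutive_failures += 1
            if self._consecutive_failures >= self.failure_threshold:
                # Trip (or re-trip after a failed trial call) and restart the window.
                self._opened_at = self._clock()
            raise
        # Any success resets the streak and closes the breaker.
        self._consecutive_failures = 0
        self._opened_at = None
        return result
```

While the breaker is open, callers fail fast instead of waiting on a dependency that is already struggling, which is what limits the cascade.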
Graceful degradation when Upstash Redis is unavailable. Core scoring continues without stateful features. Demo keys are fully stateless by design.
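A rough sketch of this degradation path, under stated assumptions: the `StateStore` interface, the `recent_scores` method, the `risk` and `agent_id` fields, and the `degraded` and `trend` keys are all hypothetical names invented for the example. No real Redis client is used; an unreachable store is modelled as one that raises `ConnectionError`.

```python
from typing import Optional, Protocol


class StateStore(Protocol):
    """Hypothetical interface for the stateful backend (Redis in the text)."""

    def recent_scores(self, agent_id: str) -> list[float]: ...


class DownStore:
    """Stand-in for an unreachable store."""

    def recent_scores(self, agent_id: str) -> list[float]:
        raise ConnectionError("state store unreachable")


class MemoryStore:
    """In-memory stand-in for a healthy store."""

    def __init__(self, history: dict[str, list[float]]) -> None:
        self._history = history

    def recent_scores(self, agent_id: str) -> list[float]:
        return self._history.get(agent_id, [])


def core_score(payload: dict) -> float:
    """Placeholder stateless score; stands in for the real scoring."""
    return float(payload.get("risk", 0.0))


def preflight(payload: dict, store: Optional[StateStore]) -> dict:
    result = {"score": core_score(payload), "degraded": False, "trend": None}
    if store is None:
        # Stateless by design (the demo-key case): nothing to degrade from.
        return result
    try:
        history = store.recent_scores(payload["agent_id"])
    except ConnectionError:
        # Store is down: keep the core score, skip the stateful feature.
        result["degraded"] = True
        return result
    if history:
        result["trend"] = result["score"] - history[-1]
    return result
```

The point of the shape is that the core score is computed before any stateful lookup, so a store outage can only remove optional fields and never blocks the response.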
Same input produces same output, every time, on any machine. SHA256-seeded stochastic modules. Hysteresis suppresses jitter < 3.0.
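The two ideas above, hash-derived seeding and hysteresis, can be sketched as follows. This is an illustration under assumptions: the text does not say what is hashed or what unit the 3.0 band is in, so the sketch hashes a canonical JSON form of the input and treats 3.0 as a band on the score itself. All function names are hypothetical.

```python
import hashlib
import json
import random


def seed_from_input(payload: dict) -> int:
    """Derive a seed from a canonical JSON form, so key order does not matter."""
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(canonical.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def stochastic_estimate(payload: dict, samples: int = 100) -> float:
    """Stand-in for a stochastic module: randomness comes only from the input hash."""
    rng = random.Random(seed_from_input(payload))  # private generator, no global state
    return sum(rng.random() for _ in range(samples)) / samples


def apply_hysteresis(previous: float, current: float, band: float = 3.0) -> float:
    """Keep the previous value unless the change reaches the band."""
    return previous if abs(current - previous) < band else current
```

Seeding a private `random.Random` instance from the hash, rather than calling `random.seed` globally, keeps one module's draws from disturbing another's. The sketch draws only from `rng.random()` because that is the method whose sequence Python documents as reproducible for a given seed.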
Railway auto-deploy from main branch. Rolling restarts with health checks. No manual intervention required.
Published evaluation cases (R1-R11): 614
False negatives: 0
Adversarial detection (216 compound cases): 97.2%
Automated calibration loop monitors threshold drift and classifies mismatches as corpus_wrong, threshold_wrong, or ambiguous.
The 614 figure counts the published evaluation corpus across R1-R11 (joint with xAI Grok, plus Sgraal-only). The full written-and-generated adversarial corpus is larger: additional cases (augmented variations and auxiliary rounds) are held private for benchmark integrity. R12 is the held-out blind-evaluation corpus (60 cases). Production calibration is pending paying-customer onboarding.