Whitepaper — Sgraal

Latency Profile

p50 single-request compute: ~170 ms

p95 single-request compute: ~250 ms

p99 single-request compute: ~395 ms

Uptime SLA target: 99.9%

Single-request compute. The figures above are the deterministic preflight gate's single-request compute time for the full 85-module decision path (Weibull freshness decay, 5-method drift ensemble, sheaf cohomology, Shapley attribution, policy-consistency checks, and the 4 post-reconciliation detection layers), read from the pipeline_ms field of each preflight response. At this resolution the fast path (detection short-circuit) and the full-decision path are indistinguishable: the short-circuit gives no measurable speedup. Network round-trip time is client-dependent and adds to these figures.

Measured on production (Railway EU-West) via the server-side pipeline_ms timer, using a demo-key dry run (excludes Redis/persistence), warmed, N=50 per path. Throughput under load has not yet been benchmarked; the service runs single-process and is expected to scale with the number of workers. Production calibration pending paying-customer onboarding.
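As a rough illustration of how p50/p95/p99 figures can be derived from a batch of pipeline_ms readings, the sketch below uses the nearest-rank method. The function name and the sample values are hypothetical, and the source does not say which percentile method was actually used. Under nearest-rank with N=50, p99 is simply the largest observation, so the tail figure is sensitive to a single slow request.

```python
import math

def nearest_rank_percentile(samples_ms, pct):
    """Return the nearest-rank percentile of a list of latency samples."""
    if not samples_ms:
        raise ValueError("need at least one sample")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples_ms)
    # 1-based rank of the smallest value covering pct percent of the samples.
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]

# Synthetic stand-in for 50 pipeline_ms readings: 101, 102, ..., 150.
samples = [100 + i for i in range(1, 51)]
p50 = nearest_rank_percentile(samples, 50)
p95 = nearest_rank_percentile(samples, 95)
p99 = nearest_rank_percentile(samples, 99)
```

Interpolating methods would give slightly different values on the same data, which is worth keeping in mind when comparing latency figures across tools.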

Detection Pipeline

Round 6 — Timestamp Integrity

Detects timestamp forgery: old decisions disguised as fresh. Content-age mismatch, fleet age collapse, anchor inconsistency.

Round 7 — Identity Drift

Detects gradual authority escalation across agent hops. Subject rebinding, confirmation erosion, permission lattice violation.

Round 8 — Consensus Collapse

Detects self-reinforcing false consensus from a single root source. Hedge marker decay, confidence recycling, cross-role reinforcement.

Provenance Chain

Detects circular references, chain length mismatches, and compromised agents in the memory provenance path.
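One way to picture the circular-reference and chain-length checks is a walk over parent links. This is a minimal sketch under assumed data shapes: the parent_of mapping, the declared_length argument, and the result labels are all hypothetical and are not taken from Sgraal's data model.

```python
def check_provenance(start_id, parent_of, declared_length):
    """Walk a provenance chain and report circularity or a length mismatch.

    parent_of maps each entry id to its parent id (None at the root).
    An id missing from the mapping is treated as a root in this sketch.
    """
    seen = set()
    current = start_id
    while current is not None:
        if current in seen:
            # Revisiting an id means the chain loops back on itself.
            return "circular_reference"
        seen.add(current)
        current = parent_of.get(current)
    if len(seen) != declared_length:
        return "chain_length_mismatch"
    return "ok"

linear = {"c": "b", "b": "a", "a": None}
looped = {"a": "b", "b": "a"}
```

Here check_provenance("c", linear, 3) passes, the same chain with a declared length of 5 is flagged as a mismatch, and check_provenance("a", looped, 2) is flagged as circular.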

Compound Attack Surface Score

When multiple detection layers fire simultaneously, Sgraal computes a single compound attack-surface score across them. The score is bounded and mapped to one of five tiers: NONE, LOW, MODERATE, HIGH, or CRITICAL.
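The source does not publish the combination rule or the tier cut-offs, so the following is only a sketch of what a bounded, tier-labelled score could look like. It assumes per-layer scores in [0, 1], combines them with a noisy-OR, and uses invented cut-offs; none of these choices should be read as Sgraal's actual method.

```python
def compound_score(layer_scores):
    """Combine per-layer scores in [0, 1] with a noisy-OR (an assumed rule)."""
    remaining = 1.0
    for s in layer_scores:
        if not 0.0 <= s <= 1.0:
            raise ValueError("layer scores must lie in [0, 1]")
        remaining *= 1.0 - s
    # The result stays in [0, 1] and grows as more layers fire.
    return 1.0 - remaining

# Hypothetical cut-offs, checked from the highest tier down.
TIER_CUTOFFS = [(0.75, "CRITICAL"), (0.5, "HIGH"), (0.25, "MODERATE")]

def tier(score):
    """Map a bounded score to one of the five tier labels."""
    if score <= 0.0:
        return "NONE"
    for cutoff, label in TIER_CUTOFFS:
        if score > cutoff:
            return label
    return "LOW"
```

With these assumptions, two layers firing at 0.5 each combine to 0.75, which lands in HIGH, and a third layer at 0.5 raises the score to 0.875, which lands in CRITICAL.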

Reliability Architecture

Circuit Breaker

3 consecutive failures trip the breaker. 30-second recovery window. Prevents cascading failures from upstream dependencies.
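The breaker described above can be sketched as a small state machine. The threshold of 3 and the 30-second window come from the text; the class name, the method names, and the half-open behaviour after the window (one trial request is allowed, and a further failure re-trips the breaker) are assumptions made for illustration.

```python
import time

class CircuitBreaker:
    """Trips after N consecutive failures; permits a trial call after a recovery window."""

    def __init__(self, failure_threshold=3, recovery_seconds=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.recovery_seconds = recovery_seconds
        self.clock = clock  # injectable so the window can be tested without sleeping
        self.consecutive_failures = 0
        self.opened_at = None  # None means the breaker is closed

    def allow_request(self):
        if self.opened_at is None:
            return True
        # Once the recovery window has elapsed, let a trial request through.
        return self.clock() - self.opened_at >= self.recovery_seconds

    def record_success(self):
        # Any success resets the failure streak and closes the breaker.
        self.consecutive_failures = 0
        self.opened_at = None

    def record_failure(self):
        self.consecutive_failures += 1
        if self.consecutive_failures >= self.failure_threshold:
            # Trip (or re-trip) and restart the recovery window.
            self.opened_at = self.clock()
```

A caller would check allow_request() before contacting the upstream dependency and report the outcome with record_success() or record_failure(). Counting only consecutive failures means a single success clears the streak.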

Redis-down Fallback

Graceful degradation when Upstash Redis is unavailable. Core scoring continues without stateful features. Demo keys are fully stateless by design.
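A minimal sketch of this kind of degradation follows, assuming the stateful lookup can be isolated behind a single call. The callables core_score and load_history are hypothetical stand-ins, and Python's built-in ConnectionError stands in for whatever exception the real Redis client raises.

```python
def score_with_fallback(memory_state, core_score, load_history):
    """Run core scoring; use stateful history only when the store is reachable."""
    try:
        history = load_history(memory_state)
        degraded = False
    except ConnectionError:
        # Store unavailable: skip stateful features but keep scoring.
        history = None
        degraded = True
    return {"score": core_score(memory_state, history), "degraded": degraded}

def unavailable_store(memory_state):
    """Simulates an outage of the backing store."""
    raise ConnectionError("store unreachable")

def toy_score(memory_state, history):
    """Toy scorer: counts entries, plus prior history when it is available."""
    return len(memory_state) + (len(history) if history is not None else 0)

result = score_with_fallback(["m1", "m2"], toy_score, unavailable_store)
```

Returning a degraded flag alongside the score lets callers distinguish a stateless verdict from a fully stateful one.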

Deterministic Scoring

In the absence of feedback events that update internal learning state, the same memory state returns the same verdict, byte-identical. Stochastic modules are SHA256-seeded, and strict per-call replay is available via an opt-in flag that disables the cross-call learning surface. See /docs/determinism.md for the full contract, including floating-point precision bounds.
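To illustrate how SHA256 seeding can make a stochastic step reproducible, the sketch below derives a random-number generator from a canonical encoding of the memory state. The source says only that the modules are SHA256-seeded; what is hashed, the JSON canonicalisation, and the function name are assumptions.

```python
import hashlib
import json
import random

def seeded_rng(memory_state):
    """Derive a reproducible RNG from a canonical encoding of the memory state."""
    # Sorted keys and fixed separators make the encoding independent of dict order.
    canonical = json.dumps(memory_state, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(canonical.encode("utf-8")).digest()
    seed = int.from_bytes(digest[:8], "big")
    return random.Random(seed)

state = {"agent": "a1", "entries": [1, 2, 3]}
reordered = {"entries": [1, 2, 3], "agent": "a1"}

rng_a = seeded_rng(state)
rng_b = seeded_rng(reordered)
draws_a = [rng_a.random() for _ in range(3)]
draws_b = [rng_b.random() for _ in range(3)]
```

Because the seed depends only on the content of the state, draws_a and draws_b are identical even though the two dictionaries were built in a different key order.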

Zero-downtime Deploy

Railway auto-deploy from main branch. Rolling restarts with health checks. No manual intervention required.

Corpus Validation

Published evaluation cases (R1-R11): 614

False negatives

Adversarial detection (216 compound cases): 97.2%

Automated calibration loop monitors threshold drift and classifies mismatches as corpus_wrong, threshold_wrong, or ambiguous.

The 614 figure is the published-evaluation corpus across R1-R11 (joint with xAI Grok plus Sgraal-only). The full written-and-generated adversarial corpus is larger: additional cases (augmented variations and auxiliary rounds) are held private for benchmark integrity. R12 is the held-out blind-evaluation corpus (60 cases). Production calibration pending paying-customer onboarding.

Memory Governance at Production Scale