Security

Last updated: April 2026

See also: Threat Model — what Sgraal preflight catches, what it does not replace, and what it complements. Includes the explicit "not certified by SOC 2 / ISO / etc." disclosure.

Reporting a Vulnerability

If you discover a security vulnerability in Sgraal, please report it responsibly. Do not disclose vulnerabilities publicly until we have had a chance to address them.

hello@sgraal.com

We aim to respond within 24 hours, and within 4 hours for critical vulnerabilities.

Important — compliance evidence: /v1/check verdicts are not included in the audit trail and do not generate W3C Verifiable Credentials. For compliance-grade evidence (HIPAA, GDPR, EU AI Act, FDA 510(k), NIST AI RMF), use /v1/preflight, which produces full audit-log entries and signed W3C VCs per verdict. The /v1/check endpoint is suitable for experimentation and high-frequency agent gating only.
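
The routing rule above can be sketched as a small helper. This is an illustration only: the function name `choose_endpoint` and the idea of tagging a call with a set of compliance profiles are assumptions, not part of the Sgraal API. Only the two endpoint paths and the profile names come from this page.

```python
# Profiles this page lists as needing compliance-grade evidence.
COMPLIANCE_PROFILES = {"HIPAA", "GDPR", "EU AI Act", "FDA 510(k)", "NIST AI RMF"}


def choose_endpoint(profiles):
    """Pick the endpoint path for a call (hypothetical helper).

    Any call that must produce audit-log entries and signed W3C VCs goes to
    /v1/preflight; everything else may use the lighter /v1/check.
    """
    needs_evidence = bool(set(profiles) & COMPLIANCE_PROFILES)
    return "/v1/preflight" if needs_evidence else "/v1/check"


print(choose_endpoint(["HIPAA"]))  # compliance-grade path
print(choose_endpoint([]))         # experimentation / high-frequency gating
```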

Infrastructure Security

API Layer

  • ✓ TLS 1.3 on all endpoints
  • ✓ API key authentication
  • ✓ Rate limiting per key
  • ✓ Request signing (enterprise)

Data Storage

  • ✓ AES-256 encryption at rest
  • ✓ EU-region Supabase
  • ✓ Row-level security (RLS)
  • ✓ Automated backups

Access Control

  • ✓ Principle of least privilege
  • ✓ MFA on all admin accounts
  • ✓ Audit log for all access
  • ✓ API key rotation support

Network

  • ✓ Cloudflare DDoS protection
  • ✓ Railway isolated containers
  • ✓ No inbound SSH in production
  • ✓ Upstash Redis with TLS

Zero-Knowledge Preflight

Memory content never leaves your infrastructure. The endpoint returns a SHA-256 proof hash instead of the content.

POST /v1/preflight/zk

Use when: GDPR, HIPAA, or data residency requirements apply.
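
This page does not document the request schema for /v1/preflight/zk, so the following is only a sketch of the general idea: hash memory entries locally with SHA-256 and send the digests rather than the content. The field names `action` and `memory_digests` and both helper functions are assumptions made for illustration.

```python
import hashlib


def digest_entries(entries):
    """Return one SHA-256 hex digest per memory entry, computed locally."""
    return [hashlib.sha256(e.encode("utf-8")).hexdigest() for e in entries]


def build_zk_payload(entries, action):
    """Build a request body that carries digests only (hypothetical schema)."""
    return {"action": action, "memory_digests": digest_entries(entries)}


payload = build_zk_payload(["patient 12 is allergic to penicillin"], "prescribe")
# The raw memory text is absent from the payload; only its digest is present.
print(payload)
```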

Proof of Decision

Every preflight response includes cryptographic proof fields.

  • input_hash — SHA-256 of the input
  • proof_version — v1
  • deterministic: true — same memory state returns the same verdict, in the absence of feedback events that update internal learning state (see /docs/determinism.md)
  • reproducible: true — audit trail for every agent action
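
A client can check these fields on its side. The sketch below assumes that `input_hash` is the SHA-256 of a canonical JSON encoding of the request (sorted keys, compact separators). That canonicalization is an assumption, and the real rule should be taken from the API documentation. The helper names are hypothetical.

```python
import hashlib
import json


def canonical_sha256(payload):
    """SHA-256 over a canonical JSON encoding (assumed canonicalization)."""
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def check_proof(request_payload, response):
    """Return a list of problems found in the proof fields (empty means OK)."""
    problems = []
    if response.get("proof_version") != "v1":
        problems.append("unexpected proof_version")
    if response.get("deterministic") is not True:
        problems.append("deterministic flag not set")
    if response.get("reproducible") is not True:
        problems.append("reproducible flag not set")
    if response.get("input_hash") != canonical_sha256(request_payload):
        problems.append("input_hash mismatch")
    return problems
```

Storing the request alongside the response lets an auditor rerun `check_proof` later and confirm that the stored input is the one the verdict was computed on.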

Compliance Profiles

EU AI Act

Articles 9 (Risk Management), 12 (Record-keeping), 13 (Transparency), 14 (Human Oversight), and 17 (Quality Management)

HIPAA

§164.312 safeguards — access controls, audit controls, integrity verification

FDA 510(k)

Medical-device software validation, substantial equivalence framing

GDPR

Data minimization, right to explanation, EU data processing

NIST AI RMF

Govern, Map, Measure, Manage — dedicated reference endpoint

Bit-identical replay for legal admissibility

Every Sgraal verdict can be replayed byte-identically months or years later, given the same input and the same scoring configuration, provided no feedback events have updated internal learning state in the meantime (qualified below). This is a measured property of the production scoring engine, validated by audit and exposed via the public API.

What it provides

A regulator, auditor, or counter-party can request the original decision be re-run. With the same memory state, action context, and scoring configuration fingerprint, Sgraal returns the same decision, the same risk score, and the same explanation — within the floating-point precision of the runtime.

Why it matters for regulated industries

Fintech, medical, legal, and defense customers face regulators who can subpoena the basis for any automated decision. Without byte-identical verdict replay, "the model said no" is unfalsifiable. With it, every decision is a reproducible experiment.

How it works (high level)

Sgraal's primary scoring engine is per-call deterministic with calibrated constants. Same input plus same configuration produces the same output, every time, in any process. The configuration itself is fingerprinted via a public checksum so customers can verify which scoring regime was active when their decision was made.

An explicit qualification: the guarantee holds in the absence of feedback events that update internal learning state. Customers needing strict per-call replay can opt in to a flag that disables the cross-call learning surface entirely. See /docs/determinism.md for the full contract.
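
On the customer side, a replay check could look roughly like the sketch below. The field names (`config_fingerprint`, `decision`, `risk_score`, `explanation`) and the function `replay_matches` are assumptions for illustration; the actual response schema is not given on this page. The score is compared with a small tolerance to reflect the "within the floating-point precision of the runtime" wording above.

```python
import math


def replay_matches(original, replay, tol=1e-12):
    """Return True when a replayed verdict reproduces the original one."""
    # A replay is only comparable under the same scoring configuration.
    if original["config_fingerprint"] != replay["config_fingerprint"]:
        return False
    same_decision = original["decision"] == replay["decision"]
    same_explanation = original["explanation"] == replay["explanation"]
    same_score = math.isclose(
        original["risk_score"], replay["risk_score"], rel_tol=tol, abs_tol=tol
    )
    return same_decision and same_explanation and same_score
```

A mismatch in the fingerprint is reported as a failed replay rather than compared further, since verdicts produced under different scoring regimes are not expected to agree.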

Sample legal use case

Scenario: a regulator audits an automated underwriting decision from 18 months ago. The customer is asked to demonstrate that the decision was deterministic, explainable, and based on documented inputs.

Sgraal-enabled response: the customer pulls the original memory state and action context from their audit log, calls Sgraal's preflight endpoint with the historical scoring configuration fingerprint, and produces the same decision, score, and decision-trail attribution as the original. The audit closes in hours rather than weeks of forensic reconstruction.

Pairs with the Comply surface (NIST AI RMF MEASURE-4.1, EU AI Act Article 13 transparency, GDPR Article 22 right-to-explanation) and the Insights diagnostics that explain every decision.

Beta · Early Access

For CISOs & Security Teams

AI agents are the new insider threat surface.

Tampered provenance, identity drift, consensus collapse — structural manipulations already happening in production. The detection layers ship today; the red-team service is in Beta.

CISOs joining the design partner program get founder-direct access and shape the roadmap.

What works today

Production-grade detection primitives.

6 detection layers

Live

Timestamp integrity, identity drift, consensus collapse, provenance chain, sync bleed, confidence calibration. Each layer is applied post-reconciliation and cannot be overridden by single-signal escalation.

Fleet vaccine propagation

Live

Sub-second attack signature propagation across the fleet. AES-256-GCM encrypted at rest. One tenant's incident becomes the entire fleet's immunity within seconds.

Behavioral profile per agent

Live

Call frequency trend, action-type escalation pattern, domain switching — the classic insider threat signals, applied to AI agents. Compromised or drifting agents surface before they cause incidents.

SSE streaming (live monitoring)

Live

/v1/preflight/stream emits 23 events across 4 phases. Live SOC dashboards, real-time agent decision visibility, tap-into-any-event observability.
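
A dashboard consuming this stream needs a Server-Sent Events parser. The sketch below implements the standard SSE framing (`event:` and `data:` fields, blank line as event terminator, `:` lines as comments) over an iterable of text lines. The event names in the sample are hypothetical, since this page does not list the 23 event names.

```python
def parse_sse(lines):
    """Yield (event_name, data) tuples from an iterable of SSE text lines."""
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if data:
                yield event, "\n".join(data)
            event, data = "message", []
            continue
        if line.startswith(":"):
            continue  # comment / keep-alive line
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # strip the single optional leading space
        if field == "event":
            event = value
        elif field == "data":
            data.append(value)


sample = [
    "event: phase_start\n",   # hypothetical event name
    'data: {"phase": 1}\n',
    "\n",
    ": keep-alive\n",
    "data: a\n",
    "data: b\n",
    "\n",
]
for name, body in parse_sse(sample):
    print(name, repr(body))
```

In practice the `lines` iterable would come from a streaming HTTP response, and each `data` payload would be decoded before being shown on the dashboard.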

Twin entries detection

Live

Correlated-entry detection. When an attacker plants several near-duplicate versions of the same memory, single-instance checks miss the cluster. Twin Entries detection flags those correlated entries.
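
To make the idea concrete, here is a minimal near-duplicate check built on Python's standard `difflib`. It is not Sgraal's detection method, which this page does not describe. The function name, the pairwise comparison, and the 0.9 threshold are all assumptions chosen for illustration.

```python
from difflib import SequenceMatcher
from itertools import combinations


def find_twin_pairs(entries, threshold=0.9):
    """Return index pairs of entries whose text similarity meets the threshold."""
    pairs = []
    for (i, a), (j, b) in combinations(enumerate(entries), 2):
        if SequenceMatcher(None, a, b).ratio() >= threshold:
            pairs.append((i, j))
    return pairs


memory = [
    "Wire limit for account 4471 is 10,000 USD",
    "Wire limit for account 4471 is 90,000 USD",  # planted near-duplicate
    "Customer prefers email contact",
]
print(find_twin_pairs(memory))
```

Each entry looks plausible on its own; only the pairwise comparison reveals that two of them are variants of the same memory. The pairwise loop is quadratic in the number of entries, so a production system would need a more scalable approach.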

R12 corpus + counterfactual

Live

R12 adversarial corpus held privately for benchmark integrity (51/60 current; PARKED cases public). Counterfactual BLOCK confirmation proves the model is predictive, not just retrospective.

In active development

Two Beta features for security design partners.

Reverse Adversarial Generator

Beta

Generate adversarial inputs targeting your specific agent. The same mathematical engine (DirectLiNGAM + drift_detector + omega_mem) that detects attacks can be inverted to produce them — a red team service for your CISO.

Status: core inversion works on test corpora. Customer-specific tuning (pointing the generator at your agent's actual production memory-state distribution) is being shaped with design partners.

In-Decision Human Veto

Beta

Pause an agent's execution mid-decision for SOC review on flagged high-stakes actions. Hooks into the live SSE decision stream; veto callback halts before commit. Security officer in the loop without rewriting agent code.

Status: SSE streaming is GA; the veto-callback handshake is in test. Design partners get the first integration and influence the callback semantics.

Design Partner Program

First 5–10 CISOs shape the roadmap.

Founder-direct Slack access. Beta features ship based on what security design partners actually use. Locked-in pricing through Beta. Written case study at GA (with your approval and disclosure terms).

Apply to the program →

No NDA required to apply. 48-hour response.

Memory DNA forensics

Concept

Topological signature attribution: identify which model produced a memory state. Long-term direction shared with the litigation use case.

More on /forensics →

Honest disclosure

Today: 6 detection layers, fleet vaccine, behavioral profile, SSE streaming, twin entries, R12 + counterfactual.

Beta: Reverse Adversarial Generator, In-Decision Human Veto. Both work in test; production hardening is what design partners are shaping.

Concept: Memory DNA forensics. Long-horizon direction.