Public demo
Live
The governed mission theater uses sample data to show task, policy, approval, artifact, and verification flow.
Honest status
This page is the plain-English front door for GRIFF AI transparency. It says what is live, what is a public simulation, what still needs measurement, and where the raw engineering evidence lives.
Generated 2026-05-12T19:35:00-05:00 · source: Refreshed after the 2026-05-12 closeout using board-drain green status, Memory/Brain launch-root receipts, storage-context bridge receipts, website auditor findings, and the 5-person buyer focus-group synthesis. Original RC1 and Pliny fields remain snapshot-based until the post-S19b measured run is regenerated.
App console
Workspace access is gated. Public visitors should use the demo or request access.
Measurements
Some test and fixture counts are published. Some red-team and broker measurements are still pending.
Raw detail
Engineering metrics, known limitations, source notes, and raw JSON are preserved below.
Tag
rc1
Commit
8488a66f
Repo
github.com/griffin9899/v2-platform
Frozen at
2026-05-10T22:00:00-05:00
RC1 ships on the v2-platform repo. The annotated tag 'rc1' (tag object SHA 8488a66) points at commit 05e11a2 ('RC1 SHIP GATE: freeze spec + ADR #2 addendum + injection corpus deferral'), so the 8488a66f value listed under Commit is the tag object, not the commit it resolves to. This control-plane monorepo (griff-ai-control-plane) is a separate surface and remains on branch control-plane-build-2026-05-06.
MUST classes blocked
0 / 10 measured
SHOULD classes blocked
0 / 4 measured
Fixtures authored
3 / 14
Framework: Plan v2 §3 (14 classes: 10 MUST + 4 SHOULD). Ship threshold: 10/10 MUST blocked + >=3/4 SHOULD blocked.
Venue map: 8/10 MUST classes broker-primary + 2 delegated. SHOULDs #11-14: TBD, deferred to S22 (skill reflex compiler) and S18 (surprise segmenter calibration).
Measurement status: pending S19b broker eval. 8 of 10 MUST classes are broker-primary (#1, 2, 3, 5, 6, 7, 8, 9 per the S19b §2 matrix); 2 are delegated, to host-hardening (#4) and eval-harness (#10); SHOULDs #11-14 are deferred to S22 / S18 calibration. No measured red-team run exists yet; fixtures are seed inputs, not pass/fail results.
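The ship threshold stated above (10/10 MUST blocked and at least 3/4 SHOULD blocked) can be sketched as a small gate function. This is only an illustration: the `RedTeamRun` record and the `ship_gate` name are hypothetical and are not taken from the v2-platform code. The sketch treats the absence of a measured run as "pending" rather than as a failure, which matches the status reported here.

```python
from dataclasses import dataclass
from typing import Optional

MUST_TOTAL = 10    # MUST classes in Plan v2 §3
SHOULD_TOTAL = 4   # SHOULD classes in Plan v2 §3
SHOULD_MIN = 3     # ship threshold: at least 3 of 4 SHOULD classes blocked


@dataclass
class RedTeamRun:
    """Hypothetical record of one measured red-team run (illustrative only)."""
    must_blocked: int
    should_blocked: int


def ship_gate(run: Optional[RedTeamRun]) -> str:
    """Return 'pending', 'pass', or 'fail' against the published threshold."""
    if run is None:
        # No measured run yet: fixtures are seed inputs, not results.
        return "pending"
    if run.must_blocked == MUST_TOTAL and run.should_blocked >= SHOULD_MIN:
        return "pass"
    return "fail"


print(ship_gate(None))                                          # pending
print(ship_gate(RedTeamRun(must_blocked=10, should_blocked=3)))  # pass
print(ship_gate(RedTeamRun(must_blocked=9, should_blocked=4)))   # fail
```

Keeping "pending" distinct from "fail" means an unmeasured class is never reported as a blocked or unblocked one.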
Python (v2-platform)
321
pytest --collect-only -q (from A:/projects/v2-platform/)
as of 2026-05-10
Python (local-runner)
39
pytest --collect-only -q packages/local-runner
as of 2026-05-10
TypeScript (web suites)
7
packages/web/tests
as of 2026-05-10
v2-platform note: the RC1 freeze snapshot (per spec) was 275 passed / 17 failed / 4 skipped (296 collected). All 17 failures are [d1]-marked tests running with a stale D1_ATOMIC_BATCH_TOKEN env var against the rotated worker secret; they are non-functional failures and an env refresh closes them. 25 net-new tests have landed post-freeze (296 → 321 collected).
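The counts in this note can be reconciled mechanically. The sketch below is a hedged illustration, not the project's tooling: `collected_count` is a hypothetical helper, and it assumes the `pytest --collect-only -q` summary line has the plain form "N tests collected in Xs" (other forms, such as runs with deselected tests, are deliberately not handled and return None). The sample output string is made up for the example; only the numbers 275, 17, 4, and 321 come from this page.

```python
import re
from typing import Optional

# Assumed summary-line shape for `pytest --collect-only -q`.
_SUMMARY = re.compile(r"^(\d+) tests? collected", re.MULTILINE)


def collected_count(output: str) -> Optional[int]:
    """Extract the collected-test count, or None if no summary line is found."""
    match = _SUMMARY.search(output)
    return int(match.group(1)) if match else None


# RC1 freeze snapshot, as published in the note above.
passed, failed, skipped = 275, 17, 4
at_freeze = passed + failed + skipped

# Illustrative captured output; a real run would list many test IDs first.
sample_output = "tests/test_example.py::test_ok\n\n321 tests collected in 1.20s\n"
collected_now = collected_count(sample_output)
assert collected_now is not None

net_new = collected_now - at_freeze
print(at_freeze, collected_now, net_new)  # 296 321 25
```

Returning None for an unrecognized summary line keeps a format change from silently publishing a wrong count.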
We publish what is red so customers do not have to discover it during procurement. Each item links to the runbook or audit where it is tracked.
L1
high
app.griff.run/ does not yet front a public recall console. WEB-2 focus group identified this as the highest-leverage missing artifact. The current /demo route shows MASTER ATC custody theater, not memory recall.
Tracked in: WEB-2 synthesis + Sprint 7 design (S7-T002 hosted demo console with abuse controls)
L2
medium
Public OpenAPI spec is currently gated by Cloudflare Access on memory.griff.run. Developer evaluators (Marcus, Bo, Priya in WEB-2) flagged this as a show-stopper.
Tracked in: WEB-2 Theme 5 — fold into Sprint 7 S7-T002 as a 30-minute add
L3
medium
3 of 14 fixtures authored (classes 1-3: wrapper-skip / TOCTOU / prompt-inject-judge). Broker enforcement venues are mapped per S19b §2: 8 broker-primary MUSTs, 2 delegated (class #4 keyring hardening, class #10 classifier mislabel via eval-harness). SHOULDs #11-14 are S22 / S18 surface, not broker. No measured run yet.
Tracked in: Sprint S19b (brain-S19b-mcp-host-broker-design.md) + N18 fixtures audit
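The venue mapping and fixture coverage described in this item can be written down as plain data and checked for consistency. This is a minimal sketch under stated assumptions: the constant and function names are hypothetical, and the class numbers and venue labels are copied from this page (the S19b §2 matrix itself is not reproduced here).

```python
# Class IDs and venues as stated on this page; all names are illustrative.
BROKER_PRIMARY = {1, 2, 3, 5, 6, 7, 8, 9}
DELEGATED = {4: "host-hardening", 10: "eval-harness"}
SHOULD_DEFERRED = {11, 12, 13, 14}   # S22 / S18 surface, not broker
FIXTURES_AUTHORED = {1, 2, 3}        # wrapper-skip, TOCTOU, prompt-inject-judge


def venue(class_id: int) -> str:
    """Return the enforcement venue label for a red-team class."""
    if class_id in BROKER_PRIMARY:
        return "broker"
    if class_id in DELEGATED:
        return DELEGATED[class_id]
    if class_id in SHOULD_DEFERRED:
        return "deferred (S22 / S18)"
    raise ValueError(f"unknown class {class_id}")


all_classes = set(range(1, 15))
must_classes = BROKER_PRIMARY | set(DELEGATED)

# The mapping should cover the 10 MUST classes exactly, with no overlap.
assert must_classes == set(range(1, 11))
assert BROKER_PRIMARY.isdisjoint(DELEGATED)
assert must_classes | SHOULD_DEFERRED == all_classes

missing_fixtures = sorted(all_classes - FIXTURES_AUTHORED)
print(f"{len(FIXTURES_AUTHORED)} / {len(all_classes)} fixtures authored")
print("still to author:", missing_fixtures)
```

A check of this kind only confirms that the published numbers agree with each other; it says nothing about whether any class is actually blocked, since no measured run exists yet.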
L4
high
The P0c spike PASSED all 3 attacks (Job Object + cleared env + deny-DACL + WFP firewall, 2.6 ms spawn). Production integration is deferred per the V-6 Sprint 7 drop list: BD12 sandbox primitive integration is 6-9h of internal hygiene that doesn't produce customer surface, so it is parked to S8 with ADR-007 deferred. The ADR-006 dispatcher stub is what shipped for RC1.
Tracked in: phase0-P0c-sandbox-spike.md + V-6 Sprint 7 review §2.5 drops
L5
medium
S6-T003 shipped a 480 LOC scanner + CI workflow + sample.json + 4 tests (commit 8e8266b fixture exclude fix). Per Plan v2 §4 A10, full federal-customer-conditional checks are deferred unless a federal warm-intro materializes. NDAA 889 attestation generator is design-only beyond the scaffold.
Tracked in: rc1-rush-R3-section-889-shipped.md + master plan §4 A10 deferral
L6
medium
Brain Plan v2 broker is single-host (S19b §0 explicit non-goal). Fleet federation across JWGH02 / GRIFFIN / JWGH03 is post-RC1. Memory recall today works cross-session on a single host via memory.griff.run; multi-host coherence is the next layer.
Tracked in: brain-S19b-mcp-host-broker-design.md §0 non-goals
L7
low
Sam (journalist persona, WEB-2) flagged. Comparison harness not yet built; recall eval harness (A2) gated on P0d golden corpus lock.
Tracked in: WEB-2 divergent themes + Plan v2 A2 acceptance criterion
L8
low
RC1 freeze snapshot: 275 passed / 17 failed / 4 skipped. All 17 failures are D1 [d1]-marked tests running with a stale D1_ATOMIC_BATCH_TOKEN env var against the rotated worker secret. They are non-functional failures, and an env refresh closes them. This is honest disclosure rather than test suppression.
Tracked in: RC1-integrated-mvp-build-spec.md §'Locked RC1 P0 acceptance thresholds'