Trust

Every external review, every finding, every remediation status — on one page.

We ran eight external review passes against this product on 2026-05-10. Four were security or code reviews of the live Cloudflare-tunneled memory service; the rest were sprint, architecture, and product-surface reviews. Every finding below is sourced from a real audit artifact in our internal _audit/ tree.

Snapshot date: 2026-05-10 · sources: A:\GRIFF_AI\01_INTERNAL\_audit\2026-05-10-*.md

What “trust” does not mean for us yet

We are an early-stage single-operator product. The following are explicitly NOT in place today, and we will not pretend otherwise on a sales call:

  • No SOC 2 Type I or Type II report.
  • No ISO 27001 certification.
  • No formal third-party penetration test.
  • No FedRAMP authorization (Low, Moderate, or High).
  • No continuous monitoring / SIEM / Logpush at this tier.
  • No 24/7 on-call rotation.
  • No DPA / DPIA / GDPR Article 28 paperwork stack.
  • Single operator. No dual control on production access.

The eight reviews listed below are LLM-driven code and architecture audits, not human-led pentests. We treat them as one evidence stream among many — not as a substitute for the certifications above. When we earn those certifications, we will publish them here. Until then, this page is what we have, and we publish it externally because we would rather you find our gaps on a webpage than in procurement.

Multi-vendor audit history — 2026-05-10

Eight external review passes across four vendors (Google, xAI, OpenAI, Anthropic) plus internal verification audits (AU-1, AU-2, AU-4). Each row links to the source audit file on disk; counts are point-in-time and the remediation columns are aggregated from the closure audits listed in section 2.

V-1 · Google Gemini (gemini-2.5-flash) · 2026-05-10 · code review
Scope: Python code review of memory_orchestrator HTTP + MCP server + embeddings
Source: _audit/2026-05-10-final-drain-V1-gemini-review.md
Findings: 3 (0 CRIT / 0 HIGH / 1 MEDIUM / 2 LOW)
Closed: 2 · In progress: 0 · Open / deferred: 1

V-2 · xAI Grok (grok-3) · 2026-05-10 · security review
Scope: Zero-trust security review of Cloudflare Access deployment on memory.griff.run
Source: _audit/2026-05-10-final-drain-V2-grok-security-review.md
Findings: 6 (6 threat classes, P0/P1/P2 mix)
Closed: 2 · In progress: 1 · Open / deferred: 3

V-3 · OpenAI GPT-4o (gpt-4o-2024-08-06) · 2026-05-10 · sprint review
Scope: External Sprint 7 gap analysis vs. RC1 freeze artifact bundle
Source: _audit/2026-05-10-final-drain-V3-openai-sprint7-gap.md
Findings: 5 (5 gaps: intent accuracy, overreach, tests, customer surface, gates)
Closed: 0 · In progress: 2 · Open / deferred: 3

V-4 · Anthropic Claude (claude-sonnet-4-6) · 2026-05-10 · code review
Scope: Fourth-vendor code review of same three files V-1 audited; dark-corner pass
Source: _audit/2026-05-10-final-drain-V4-claude-review.md
Findings: 6 (2 CRIT / 4 HIGH, + MED/LOW continuation)
Closed: 1 · In progress: 0 · Open / deferred: 5

V-5 · Anthropic Claude (claude-sonnet-4-6) · 2026-05-10 · product review
Scope: Persona focus-group round 2 reactions to ship state (11 personas)
Source: _audit/2026-05-10-final-drain-V5-claude-focus-v2.md
Findings: 11 (11 persona-level objections: customer-surface, federal, eval substrate)
Closed: 0 · In progress: 3 · Open / deferred: 8

V-6 · Anthropic Claude (claude-opus-4-7) · 2026-05-10 · sprint review
Scope: Cross-vendor Sprint 7 contract critique; honesty + customer-surface + net-new findings
Source: _audit/2026-05-10-final-drain-V6-claude-sprint7-review.md
Findings: 3 (3 net-new launch-readiness findings: abuse, legal/identity, observability)
Closed: 0 · In progress: 1 · Open / deferred: 2

V-7 · Anthropic Claude (claude-opus-4-7) · 2026-05-10 · architecture review
Scope: Plan-level review of Brain build plan v2 (35-40 session DAG, sprint structure)
Source: _audit/2026-05-10-final-drain-V7-claude-brain-plan-review.md
Findings: 5 (5 plan-level findings: deps, sprint merges, acceptance criteria, phase priority)
Closed: 0 · In progress: 0 · Open / deferred: 5

V-8 · Anthropic Claude (claude-sonnet-4-6) · 2026-05-10 · security review
Scope: Fourth-vendor CF Access posture review; adjudicate V-2 vs internal; federal-AV lens
Source: _audit/2026-05-10-final-drain-V8-claude-cf-posture.md
Findings: 12 (7 must-fix-now + 5 net-new, NF-1..NF-5)
Closed: 1 · In progress: 0 · Open / deferred: 11

Some V-2 findings appear in V-4 / V-8 columns as escalations or overrules; the per-finding adjudication is in section 2 below. V-3 / V-5 / V-6 / V-7 are non-security reviews (sprint, persona, plan); their counts reflect open product/launch items, not security findings.

V-8 must-fix-now items

V-8 (Claude Sonnet 4.6) ranked these three of its must-fix-now items as the highest-ROI for federal-AV-middleware readiness. Status reflects the live tree on 2026-05-10.

N-23

Closed

Service-token 30-day rotation

All 4 connector service tokens rotated to fresh client_id+secret pairs with 720h (30d) lifetime on 2026-05-10. Old tokens DELETEd. Session duration reduced 24h → 8h in the same lane.

Evidence: _audit/2026-05-10-final-drain-N23-cf-hardening.md

IMPL-7

In progress

Origin rate-limit per CF-Access-Client-Id (slowapi)

V-8 must-fix-now item 2. Scoped at 3h: slowapi keyed on CF-Access-Client-Id with per-request audit logging in the same PR. Closes V-2 #5 (DLP) and V-2 #6 (per-connector audit attribution) jointly. Design committed; no code in tree yet — branch / PR not opened at time of this page.

Evidence: V-8 §4 item 2 (no shipped artifact in _audit/ at 2026-05-10 cutoff)
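
A minimal sketch of the per-connector throttle idea, assuming a sliding window keyed on the CF-Access-Client-Id header value. This is a standard-library illustration, not slowapi's API and not the planned PR; the class name, the limits, and the "unidentified" fallback bucket are all hypothetical.

```python
import time
from collections import defaultdict, deque


class PerClientRateLimiter:
    """Sliding-window limiter keyed on a client identifier (hypothetical helper)."""

    def __init__(self, max_requests: int, window_seconds: float, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._clock = clock
        self._hits = defaultdict(deque)  # client_id -> timestamps of accepted requests

    def allow(self, client_id: str) -> bool:
        now = self._clock()
        hits = self._hits[client_id]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True


def client_key(headers: dict) -> str:
    """Pick the rate-limit key; header casing and the fallback bucket are assumptions."""
    return headers.get("CF-Access-Client-Id", "unidentified")
```

Keying on the client id rather than the remote address matters here because every request arrives through the same tunnel, so the address alone would not separate one connector from another.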

V-8 #3

Partial

Written threat-model document in repo

A 149-line threat-model document lives at docs/threat-model.md in griffai-memory and covers the DNS-rebinding trade-off (V-1 MEDIUM / V-8 D3). It does not yet cover the loopback-bypass rationale, an accepted-risk register, or the federal-AV threat actors V-8 requires. We treat it as partial until it reaches the wider scope that V-8 §4 item 3 demands, which is planned for the next sprint.

Evidence: A:\projects\github\griffin9899\griffai-memory\docs\threat-model.md

Grok V-2 security findings + V-8 net-new (NF-1..NF-5)

Six original V-2 threat classes plus five net-new V-8 findings. Severity shows V-2 → V-8 adjudication where applicable; status is mapped to the closure or non-closure evidence in _audit/.

V-2 #1

Deferred

Identity-spoofing — single OWNER email, OTP-only IdP

Severity: P0 / P1

P1 policy relies on a single email address authenticated via Cloudflare One-Time PIN. Compromise of the Gmail account (phish, SIM-swap, credential stuffing) yields full Access. Recommended fix: swap to Google IdP with auth_method=google. Operator-gated; not yet executed.

Evidence: N-23 CF hardening § "Deferred work" (Google IdP swap)

V-2 #2

Closed

Service-token 1-year lifetime + no rotation runbook

Severity: P0

Four connector service tokens had 1-year lifetimes (expiring 2027-05-11) with no rotation schedule. Reduced to 30-day lifetime + fresh client_id+secret pairs on 2026-05-10. Old tokens DELETEd. Blast radius reduced from 365 days to 30 days.

Evidence: N-23 CF hardening — 4 tokens rotated 8760h → 720h on 2026-05-10
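
The lifetime arithmetic above (8760h is 365 days, 720h is 30 days) can be turned into a small rotation check. This is a sketch under stated assumptions: the 7-day lead time and both function names are hypothetical, and it is not the rotation runbook itself.

```python
from datetime import datetime, timedelta, timezone

TOKEN_LIFETIME = timedelta(hours=720)  # 30 days, the lifetime stated above


def expires_at(issued_at: datetime, lifetime: timedelta = TOKEN_LIFETIME) -> datetime:
    """Expiry instant for a token issued at issued_at."""
    return issued_at + lifetime


def rotation_due(issued_at: datetime, now: datetime,
                 lead_time: timedelta = timedelta(days=7)) -> bool:
    """True once the token is within lead_time of expiry (lead_time is an assumed policy)."""
    return now >= expires_at(issued_at) - lead_time


if __name__ == "__main__":
    issued = datetime(2026, 5, 10, tzinfo=timezone.utc)
    print(expires_at(issued).date(), rotation_due(issued, datetime.now(timezone.utc)))
```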

V-2 #3

Deferred

Bearer-replay — static origin secret, no nonce or expiry

Severity: P1 (V-2) → P3 (V-8 overrule)

Origin bearer is constant-time-compared but static. V-2 proposed short-lived JWT. V-8 overruled to P3: replay requires bearer extraction + TLS break + replay inside the CF tunnel (mTLS cloudflared↔edge). Revisit only if bearer ever transits outside the CF tunnel boundary.

Evidence: V-8 adjudication §1 row B3 — downgraded to P3, negative ROI today

V-2 #4

Open

Loopback-bypass on MCP middleware

Severity: P0 (V-2) → env-gate (V-8 third option)

V-2 demanded full removal (P0). Our internal review reconciled the bypass as intentional design (no escalation). V-8 overruled both: an env-gate that defaults to off (2 lines + 1 test) eliminates the audit-flaggable pattern while preserving developer convenience. Not yet implemented. The internal threat-model document does NOT yet cover the loopback-bypass rationale.

Evidence: V-8 §2 adjudication; no GRIFFAI_MCP_ALLOW_LOOPBACK_BYPASS flag in tree yet
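
A minimal sketch of what a default-off env-gate could look like. The flag name comes from the evidence line above; the accepted value ("1"), the helper names, and the decision to distrust hostnames are assumptions, and none of this is in the tree.

```python
import ipaddress
import os

FLAG = "GRIFFAI_MCP_ALLOW_LOOPBACK_BYPASS"  # flag name taken from the evidence line


def loopback_bypass_enabled(environ=None) -> bool:
    """Default-off: only an explicit "1" enables the bypass (accepted value is an assumption)."""
    env = os.environ if environ is None else environ
    return env.get(FLAG, "") == "1"


def is_loopback(host: str) -> bool:
    """True for literal loopback addresses such as 127.0.0.1 or ::1."""
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        return False  # hostnames such as "localhost" are not trusted in this sketch


def may_skip_bearer(client_host: str, environ=None) -> bool:
    """The bypass applies only when the flag is on AND the peer is loopback."""
    return loopback_bypass_enabled(environ) and is_loopback(client_host)
```

With the flag unset, a loopback caller is treated like any other caller, which is the property an auditor would look for.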

V-2 #5

In progress

Missing origin rate-limits — DLP-failure exposure

Severity: P0

No CF rate-limiting (Free tier) and no origin throttle. A leaked service token enables bulk DB extraction with no signal. Planned fix: slowapi keyed on CF-Access-Client-Id (3h estimate). Closes V-2 #5 (DLP) + V-2 #6 (audit-attribution) in one PR. Not yet shipped.

Evidence: V-8 must-fix-now §4 item 2 — scoped, design committed, no code shipped

V-2 #6

In progress

Audit-trail gap — no per-connector attribution at origin

Severity: P1 (V-2) → P0 (V-8 federal-AV lens)

No per-connector logging at origin; cannot tell which of the 4 service tokens issued which request without paid-tier CF Logpush. V-8 escalated to P0 under NIST SP 800-53 AU-2/AU-3/AU-12 baseline for FedRAMP Moderate. Remediation bundled with rate-limit work above.

Evidence: V-8 escalation table; bundled with V-2 #5 remediation
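
Per-connector attribution at origin could be as small as one structured log line per request. The sketch below assumes the connector identity is read from the CF-Access-Client-Id header; the logger name, the field names, and the example path are hypothetical. The client secret is deliberately never logged.

```python
import json
import logging
from datetime import datetime, timezone

audit_log = logging.getLogger("origin.audit")  # logger name is hypothetical


def audit_record(headers: dict, method: str, path: str, status: int, now=None) -> dict:
    """Build one audit entry; only the client id is recorded, never the secret."""
    ts = (now or datetime.now(timezone.utc)).isoformat()
    return {
        "ts": ts,
        "client_id": headers.get("CF-Access-Client-Id", "unidentified"),
        "method": method,
        "path": path,
        "status": status,
    }


def log_request(headers: dict, method: str, path: str, status: int) -> None:
    """Emit the entry as a single JSON line so it can be filtered per connector."""
    audit_log.info(json.dumps(audit_record(headers, method, path, status), sort_keys=True))
```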

V-8 NF-1

Open

SQLite WAL-mode + busy_timeout for concurrent writes

Severity: MEDIUM (M-H under fed-AV)

Silent corruption risk on the 232k-item memory DB under concurrent writes. Safety-adjacent for an AV reasoning chain. Effort: 2h code + 1h test.

Evidence: V-8 §3 NF-1; deferred to before multi-connector concurrent workload
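
A minimal sketch of the two settings named in this finding, using Python's sqlite3 module. The function name and the 5000 ms default are assumptions; the finding does not specify a timeout value.

```python
import sqlite3


def open_memory_db(path: str, busy_timeout_ms: int = 5000) -> sqlite3.Connection:
    """Open a SQLite file with WAL journaling and a busy timeout (timeout value is assumed)."""
    conn = sqlite3.connect(path)
    # The pragma returns the journal mode actually in effect; WAL persists in the file.
    mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
    if mode.lower() != "wal":
        conn.close()
        raise RuntimeError(f"could not enable WAL, journal_mode={mode!r}")
    # busy_timeout is per-connection, so it must be set on every new connection.
    conn.execute(f"PRAGMA busy_timeout = {int(busy_timeout_ms)}")
    return conn
```

Checking the returned mode is worth the extra lines: SQLite reports the mode it ended up with rather than raising when WAL cannot be enabled.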

V-8 NF-2

Open

Windows Defender firewall rule has no IaC / regression guard

Severity: LOW (P1 under fed-AV)

Load-bearing LAN restriction (192.168.1.0/24 inbound on 8787/8788) has no script, no verification, no regression guard. Effort: 1h to ship deploy/Set-OriginFirewall.ps1.

Evidence: V-8 §3 NF-2

V-8 NF-3

Open

No route-dependency regression guard on verify_bearer

Severity: LOW (architectural)

No pytest fixture that asserts every FastAPI route still has Depends(verify_bearer) wired. Future route addition could silently skip bearer auth. Effort: 1h.

Evidence: V-8 §3 NF-3
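
One possible shape for such a guard, written against duck-typed route objects so it runs without the web framework installed. It assumes each API route exposes `path` and a `dependant.dependencies` tree whose nodes carry `.call`, which matches our understanding of FastAPI's route objects but should be confirmed against the installed version. The helper names and the exemption set are hypothetical.

```python
def uses_dependency(dependant, target) -> bool:
    """Walk a dependency tree looking for target as a dependency callable."""
    for dep in getattr(dependant, "dependencies", []):
        if getattr(dep, "call", None) is target or uses_dependency(dep, target):
            return True
    return False


def routes_missing_dependency(routes, target, exempt=frozenset()):
    """Return paths of API routes whose dependency tree never calls target."""
    missing = []
    for route in routes:
        dependant = getattr(route, "dependant", None)
        if dependant is None:
            # Not an API route (for example a mount); this sketch does not cover those.
            continue
        if route.path in exempt:
            continue
        if not uses_dependency(dependant, target):
            missing.append(route.path)
    return missing
```

In a pytest module this would reduce to a single assertion along the lines of `assert routes_missing_dependency(app.routes, verify_bearer, exempt={"/health"}) == []`, where exempting /health is itself an assumption about which routes are intentionally unauthenticated.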

V-8 NF-4

Open

A:\ drive type unverified — secrets on potentially-removable drive

Severity: MEDIUM

.cf-service-tokens.local.json lives on A:\, which, per the operator's persistent-memory notes, is the spinner-tier backup drive. If that drive is hot-swappable or external, physical extraction bypasses NTFS ACLs. The 5-minute verification has not yet been performed.

Evidence: V-8 §3 NF-4; cross-reference to feedback_jwgh02_drive_tier_policy.md

V-8 NF-5

Open

cloudflared.exe — no SHA-256 / Authenticode verification, version 0.0.0.0

Severity: MEDIUM (P0 under EO 14028 §4)

Binary self-reports version 0.0.0.0 (manual drop, no auto-update). No pre-flight hash verification. EO 14028 §4 software supply-chain traceability requires version + patch SLA. Effort: 1h.

Evidence: V-8 §3 NF-5; cloudflared was manual-drop, not winget
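
A pre-flight hash check could look like the sketch below. It assumes a pinned SHA-256 value is recorded somewhere trustworthy; where that value comes from, and both function names, are hypothetical. Authenticode verification is a separate step that this sketch does not attempt.

```python
import hashlib
import hmac


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Stream the file through SHA-256 so large binaries are not read in one piece."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_binary(path: str, expected_sha256: str) -> None:
    """Raise if the file's hash differs from the pinned value."""
    actual = sha256_of(path)
    if not hmac.compare_digest(actual, expected_sha256.strip().lower()):
        raise RuntimeError(f"hash mismatch for {path}: got {actual}")
```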

V-1 Gemini code findings

Gemini 2.5 Flash code review of the memory_orchestrator HTTP and MCP server. Three findings; two closed via N-24, one partial pending the wider threat-model scope V-8 demands.

V-1 MEDIUM

Partial

DNS-rebinding protection disabled (mcp.settings.transport_security = None)

Severity: MEDIUM

Evidence: docs/threat-model.md authored 2026-05-10 (V-8 D3 demands deeper scope; loopback rationale missing)

V-1 LOW-1

Closed

/health leaks SQLite DB path to unauthenticated traffic

Severity: LOW

Evidence: N-24 FIX-1 shipped in commit e304514; /health now returns only {"ok": true, "service": "griffai-memory"}

V-1 LOW-2

Partial

vector_recall(db: Any) type ambiguity / threading footgun

Severity: LOW

Evidence: Signature tightened to sqlite3.Connection | str | None in commit b4882dd; V-8 A3 escalates threading-violation surface to MEDIUM

V-4 Claude code findings (dark-corner pass)

Claude Sonnet 4.6 fourth-vendor pass on the same three files V-1 audited. It found two CRITICAL data-integrity hazards and four HIGH-severity items missed by V-1 and V-2. CRIT-1 is closed via IMPL-8; the rest are open and tracked.

V-4 CRIT-1

Closed

Heterogeneous blob concatenation silently corrupts the recall matrix

Severity: CRITICAL

Evidence: IMPL-8 shipped per-row blob validation + new test_embeddings_load.py
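
For illustration, per-row validation can be as simple as checking each blob's byte length before anything is concatenated. This is a sketch of the general idea, not the IMPL-8 code: the function name, the `(row_id, blob)` row shape, and the choice to raise rather than skip are assumptions.

```python
FLOAT32_BYTES = 4


def validate_blobs(rows, dim: int):
    """Yield (row_id, blob) only while every blob holds exactly dim float32 values."""
    expected = dim * FLOAT32_BYTES
    for row_id, blob in rows:
        if blob is None or len(blob) != expected:
            size = None if blob is None else len(blob)
            # Failing loudly here keeps a short or long row from shifting every later row.
            raise ValueError(f"row {row_id}: expected {expected} bytes, got {size}")
        yield row_id, blob
```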

V-4 CRIT-2

Open

Little-endian float32 assumption unportable and unvalidated (no magic byte)

Severity: CRITICAL

Evidence: No endianness column, no schema CHECK, no round-trip identity unit test
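
The missing round-trip identity test could be built on an encoder and decoder that pin the byte order explicitly. A minimal sketch using the standard struct module; both function names are hypothetical, and this does not add the magic byte or schema CHECK the finding also mentions.

```python
import struct


def encode_vector(values) -> bytes:
    """Pack floats as explicit little-endian float32 ("<" pins the byte order)."""
    return struct.pack(f"<{len(values)}f", *values)


def decode_vector(blob: bytes) -> tuple:
    """Unpack a little-endian float32 blob, rejecting lengths that cannot be float32."""
    if len(blob) % 4:
        raise ValueError("blob length is not a multiple of 4 bytes")
    return struct.unpack(f"<{len(blob) // 4}f", blob)
```

A round-trip test should use values that float32 represents exactly (for example 1.0 or -2.5); otherwise the comparison fails on rounding rather than on byte order.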

V-4 HIGH-1

Open

Import-time os.environ.setdefault poisons sibling processes and test isolation

Severity: HIGH

Evidence: embeddings.py lines 44-45 still execute setdefault at import
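
One way to remove the import-time side effect is to move the defaults behind an explicit call. The sketch below is an assumption about shape only: the variable name in `_RUNTIME_DEFAULTS` is a placeholder, since this page does not list which variables embeddings.py sets.

```python
import os

# Placeholder defaults; the variables embeddings.py actually sets are not listed here.
_RUNTIME_DEFAULTS = {"EXAMPLE_THREADS": "1"}


def apply_runtime_defaults(environ=None, defaults=None) -> dict:
    """Apply defaults only when called, returning the keys that were added."""
    env = os.environ if environ is None else environ
    added = {}
    for key, value in (_RUNTIME_DEFAULTS if defaults is None else defaults).items():
        if key not in env:
            env[key] = value
            added[key] = value
    return added
```

Importing the module then changes nothing, and a test can pass its own dictionary as `environ` instead of touching the process environment.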

V-4 HIGH-2

Open

_load_all_vectors — 357 MB unbounded allocation; OOM-DoS amplifier

Severity: HIGH

Evidence: No module-level matrix cache; b"".join(bufs) double-buffers ~714 MB peak
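
The double-buffering comes from holding the list of blobs and the joined copy at the same time. A sketch of one alternative, assuming the row count and dimension are known up front (both parameters and the function name are hypothetical): copy each blob straight into a preallocated buffer so that peak memory is roughly one matrix plus one row.

```python
def load_vectors_preallocated(rows, count: int, dim: int) -> bytearray:
    """Copy float32 blobs into one preallocated buffer instead of joining a list."""
    row_bytes = dim * 4
    out = bytearray(count * row_bytes)
    filled = 0
    with memoryview(out) as view:
        for blob in rows:  # rows may be a cursor, so only one blob is held at a time
            if filled >= count:
                raise ValueError("more rows than the declared count")
            if len(blob) != row_bytes:
                raise ValueError(f"row {filled}: expected {row_bytes} bytes, got {len(blob)}")
            view[filled * row_bytes:(filled + 1) * row_bytes] = blob
            filled += 1
    if filled != count:
        raise ValueError(f"expected {count} rows, got {filled}")
    return out
```

This addresses only the transient peak; the unbounded size of the matrix itself and the absence of a cache are separate questions.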

V-4 HIGH-3

Open

Module-level app + build_app() mutates shared mcp.settings singleton

Severity: HIGH

Evidence: mcp_http_server.py:106-108 still mutates global singleton; no restore path

V-4 HIGH-4

Open

Logic duplication between MCP and FastAPI bearer middleware (drift hazard)

Severity: HIGH

Evidence: BearerAuthMiddleware.dispatch + _bearer_auth — single-point-of-fix not consolidated
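
Consolidation could mean one pure function that both middlewares call, so a future fix lands in a single place. A minimal sketch, assuming the check takes the raw Authorization header and the expected secret; the function name and the header parsing rules are hypothetical and may differ from the existing middleware.

```python
import hmac


def bearer_is_valid(authorization_header, expected_token: str) -> bool:
    """Single place for the bearer check, meant to be shared by both middlewares."""
    if not authorization_header or not expected_token:
        return False
    scheme, _, presented = authorization_header.partition(" ")
    if scheme.lower() != "bearer":
        return False
    # compare_digest avoids leaking, through timing, where the two values differ.
    return hmac.compare_digest(presented.strip().encode(), expected_token.encode())
```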

Where to go next

Live status

/honest-status publishes the RC1 freeze tag, Pliny red-team posture, test counts, and the live known-limitations list. The page links to /honest-status.json for machine consumption.

Federal readiness

/federal tracks Section 889 attestation, FedRAMP path, OMB M-24-10 inventory support, and CMMC L2 control packaging. V-8 escalated several findings under a federal-AV lens; the federal page is the long-form companion.

Security posture

/security describes the bounded-authority architecture, scoped agent identity, policy decisions, and audit-evidence pipeline. Owner-gated production controls are documented under NDA via the contact path.

We commit to publishing every external review here as we run them, whether the review compliments the product or embarrasses it. The point of this page is that there is nowhere for us to hide a finding: the same audit files we read internally are the source of truth for everything you see above.