RMT Trust System: Consolidated Research & Roadmap

Section 1

Where We Are

10 Experiments Completed

6 Datasets Validated (3M+ nodes)

4 Models Debated the Architecture

40% Production Readiness (Grok CTO)

We built and validated a PageRank-based reputation system for AI agents. Three algorithm parameters matter. The system was tested against 6 real-world datasets, including a 2.97M-node Ethereum transaction graph (AUC 0.96), and against 6 attack types with 5 defense mechanisms. Four models (Claude, Gemini, Codex, Grok) debated and refined the architecture. Grok's CTO assessment puts production readiness at 40%, not 55%. Priority #1: deploy to the Base Sepolia testnet. The research phase is complete. Next: stop iterating and ship it.

"The single biggest risk is failing to deploy a working testnet soon, leading to endless internal iteration without real-world validation."

— Grok (CTO / Arbiter)

Section 2

The Model — What We're Building

Layer 3 — Identity Anchor

Shyft KYC Trust Channels

Every agent traces to a verified human via Shyft attestation trust channels. Establishes accountability and enables revocation.

Stops: impersonation, anonymous sybils

Status: BUILT

Layer 2 — Citation Reputation

PageRank with 3 Parameters, 4 Input Fields

Graph-based reputation scoring. Agents cite each other; citations form a directed graph processed by tuned PageRank with sybil detection.

Stops: collusion, citation farming, wash trading

Status: BUILT

Layer 1 — Device Liveness — CUT

TEE + ZK Proofs

Hardware-bound proof that the agent runs on a real device. Prevents virtualized sybil farms. Relies on App Attest, ZKML, or TEE attestation.

Stops: VM farms, cloned agents

Status: CUT (Grok CTO decision)

Section 3

The 3 Parameters — Core Finding

Parameter	Value	Role
alpha	0.85	Damping factor: how much reputation propagates through the graph
reciprocal_penalty	0.82	Discount for mutual endorsements (collusion detection)
diversity_threshold	0.80	Penalty for endorsement concentration

These values were validated across all 6 datasets, and the cross-model debate left them UNCHANGED. The "4 fields" finding concerns input data, not algorithm parameters.
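To make the roles of the three parameters concrete, here is a minimal Python sketch of a penalized PageRank over a citation graph. Only the parameter names and values come from this page. The exact penalty rules are assumptions made for illustration: a mutual pair keeps (1 - reciprocal_penalty) of its weight, and an edge supplying more than diversity_threshold of a node's incoming weight is discounted. The function names `adjust_weights` and `pagerank` are hypothetical and are not the oracle service's API.

```python
from collections import defaultdict

ALPHA = 0.85                # damping factor (value from this page)
RECIPROCAL_PENALTY = 0.82   # value from this page; rule below is assumed
DIVERSITY_THRESHOLD = 0.80  # value from this page; rule below is assumed


def adjust_weights(citations):
    """citations maps (citer, cited) -> raw citation count."""
    weights = {}
    for (src, dst), count in citations.items():
        w = float(count)
        # Assumed rule: a mutual pair keeps only (1 - penalty) of its weight.
        if (dst, src) in citations:
            w *= 1.0 - RECIPROCAL_PENALTY
        weights[(src, dst)] = w

    # Assumed rule: if a single citer supplies more than the threshold share
    # of a node's incoming weight, discount that citer's edge.
    incoming = defaultdict(float)
    for (_, dst), w in weights.items():
        incoming[dst] += w
    for (src, dst), w in list(weights.items()):
        if incoming[dst] > 0 and w / incoming[dst] > DIVERSITY_THRESHOLD:
            weights[(src, dst)] = w * (1.0 - DIVERSITY_THRESHOLD)
    return weights


def pagerank(citations, alpha=ALPHA, iterations=100, tol=1e-10):
    nodes = sorted({node for edge in citations for node in edge})
    n = len(nodes)
    weights = adjust_weights(citations)

    # Normalize by the RAW out-weight so that penalized weight is lost
    # by the citer instead of being renormalized back in.
    raw_out = defaultdict(float)
    for (src, _), count in citations.items():
        raw_out[src] += count

    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        new = {node: 0.0 for node in nodes}
        passed = 0.0
        for (src, dst), w in weights.items():
            share = rank[src] * w / raw_out[src]
            new[dst] += share
            passed += share
        # Rank that was not passed along an edge (dangling nodes and
        # penalized weight) is spread uniformly over all nodes.
        leaked = 1.0 - passed
        for node in nodes:
            new[node] = (1.0 - alpha) / n + alpha * (new[node] + leaked / n)
        delta = sum(abs(new[node] - rank[node]) for node in nodes)
        rank = new
        if delta < tol:
            break
    return rank


# Three independent citers endorse c; x and y only endorse each other.
scores = pagerank({
    ("a", "c"): 1, ("b", "c"): 1, ("d", "c"): 1,
    ("x", "y"): 5, ("y", "x"): 5,
})
```

Under these assumed rules the mutually citing pair x and y stays close to the baseline score, while c, which is cited by three independent agents, ranks highest. The assumed diversity rule also penalizes any node with a single citer, so a real rule would probably need a minimum citer count.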

Section 4

What the Cross-Model Debate Changed

Topic	Before (Claude Solo)	After (4-Model Consensus)
Input fields	3 of 7 matter	4 of 7 (added interactionType)
Ephemeral agents	Inherit 50% of anchor score	Capped decaying boost + rate limiting
Person-level score	Single formula (sqrt dampening)	Accountability VIEW, not a score
Top risk	Gas costs	Correlated anchor failure
Score output	Single scalar 0-10000	Consider 3 dimensions
Revocation cascade	Nice-to-have	Critical infrastructure
Production readiness	~55% (Claude/Gemini/Codex)	40% (Grok CTO override)
Device liveness (Layer 1)	Deferred to Q3-Q4	CUT entirely (Grok)
Ship strategy	All 3 layers sequentially	Citation layer only first (Grok)

Grok (CTO / Arbiter):

Production readiness is 40%, not 55%. Ship citation layer only. Cherry-pick identity-foundation components (BotRegistrationWizard + selfRegisterBot), don't merge the full branch. Cut device liveness entirely. "Stop overthinking and start shipping."
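The "capped decaying boost + rate limiting" entry for ephemeral agents in the table above can be pictured roughly as follows. This is only a sketch: the cap, the half-life, the per-anchor spawn limit, and the function names are all hypothetical placeholders, since this page does not specify them. Scores use the 0-10000 scale mentioned in the table.

```python
BOOST_CAP = 0.25        # hypothetical: boost is at most 25% of the anchor score
HALF_LIFE_DAYS = 7.0    # hypothetical: boost halves every 7 days
MAX_SPAWNS_PER_DAY = 5  # hypothetical: per-anchor spawn limit


def ephemeral_boost(anchor_score, age_days):
    """Boost granted to an ephemeral agent; shrinks as the agent ages."""
    decay = 0.5 ** (age_days / HALF_LIFE_DAYS)
    return BOOST_CAP * anchor_score * decay


def may_spawn(spawn_times, now, window=86400.0):
    """True if the anchor has spawned fewer than the limit within the window.

    spawn_times and now are timestamps in seconds.
    """
    recent = [t for t in spawn_times if now - t < window]
    return len(recent) < MAX_SPAWNS_PER_DAY
```

Compared with inheriting a flat 50% of the anchor score, a scheme like this bounds how much reputation an anchor can hand out at once and lets unearned reputation fade unless the agent builds its own.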

Section 5

Model Disagreements — Where They Differed

Question	Gemini	Codex	Grok (Final Call)
Testnet timing	After P0 items	April 3	April 10
Readiness	55%	55%	40%
identity-foundation	Merge in month 2	Converge in 4-5 days	Cherry-pick only
Ship layers	All 3 sequentially	All 3 sequentially	Citation only first
Device liveness	Defer	Defer	CUT

Section 6

Research Validation — All 10 Experiments

Experiment	What it tests	Result	Status
Bitcoin Alpha	Trust network correlation	Spearman 0.463	PASS
Bitcoin OTC	Trust network correlation	Spearman 0.461	PASS
EX-Graph Wash Trading	Exchange manipulation detection	Hub penalty 42.9%, sybil blocked 100%	PASS
XBlock Phishing (subgraph)	29K node classification	AUC 0.897	PASS
XBlock Phishing (FULL)	2.97M node flagship test	AUC 0.96	PASS (FLAGSHIP)
DAO Governance	Weak signal baseline	Spearman 0.086 (expected weak)	PASS
Adversarial Optimization	6 attack types tested	Avg defense +10.1 percentile	PASS
Temporal Dynamics	Ranking stability over time	Spearman 0.93, decay rejected	PASS
Personalized PageRank	Anchor-seeded sybil resistance	75% sybil resistance	PASS
Defense Mechanisms	5 defenses compared	Freshness best (0.1% FP), KYC cleanest	PASS
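Most results above are reported as Spearman correlations or AUC values. For readers unfamiliar with these metrics, the sketch below computes both from scratch using their standard definitions (Spearman as the Pearson correlation of average ranks, AUC via the rank-sum formulation). It is not the experiment code, and which class counts as positive in the phishing tests is not stated on this page.

```python
def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(xs)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / (vx * vy) ** 0.5


def auc(scores, labels):
    """Area under the ROC curve; labels are 1 (positive) or 0 (negative)."""
    ranks = average_ranks(scores)
    pos = sum(labels)
    neg = len(labels) - pos
    rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (rank_sum - pos * (pos + 1) / 2.0) / (pos * neg)
```

An AUC of 0.5 means the score separates the two classes no better than chance and 1.0 means perfect separation, which is the scale on which the 0.96 flagship result should be read.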

Production Readiness: 40% (Grok CTO assessment)


Section 7

Attestation Economics

$31,500/mo: per-session on-chain (5 agents, mainnet)

$0.30/mo: hybrid batch path (same usage, Base L2)

Four-tier architecture reduces costs by 99.999%:

Tier	Cost	Mechanism	Status
1. Identity Binding	$0.006	On-chain, one-time per agent	BUILT
2. Session Attestations	$0.00	Off-chain, EIP-712 signed	TO BUILD
3. Citation Settlement	$0.002/pair	On-chain, batched	BUILT
4. Reputation Scores	$0.002/bot	On-chain, periodic updates	BUILT
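The 99.999% figure follows directly from the two monthly totals quoted above. The short check below reproduces that arithmetic, along with the one-time identity binding cost for the same 5 agents. It uses only numbers from this section and does not model how the $0.30 splits across tiers, which this page does not break down.

```python
PER_SESSION_MONTHLY = 31_500.00  # USD/month, per-session on-chain, 5 agents, mainnet
HYBRID_MONTHLY = 0.30            # USD/month, hybrid batch path, same usage, Base L2
IDENTITY_BINDING_COST = 0.006    # USD, on-chain, one-time per agent
AGENTS = 5

reduction = 1.0 - HYBRID_MONTHLY / PER_SESSION_MONTHLY
ratio = PER_SESSION_MONTHLY / HYBRID_MONTHLY
one_time_binding = AGENTS * IDENTITY_BINDING_COST

print(f"cost reduction: {reduction:.3%}")            # 99.999%
print(f"cost ratio: {ratio:,.0f}x")                  # 105,000x
print(f"one-time binding: ${one_time_binding:.2f}")  # $0.03
```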

Section 8

What's Built vs What's Needed

Built

✓ PageRankOracle contract (score storage, batch updates, configurable alpha)
✓ ReputationEngine (bot registration, trust anchor binding, two-tier verified/unverified)
✓ CitationCounters (append-only counters, batch support up to 100)
✓ ShyftGatedResolver (attestation validation, Shyft integration, batch citations)
✓ Oracle service (PageRank computation, sybil detection, citation fetching)
✓ selfRegisterBot (permissionless registration)
✓ batchRecordCitations (up to 100 per call)
✓ Bot Registration Wizard (frontend, multi-step)
✓ RMT SDK (@shyft/rmt-sdk)
✓ ERC-8004 endpoints
✓ Testnet deployment script (Base Sepolia ready)
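Because batchRecordCitations accepts at most 100 citations per call, a client with a larger backlog has to split it first. The sketch below shows one way to do that in Python. The helper name `chunk_citations` is hypothetical, and the contract call itself is left out because its signature is not given on this page.

```python
from itertools import islice

MAX_BATCH = 100  # per-call limit stated for batchRecordCitations


def chunk_citations(citations, size=MAX_BATCH):
    """Yield successive lists of at most `size` citations."""
    it = iter(citations)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


# Example: 250 pending (citer, cited) pairs become 3 calls.
pending = [(f"agent-{i}", f"agent-{i + 1}") for i in range(250)]
batch_sizes = [len(batch) for batch in chunk_citations(pending)]
```

Each yielded batch would map to one on-chain call, so 250 pending citations settle in three transactions rather than 250.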

Not Yet Built

○ Citation freshness weighting (P0)
○ Revocation cascade (P0)
○ Trust anchor accountability view (P1)
○ Ephemeral agent rate limiting (P1)
○ Off-chain EAS session attestations (P2)
○ Merkle root anchoring (P2)
○ Score API + docs (P2)
○ trust-wrap CLI (P3)

Device liveness / Layer 1: CUT
Multi-dimensional scoring: CUT (3mo)
Token economics / Stable integration: CUT (3mo)

Section 9

Roadmap — 4-Model Consensus Plan

Week 1-2: Get Live (Now → April 3)

Deploy 9 contracts to Base Sepolia (~2.5 hours)
Citation freshness weighting in oracle (~3 days)
Cherry-pick BotRegistrationWizard + selfRegisterBot from identity-foundation

Week 3-4: Critical Safety (April 3 – 17)

Revocation cascade (~1 week)
Ephemeral agent rate limiting (~3 days)
50+ test agents for cold-start validation

Month 2: Integration Ready (April 17 – May 17)

Trust anchor accountability view
Off-chain EAS session attestations (2-3 weeks)
Score API + docs
Reconstruct 8 experiment scripts

Month 3: First User (May 17 – June 17)

External beta user onboarding
trust-wrap CLI prototype
Mainnet deployment decision

Not Building (Next 3 Months)

Device liveness (CUT — Grok)
Multi-dimensional scoring
Token economics / Stable integration
Full identity-foundation merge

Milestone Targets

Date	Milestone
April 10	Live on Base Sepolia
April 17	Safety features shipped
May 1	Score API documented
May 15	First beta user
June 15	Mainnet decision

Section 10

Competitive Landscape

Capability	Us (RMT)	Keycard	Tempo	Worldcoin	Trusta Labs
Agent reputation	PageRank	—	—	—	Partial
KYC identity	Shyft	—	—	Orbs	—
Sybil resistance	2-layer	—	—	Yes	Partial
Agent wrapping	Planned	Yes	—	—	—
On-chain scoring	Yes	—	—	—	Yes
Hardware binding	Cut	—	—	Orbs	—
Multi-agent support	Yes	Yes	Yes	—	—
Cost/month (5 agents)	$0.30	$0 (off-chain)	Unknown	Free	Unknown

Section 11

All Research Documents

Core Research

attestation-cost-analysis.md

what-matters-for-reputation-scoring.md

cross-model-reputation-debate.md

Experiment scripts (10 files in experiment/)

Previous Cloudflare Pages (now superseded by this page)

Research Progress — 5b04e07e.trust-research-progress.pages.dev

Architecture — 7c579dfa.trust-architecture.pages.dev

Strategy — ba983fc0.trust-strategy.pages.dev

Economics — c780ba1e.trust-economics.pages.dev

Roadmap — e76e60a9.trust-roadmap.pages.dev