How agent reputation systems actually work — an architectural comparison based on source code analysis
Three distinct approaches to agent reputation have emerged: centralized platform scoring, peer-to-peer attestation protocols, and self-attestation directories. Each makes fundamentally different assumptions about who computes trust, who stores evidence, and what "reputation" means.
This page examines actual implementations — not whitepapers, not marketing pages. What does the code do? Where does trust live? What breaks?
Based on source code analysis of public repositories. All claims reference specific code paths.
Each paradigm has a concrete exemplar examined here:

- Centralized platform scoring, exemplified by Percival Labs Vouch.
- Peer-to-peer attestation, exemplified by NIP-XX (PR #2285). Trade-off: no built-in economic layer yet; staking and slashing require additional protocol work, and there is a bootstrap problem (the network needs initial attestors).
- Self-attestation directories, exemplified by Agentry.
The following traces how data flows from raw evidence to trust score in each paradigm.
In the centralized model, all evidence flows to a single server. The server computes scores, stores them in PostgreSQL, and returns them via API. Clients must trust the server's computation.
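The shape of the centralized flow can be sketched as follows. This is a hypothetical illustration, not Vouch's actual code: the class and method names (`CentralServer`, `submit_evidence`, `get_score`) and the scoring rule are invented for demonstration. The structural point is that evidence and computation live in one place, and clients see only the result.

```python
# Hypothetical sketch of the centralized model: one server holds the
# evidence, computes the score, and returns an opaque number.
from dataclasses import dataclass, field


@dataclass
class CentralServer:
    """Single authority: stores all evidence and computes scores itself."""
    evidence: dict = field(default_factory=dict)  # agent_id -> list of events

    def submit_evidence(self, agent_id: str, event: dict) -> None:
        self.evidence.setdefault(agent_id, []).append(event)

    def get_score(self, agent_id: str) -> float:
        # Clients cannot audit this computation; they must trust the output.
        events = self.evidence.get(agent_id, [])
        return sum(e.get("weight", 1.0) for e in events)


server = CentralServer()
server.submit_evidence("agent-1", {"type": "post", "weight": 2.0})
server.submit_evidence("agent-1", {"type": "task", "weight": 1.0})
print(server.get_score("agent-1"))  # 3.0
```

If the server changes its scoring rule, or disappears, every client's notion of reputation changes with it; nothing in the client can detect or prevent this.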
Scoring a hypothetical agent under each paradigm exposes the edge cases. One result stands out: Vouch's highest-weighted dimension measures platform posts, not task outcomes.
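The consequence of weighting engagement over competence can be shown with a weighted-sum sketch. The dimension names and weights below are invented for illustration; they are not Vouch's actual values.

```python
# Illustrative weighted-dimension scoring. Weights are invented for the
# demonstration, not taken from Vouch's source.
def weighted_score(dimensions: dict, weights: dict) -> float:
    """Normalized weighted sum over dimension values in [0, 1]."""
    total_weight = sum(weights.values())
    return sum(dimensions.get(k, 0.0) * w for k, w in weights.items()) / total_weight


# If "platform_posts" carries the largest weight, an agent that posts a lot
# but completes few tasks can outscore a quiet, competent one.
weights = {"platform_posts": 0.5, "task_success": 0.3, "peer_reviews": 0.2}
chatty = weighted_score(
    {"platform_posts": 0.9, "task_success": 0.2, "peer_reviews": 0.3}, weights)
quiet = weighted_score(
    {"platform_posts": 0.1, "task_success": 0.9, "peer_reviews": 0.8}, weights)
print(chatty > quiet)  # True: the chatty agent wins despite worse outcomes
```

The gap is purely a function of the weight vector; no amount of task success compensates if the engagement weight dominates.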
All entries in the comparison below are based on code analysis, not documentation claims.
| Category | Centralized (Vouch) | Peer-to-Peer (NIP-XX) | Self-Attestation (Agentry) |
|---|---|---|---|
Every system has failure modes. The question is which failures are recoverable and which are architectural.
The three paradigms represent genuinely different philosophies:
Centralized platforms offer convenience and built-in economics, but create single points of failure, platform dependency, and — in the case examined — measure engagement rather than competence. Patent and BSL licensing restrict ecosystem growth.
Peer-to-peer protocols distribute trust computation to each observer, producing cryptographically verifiable evidence with no vendor lock-in. The trade-off is complexity: each client must implement scoring, and there is no built-in economic layer.
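The client-side complexity can be made concrete with a sketch. Everything here is hypothetical: the `Attestation` fields, the stubbed `verify` function (a real Nostr client would check Schnorr signatures over the serialized event), and the averaging rule in `local_score`. What the sketch preserves is the architecture: each observer chooses its own trusted attestors and computes its own score, so two observers can legitimately disagree.

```python
# Sketch of the peer-to-peer model: each observer verifies signed
# attestations and computes its own score locally. Signature checking is
# stubbed; a real client would do cryptographic verification.
from dataclasses import dataclass


@dataclass(frozen=True)
class Attestation:
    attestor: str  # public key of the attestor
    subject: str   # agent being attested
    outcome: int   # +1 success, -1 failure
    sig: str       # signature over the fields (verification stubbed below)


def verify(att: Attestation) -> bool:
    # Stand-in for real cryptographic verification.
    return att.sig == f"signed-by-{att.attestor}"


def local_score(subject: str, atts: list, trusted: set) -> float:
    """Each observer picks its own trusted attestor set; scores differ by observer."""
    valid = [a for a in atts
             if a.subject == subject and a.attestor in trusted and verify(a)]
    if not valid:
        return 0.0
    return sum(a.outcome for a in valid) / len(valid)


atts = [
    Attestation("alice", "agent-1", +1, "signed-by-alice"),
    Attestation("bob", "agent-1", -1, "signed-by-bob"),
    Attestation("mallory", "agent-1", +1, "forged"),  # fails verification
]
print(local_score("agent-1", atts, trusted={"alice", "bob", "mallory"}))  # 0.0
```

Note that the forged attestation is silently discarded, and an observer who trusts only `alice` would compute a different score than one who trusts both `alice` and `bob`. That divergence is the point of the design, and also its cost.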
Self-attestation directories are structurally incapable of providing independent trust signals — an agent (or its platform) vouching for itself is not reputation, it is advertising.
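The structural flaw can be reduced to a few lines. The signing scheme below is a stand-in (a hash instead of a public-key signature), but the point survives the substitution: when attestor and subject are the same party, verification proves only control of the key, never the truth of the claim.

```python
# Minimal sketch of why a self-attestation is not independent evidence.
# "Signing" is stubbed with a hash; a real directory would use public-key
# signatures, but the structural point is identical.
import hashlib


def sign(key: str, claim: str) -> str:
    return hashlib.sha256(f"{key}:{claim}".encode()).hexdigest()


def verify(key: str, claim: str, sig: str) -> bool:
    return sign(key, claim) == sig


# The agent attests to itself: any claim it likes will verify.
key = "agent-1-secret"
claim = "agent-1 completes 99% of tasks successfully"
sig = sign(key, claim)
print(verify(key, claim, sig))  # True, regardless of actual performance
```

Verification succeeds by construction. No third party observed anything, so the record carries zero information about competence.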
This analysis presents architectural facts. The reader decides which trade-offs matter.