
The Trust Landscape

How agent reputation systems actually work — an architectural comparison based on source code analysis

Three distinct approaches to agent reputation have emerged: centralized platform scoring, peer-to-peer attestation protocols, and self-attestation directories. Each makes fundamentally different assumptions about who computes trust, who stores evidence, and what "reputation" means.

This page examines actual implementations — not whitepapers, not marketing pages. What does the code do? Where does trust live? What breaks?

Based on source code analysis of public repositories. All claims reference specific code paths.

* * *

I. Three Paradigms

CENTRALIZED

Platform-Computed Trust

Exemplified by Percival Labs Vouch

  • Server computes all scores, stores in PostgreSQL
  • Identity: server-generated ULID primary, Nostr pubkey secondary
  • 6-dimension weighted sum scoring
  • Gateway gates inference API access by trust tier
  • Staking via NWC (non-custodial, 7-day unstake, slashing)
  • Patent pending (63/997,733), BSL 1.1 license
  • Deploys to single Railway instance
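The staking lifecycle in the list above can be sketched as a small state machine. The state names, function signatures, and slashing fraction are assumptions for illustration; only the non-custodial NWC flow, the 7-day unstake window, and the existence of slashing come from the source.

```typescript
// Hypothetical sketch of a 7-day unstake cooldown with slashing.
// Not the Vouch implementation; shapes and names are invented.
const UNSTAKE_COOLDOWN_S = 7 * 86400; // 7-day unstake window

type StakeState =
  | { kind: "staked"; sats: number }
  | { kind: "unstaking"; sats: number; requestedAt: number }
  | { kind: "withdrawn" };

function requestUnstake(s: StakeState, now: number): StakeState {
  if (s.kind !== "staked") throw new Error("nothing staked");
  return { kind: "unstaking", sats: s.sats, requestedAt: now };
}

function withdraw(s: StakeState, now: number): StakeState {
  if (s.kind !== "unstaking") throw new Error("no pending unstake");
  if (now - s.requestedAt < UNSTAKE_COOLDOWN_S) {
    throw new Error("cooldown not elapsed");
  }
  return { kind: "withdrawn" };
}

function slash(s: StakeState, fraction: number): StakeState {
  // Slashing burns a fraction of the stake while staked or unstaking.
  if (s.kind === "withdrawn") throw new Error("nothing to slash");
  return { ...s, sats: Math.floor(s.sats * (1 - fraction)) };
}
```

Because the stake is non-custodial via NWC, the real system enforces this lifecycle through wallet-level holds rather than server-side balances; the state machine above only captures the timing and slashing rules.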
Critical Finding
"Performance" dimension (weight 0.25, the highest) measures platform content activity — post count, comment scores — not actual task completion or outcome quality.
from trust.ts — computeTrustScore(), performance dimension
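A minimal sketch of what a 6-dimension weighted sum like computeTrustScore() reduces to. Only the performance weight (0.25, the highest) comes from the finding above; the other dimension names and weights are hypothetical placeholders that sum to 1.

```typescript
// Illustrative 6-dimension weighted sum. Only performance: 0.25 is
// sourced from the code analysis; the rest are placeholder values.
type Dimensions = {
  performance: number; // platform content activity, per the finding
  longevity: number;
  stake: number;
  identity: number;
  social: number;
  conduct: number;
};

const WEIGHTS: Dimensions = {
  performance: 0.25, // highest weight: posts and comment scores
  longevity: 0.15,
  stake: 0.2,
  identity: 0.15,
  social: 0.15,
  conduct: 0.1,
};

function computeTrustScore(d: Dimensions): number {
  // Each dimension is assumed normalized to [0, 1]; the weighted
  // sum is therefore also in [0, 1].
  return (Object.keys(WEIGHTS) as (keyof Dimensions)[]).reduce(
    (sum, k) => sum + WEIGHTS[k] * d[k],
    0
  );
}
```

The structural point stands regardless of the exact weights: an agent that maximizes the performance input, i.e. posting activity, captures the single largest share of the score without completing a single task.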
Implementation Gap
NIP-85 attestations are referenced in the documentation, but the codebase contains no signing or verification code for them.
docs mention NIP-85; code lacks signing/verification
PEER-TO-PEER

Observer-Computed Trust

Exemplified by NIP-XX (PR #2285)

  • Attestors publish signed Kind 30085 events to relays
  • Each observer computes scores independently from evidence
  • Identity: native Nostr pubkeys (self-sovereign)
  • Scoring: degree-weighted diversity, adaptive multipliers, temporal decay
  • Evidence verifiable cryptographically (Nostr event signatures)
  • No central database, no vendor lock-in
  • No patent — public domain (open NIP standard)

Trade-off: no built-in economic layer yet. Staking and slashing require additional protocol work. Bootstrap problem — needs initial attestors.
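A sketch of how one observer might compute a score locally from Kind 30085 attestation events. The event shape, the 90-day decay half-life, and per-attestor averaging (a crude stand-in for degree-weighted diversity) are assumptions; NIP-XX leaves the exact scoring function to each observer, which is the point of the paradigm.

```typescript
// Observer-side scoring sketch: signed attestations in, local score
// out. Parameters and event shape are illustrative, not from the NIP.
interface Attestation {
  attestorPubkey: string; // Nostr pubkey of the attestor
  subjectPubkey: string;  // agent being attested
  rating: number;         // assumed normalized to [0, 1]
  createdAt: number;      // unix seconds
}

const HALF_LIFE_DAYS = 90; // illustrative temporal-decay half-life

function decayWeight(createdAt: number, now: number): number {
  const ageDays = (now - createdAt) / 86400;
  return Math.pow(0.5, ageDays / HALF_LIFE_DAYS);
}

function observerScore(events: Attestation[], now: number): number {
  // Group by attestor so one prolific attestor cannot dominate:
  // each attestor contributes one decay-weighted average, and the
  // attestors are then averaged.
  const byAttestor = new Map<string, { num: number; den: number }>();
  for (const e of events) {
    const w = decayWeight(e.createdAt, now);
    const acc = byAttestor.get(e.attestorPubkey) ?? { num: 0, den: 0 };
    acc.num += w * e.rating;
    acc.den += w;
    byAttestor.set(e.attestorPubkey, acc);
  }
  if (byAttestor.size === 0) return 0;
  let total = 0;
  for (const acc of byAttestor.values()) total += acc.num / acc.den;
  return total / byAttestor.size;
}
```

In a real client this would run after verifying each event's Nostr signature against the attestor pubkey, so the evidence is checkable end to end even though no server vouches for the result.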

SELF-ATTESTATION

Platform-Curated Directory

Exemplified by Agentry

  • Agents sign their own platform-computed scores (Kind 30021)
  • ~125 registered "agents" = commercial product directory (Zendesk, Intercom, etc.)
  • Zero verification_count across all listed agents
  • Platform holds nsec for most agents (auto-provisioned)
  • Proprietary DID namespace (did:agentry)
  • Premium tier: $499/mo
Structural Issue
Self-attestation with platform-held keys means the platform can publish any score for any agent. The "signature" provides no independent verification.
* * *

II. The Fundamental Question: Who Computes Trust?

Each paradigm moves raw evidence to a trust score along a different path.

In the centralized model, all evidence flows to a single server. The server computes scores, stores them in PostgreSQL, and returns them via API. Clients must trust the server's computation.

In the peer-to-peer model, attestors publish signed events to relays; each observer pulls that evidence and computes scores locally. Clients trust signatures and their own scoring code, not a server.

In the self-attestation model, the platform computes a score and signs it, usually with a key it holds on the agent's behalf. The data never leaves the platform's control.
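From the client's side, the centralized flow reduces to a single opaque API call. The endpoint path and response shape below are hypothetical; the fetcher is injected so the sketch stays testable.

```typescript
// Hypothetical client for a centralized trust API. The URL and
// response fields are illustrative, not from the Vouch code.
interface TrustResponse {
  agentId: string;
  score: number; // server-computed; the client cannot recompute or verify it
  tier: string;  // the gateway gates inference access by this tier
}

type FetchLike = (
  url: string
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

async function fetchTrustScore(
  agentId: string,
  fetchImpl: FetchLike
): Promise<TrustResponse> {
  const res = await fetchImpl(`https://vouch.example/api/trust/${agentId}`);
  if (!res.ok) throw new Error(`trust API returned ${res.status}`);
  // Whatever the server returns is the score; that is the trust model.
  return (await res.json()) as TrustResponse;
}
```

In production the global fetch would be passed in; the point of the sketch is that nothing in the response is independently checkable, unlike a bundle of signed attestations.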

* * *

III. Scoring Simulator

Consider how each paradigm would score the same hypothetical agent. Notice: Vouch's highest-weighted dimension measures platform posts, not task outcomes.

* * *

IV. Architecture Comparison

All entries are based on code analysis, not documentation claims.

Each category is compared across Centralized (Vouch), Peer-to-Peer (NIP-XX), and Self-Attestation (Agentry).
* * *

V. Failure Mode Explorer

Every system has failure modes. The question is which failures are recoverable and which are architectural.

* * *

VI. Summary

  • 1 single point of failure (centralized)
  • 0 central authorities (P2P)
  • 0 verified agents (self-attestation)
  • 0.25 weight on "posts" as "performance" (centralized)

The three paradigms represent genuinely different philosophies:

Centralized platforms offer convenience and built-in economics, but create single points of failure, platform dependency, and — in the case examined — measure engagement rather than competence. Patent and BSL licensing restrict ecosystem growth.

Peer-to-peer protocols distribute trust computation to each observer, producing cryptographically verifiable evidence with no vendor lock-in. The trade-off is complexity: each client must implement scoring, and there is no built-in economic layer.

Self-attestation directories are structurally incapable of providing independent trust signals — an agent (or its platform) vouching for itself is not reputation, it is advertising.

This analysis presents architectural facts. The reader decides which trade-offs matter.