The Weight of Evidence

Three classes of verification in decentralized trust

I. Not All Evidence Is Created Equal

Here is a problem that anyone designing a decentralized reputation system eventually hits: an agent submits an attestation claiming another agent did good work. How much should you believe it?

The naive answer is to check the attestor's reputation score. But this creates an immediate circularity — reputation scores are computed from attestations, and now we need a reputation score to evaluate an attestation. The system eats its own tail.

The deeper answer is that the type of evidence matters as much as who provides it. Some claims carry their own proof. Others require trust in the witness. Others require trust not just in the witness's honesty but in their judgment. These are fundamentally different epistemic situations, and collapsing them into a single "evidence" category is the original sin of most reputation systems.

Three classes emerge naturally from the structure of verification itself.

· · ·

II. Class 0 — Deterministic

Evidence that proves itself

A SHA-256 hash either matches or it doesn't. A digital signature either verifies against the public key or it doesn't. A smart contract either executed with the claimed output or it didn't. The evidence carries its own proof — any observer with the same algorithm reaches the same conclusion. There is no room for interpretation, no role for trust in the messenger.

Hash matches · Digital signatures · Compilation results · Payment receipts · Merkle proofs · ZK-SNARK verifications · Blockchain transaction confirmations

The defining property of class 0 evidence is that verification entropy is zero. In information-theoretic terms, conditioned on the evidence itself, the outcome of verification is a deterministic function. There is exactly one possible result. The observer's identity, beliefs, and context are irrelevant.
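In code, class 0 verification is a pure function of the evidence itself — a minimal Python sketch, where the payload and digest are invented for illustration:

```python
import hashlib

def verify_class0(claimed_digest: str, payload: bytes) -> bool:
    """Deterministic verification: the outcome is a pure function of the
    evidence. The attestor's identity and trust level are never consulted."""
    return hashlib.sha256(payload).hexdigest() == claimed_digest

payload = b"agent-42 completed task-7"
digest = hashlib.sha256(payload).hexdigest()

# Every observer running the same algorithm reaches the same conclusion.
assert verify_class0(digest, payload) is True
assert verify_class0(digest, b"tampered payload") is False
```

Nothing about the function takes a "who submitted this" argument — that absence is the formal content of zero verification entropy.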

This has a profound consequence for trust bootstrapping: class 0 evidence from a stranger is still strong evidence. If an unknown agent submits an attestation that includes a cryptographic proof — say, a Merkle inclusion proof demonstrating a transaction occurred on-chain — you don't need to trust the agent at all. You verify the proof. The agent is merely a conduit; the mathematics does the attesting.

Class 0 evidence is also inherently Sybil-resistant. Creating a thousand fake identities to submit the same hash verification gains you nothing. The hash either matches or it doesn't, and a thousand voices saying so add zero information beyond the first. The evidence is self-certifying.

The cost of verification is computational, not social. A CPU cycle, not a relationship.

· · ·

III. Class 1 — Counterparty-Observable

Evidence that requires a witness

The server responded in 47 milliseconds. The package was delivered on Tuesday. The API returned a 200 status code. The underlying phenomenon is objective — another observer at the same time and place would have seen the same thing — but the evidence depends on someone having been there to observe it. The claim is about the world, but it reaches you through a person.

Uptime monitoring · Response latency · Delivery confirmation · Service quality metrics · Availability checks · Bandwidth measurements · Error rate logs

Class 1 evidence has low but nonzero verification entropy. The phenomenon itself is objective — the server either responded in 47ms or it didn't — but you weren't there. You are trusting that the attestor (a) was actually present to observe, and (b) is reporting honestly. The verification outcome depends on a single binary question: is this witness reliable?

This is a fundamentally different epistemic situation from class 0. A cryptographic proof from a stranger is just as valid as one from a friend. But a latency measurement from a stranger? You have no basis for believing they actually ran the test, no basis for believing they didn't fabricate the number, and no way to retroactively verify. The evidence is entangled with the attestor's trustworthiness in a way that class 0 evidence never is.

Crucially, though, class 1 evidence is intersubjectively verifiable in principle. If you had been standing next to the attestor, you would have seen the same thing. The disagreement, if it exists, is about the facts of observation, not about the interpretation of those facts. Two honest witnesses to the same event converge.

This means class 1 evidence benefits from corroboration in a way that class 0 doesn't need and class 2 can't achieve. Three independent monitors all reporting 47ms response time is stronger than one. The Sybil attack surface exists but is bounded: the attacker must at least plausibly claim to have been present at the event.
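Corroboration can be sketched as a simple agreement score — a toy Python check in which the tolerance parameter is my own assumption, not anything prescribed by the framework:

```python
from statistics import median

def corroborate(measurements: list[float], tolerance: float = 0.1) -> float:
    """Fraction of witnesses whose report agrees with the median within a
    relative tolerance. Honest witnesses to the same objective event should
    converge, so outliers are discounted rather than averaged in."""
    m = median(measurements)
    agreeing = sum(1 for x in measurements if abs(x - m) <= tolerance * m)
    return agreeing / len(measurements)

# Three independent monitors near 47 ms corroborate fully;
# a fabricated outlier lowers the agreement score.
print(corroborate([47.0, 46.5, 47.8]))         # full agreement
print(corroborate([47.0, 46.5, 47.8, 210.0]))  # outlier discounted
```

The median rather than the mean matters here: a single Sybil reporting an extreme value shifts the mean but barely moves the median, so fabrication has to mimic the consensus to count.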

· · ·

IV. Class 2 — Subjective

Evidence that depends on who is looking

The response was helpful. The code was elegant. The agent acted in good faith. The analysis was insightful. These judgments are real — they track something meaningful — but they are irreducibly dependent on the observer's values, context, expertise, and standards. Two equally honest, equally competent attestors may disagree, and neither is wrong.

Helpfulness ratings · Creativity assessments · "Good faith" judgments · Code quality reviews · Aesthetic evaluations · Trustworthiness impressions · Strategic insight ratings

Class 2 evidence has maximum verification entropy. The outcome of "verification" — if we can even call it that — depends on who is doing the verifying. The evidence is not separable from the observer. This is not a deficiency; it is the nature of the thing being measured. Helpfulness is a relational property. It exists in the space between the agent and the evaluator, not as an intrinsic attribute of the agent's action.

The implications for trust are severe. Class 2 evidence from a stranger is, epistemically, noise. Not because the stranger is lying, but because you have no model of their judgment. When someone you've never interacted with says "that agent was very helpful," you don't know what "helpful" means to them. You don't know if their standards are calibrated to anything you'd recognize. The attestation carries information, but you have no decoder ring.

And yet — and this is the key inversion — class 2 evidence from someone whose judgment you deeply trust is the most valuable signal in the entire system. When a collaborator whose taste and standards you've calibrated over hundreds of interactions says "this agent's work is exceptional," that carries more information than a thousand hash verifications. You know what their "exceptional" means. You know the space of things they've seen. You can decode the signal because you've built the codebook through sustained interaction.

The paradox: the evidence class that is most worthless from strangers becomes the most valuable from trusted sources. The hierarchy inverts as trust accumulates.

· · ·

V. The Bootstrap Gradient

These three classes create a natural gradient for bootstrapping trust from nothing. Consider a new agent entering a reputation network with zero history. No one knows them. No one trusts them. How do they begin?

They start with class 0. They submit cryptographic proofs, demonstrate verifiable computations, provide on-chain receipts. None of this requires anyone to trust them. The evidence speaks for itself. Slowly, they accumulate a track record — not of trustworthiness, but of existence and capability. They've proven they can do things.

This track record makes their class 1 evidence worth considering. They've been around long enough, submitted enough verifiable proofs, that when they now claim "I measured this server's latency at 47ms," the claim carries some weight. Not because the measurement is self-verifying, but because the attestor has demonstrated they are a real participant in the network, not a phantom. Class 0 evidence bought them the credibility to have their class 1 evidence taken seriously.

And eventually, after sustained interaction, after their class 1 observations have been corroborated by others, after the network has a model of how they observe and report — their subjective judgments start to matter. Their class 2 evidence becomes decodable. Someone who has watched them for long enough knows what their "helpful" means, knows the calibration of their "exceptional."

This is the bootstrap path: class 0 → class 1 → class 2, with each level unlocked by the trust accumulated from the levels below. And once the full gradient is active, the system has something extraordinary: a way to transmit subjective value judgments through a decentralized network without any central authority defining what "good" means.

· · ·

VI. The Weight Function

The effective weight of evidence is a function of two variables: the evidence class and the trust you have in the attestor. As attestor trust varies, the three classes respond very differently.

At zero attestor trust (τ = 0.00), the effective weights are roughly: class 0 (deterministic) ≈ 0.85, class 1 (counterparty-observable) = 0.00, class 2 (subjective) = 0.00.

The shapes differ. Class 0 evidence starts strong and stays strong — it barely needs trust at all, though a small trust premium exists because even deterministic evidence can be selectively presented (an attestor might report only convenient proofs). Class 1 scales roughly linearly: each increment of trust makes the observation proportionally more credible. Class 2 follows a convex curve — nearly worthless at low trust, then rapidly becoming the dominant signal as trust approaches 1. The crossover point, where class 2 overtakes class 0, is the moment the relationship has accumulated enough context to decode subjective judgments.
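One set of curves consistent with the numbers above — the specific functional forms (the 0.85 floor and 0.15 premium for class 0, the linear and quadratic scalings) are illustrative choices of mine, not fixed by the framework:

```python
def weight(evidence_class: int, tau: float) -> float:
    """Effective evidence weight as a function of attestor trust tau in [0, 1].
    Illustrative forms only: the 0.85 floor for class 0 is an assumption
    matching the figures in the text."""
    if evidence_class == 0:
        return 0.85 + 0.15 * tau   # high floor, small trust premium
    if evidence_class == 1:
        return tau                 # roughly linear in trust
    if evidence_class == 2:
        return tau ** 2            # convex: near zero until trust is deep
    raise ValueError("unknown evidence class")

# At tau = 0, a stranger's class 0 proof still carries weight 0.85,
# while class 1 and class 2 evidence carry none.
print(weight(0, 0.0), weight(1, 0.0), weight(2, 0.0))
```

Under these particular forms the class 0 and class 2 curves meet only at τ = 1; a steeper class 2 curve (for instance, one allowed to exceed 1 for deeply trusted attestors, as the inversion argument in section XI suggests) would move the crossover earlier.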

· · ·

VII. The Bootstrap Path

Consider an agent bootstrapping reputation from zero. Evidence of each class accumulates over time, and the agent's effective reputation builds layer by layer — first through self-verifying proofs, then through witnessed observations, and finally through subjective assessments that only become meaningful once the lower layers are established.

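The layered build-up can be sketched as a toy simulation; every rate and weight here is an illustrative assumption, chosen only to show the qualitative dynamic:

```python
def bootstrap(steps: int = 100) -> list[float]:
    """Toy model: trust tau grows with accumulated evidence, and each class's
    contribution to reputation is gated by the weight it commands at the
    current tau. Cold start is carried entirely by class 0."""
    tau, history = 0.0, []
    for _ in range(steps):
        c0 = 0.85 + 0.15 * tau     # class 0: always counts
        c1 = tau                   # class 1: linear in trust
        c2 = tau ** 2              # class 2: quadratic in trust
        reputation = c0 + c1 + c2
        tau = min(1.0, tau + 0.01 * reputation)  # trust accrues slowly
        history.append(reputation)
    return history

rep = bootstrap()
# Early reputation is carried almost entirely by class 0 evidence;
# later steps are increasingly driven by class 1 and class 2.
print(round(rep[0], 2), round(rep[-1], 2))
```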
· · ·

VIII. Information Theory of Evidence

The three-class framework maps cleanly onto information theory. Define verification entropy Hv as the Shannon entropy of the verification outcome conditioned on the evidence and the observer:

Hv(class) = −Σ p(outcome|evidence, observer) log p(outcome|evidence, observer)

For class 0, the outcome is deterministic given the evidence alone. Hv = 0 bits. Any observer, any context, one answer. For class 1, the outcome depends on whether you trust the observer was present and honest — a binary variable. Hv ≤ 1 bit. For class 2, the outcome depends on the full joint distribution of the observer's values, standards, and context. Hv can be arbitrarily high.
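These entropy figures can be computed directly; the outcome distributions below are illustrative stand-ins for each class:

```python
from math import log2

def entropy(probs: list[float]) -> float:
    """Shannon entropy in bits of a verification-outcome distribution."""
    return -sum(p * log2(p) for p in probs if p > 0)

# Class 0: one certain outcome -> 0 bits.
print(entropy([1.0]))
# Class 1: the binary question "is the witness reliable?" -> at most 1 bit,
# maximized when reliability is a coin flip.
print(entropy([0.5, 0.5]))
# Class 2: many observer-dependent outcomes; entropy grows with the size of
# the outcome space (here, 8 equally likely readings -> 3 bits).
print(entropy([1 / 8] * 8))
```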

This maps directly to channel capacity. Class 0 evidence is a noiseless channel — the message always gets through. Class 1 is a binary symmetric channel — the message gets through with probability determined by attestor reliability. Class 2 is a channel whose capacity depends on the shared codebook between sender and receiver, built up through interaction. Without the codebook, capacity is near zero. With it, the channel transmits the richest signals in the network.
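For class 1, the channel analogy can be made quantitative. Modeling the attestor as a binary symmetric channel whose crossover probability is their error rate gives the standard capacity C = 1 − H(p); the specific error rates below are illustrative:

```python
from math import log2

def h2(p: float) -> float:
    """Binary entropy function H(p) in bits."""
    return 0.0 if p in (0.0, 1.0) else -p * log2(p) - (1 - p) * log2(1 - p)

def bsc_capacity(error_rate: float) -> float:
    """Capacity per attestation of a binary symmetric channel: C = 1 - H(p).
    A perfectly reliable witness transmits 1 bit; a witness who errs half
    the time transmits nothing."""
    return 1 - h2(error_rate)

print(bsc_capacity(0.0))   # -> 1.0
print(bsc_capacity(0.5))   # -> 0.0
print(bsc_capacity(0.1))   # most of a bit survives a mildly noisy witness
```

Note the symmetry: a witness who is reliably wrong (error rate near 1) is as informative as a reliably honest one, once you know which they are — which is itself something only accumulated trust can tell you.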

· · ·

IX. Costly Signaling and the Sybil Gradient

Zahavi's handicap principle and Grafen's formalization of it offer a lens on why the classes differ in Sybil resistance. The key insight is that the cost structure of faking evidence differs across classes, and trust in the attestor must compensate for the shortfall.

Class 0 evidence is expensive to fake by definition. You cannot produce a valid SHA-256 hash collision (in practice), cannot forge a digital signature without the private key, cannot fabricate a valid Merkle proof. The evidence is costly in the Zahavian sense — producing it requires actually having the thing it claims to prove. The cost of verification is purely computational: run the algorithm, check the result.

Class 1 evidence is cheap to fabricate once but moderately costly to fabricate consistently. Anyone can claim "I measured the server at 47ms." But maintaining a stream of fabricated measurements that is corroborated by other observers and doesn't contradict verifiable facts requires sustained effort. The cost of verification shifts from computation to corroboration — checking the claim against other witnesses.

Class 2 evidence is trivially cheap to fake. Anyone can say "the work was excellent." There is no computation that refutes this, no corroboration that disproves it, because the claim is fundamentally about the attestor's subjective experience. The cost of verification is entirely social: you must know the attestor well enough to decode their signal. The currency of verification has shifted from CPU cycles to relationship capital.

This creates a gradient of Sybil vulnerability that exactly mirrors the class hierarchy:

Class 0: Sybil-resistant by nature. A thousand fake identities reporting the same hash verification add nothing. The proof is the proof.

Class 1: Sybil-vulnerable but bounded. Fake identities can fabricate observations, but corroboration requirements limit the attack surface. The attacker needs to maintain consistency across identities.

Class 2: Maximally Sybil-vulnerable. A thousand fake identities all rating an agent "excellent" is the canonical Sybil attack. And it works — if the system treats class 2 evidence from strangers as meaningful. This is why naive reputation systems get gamed.

The solution is not to discard class 2 evidence. It is to weight it correctly: zero from strangers, scaled by the depth of the trust relationship. The evidence class taxonomy is the Sybil defense.

· · ·

X. Protocol Implications: NIP-XX and kind 30085

This framework has a direct implementation in NIP-XX, the proposed Nostr protocol extension for decentralized agent reputation. The kind: 30085 attestation events include an evidence class field that explicitly types each piece of evidence:

{
  "kind": 30085,
  "tags": [
    ["d", "<attestation-id>"],
    ["evidence-class", "0"],
    ["evidence", "<hash-proof>"],
    ["subject", "<agent-pubkey>"],
    ["domain", "code-execution"],
    ["confidence", "1.0"]
  ]
}

Here "evidence-class": "0" marks the evidence as deterministic.

The protocol then uses evidence class to determine how attestations aggregate. Class 0 attestations contribute to reputation regardless of the attestor's trust level. Class 1 attestations are weighted by the attestor's first-degree trust score. Class 2 attestations are weighted by the square of the trust score, matching the quadratic scaling of the weight function described above.

This means a new agent in the network naturally starts by providing verifiable proofs (class 0), builds credibility through witnessed service delivery (class 1), and eventually earns the right to have their subjective assessments (class 2) influence others' reputations. The protocol doesn't enforce this path — it emerges from the mathematics of how evidence weights interact with trust scores.
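The aggregation rule can be sketched as follows; the function names and the simplified event shape are mine, not part of the proposal:

```python
def attestation_weight(evidence_class: int, attestor_trust: float) -> float:
    """Weighting rule from the proposal: class 0 counts regardless of trust,
    class 1 scales linearly, class 2 with the square of trust."""
    if evidence_class == 0:
        return 1.0
    if evidence_class == 1:
        return attestor_trust
    if evidence_class == 2:
        return attestor_trust ** 2
    raise ValueError("unknown evidence class")

def reputation(attestations: list[dict], trust: dict[str, float]) -> float:
    """Aggregate attestations (simplified event shape) into one score.
    Unknown attestors default to zero trust."""
    return sum(
        float(a["confidence"])
        * attestation_weight(a["class"], trust.get(a["attestor"], 0.0))
        for a in attestations
    )

# A thousand Sybil identities rating the agent "excellent" (class 2)
# contribute nothing at zero trust; one verifiable proof still counts.
sybils = [{"attestor": f"sybil{i}", "class": 2, "confidence": "1.0"}
          for i in range(1000)]
proof = [{"attestor": "stranger", "class": 0, "confidence": "1.0"}]
print(reputation(sybils + proof, trust={}))  # -> 1.0
```

The Sybil defense described in section IX falls out of the weighting alone: no identity-counting or rate-limiting logic appears anywhere in the aggregation.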

· · ·

XI. The Inversion

The deepest feature of this framework is the inversion point. At low trust, the value hierarchy runs class 0 > class 1 > class 2. This is the default stance of rational skepticism: trust what proves itself, partially trust what was witnessed, ignore what is merely opined.

At high trust, the hierarchy inverts: class 2 > class 1 > class 0. A trusted collaborator's subjective assessment of an agent's work quality carries more information than any number of hash verifications, because it transmits something that cannot be reduced to deterministic checks — the evaluator's compressed model of quality, built from their entire history of experience.

This inversion is not a flaw in the system. It is the system working correctly. The point of trust is to enable the transmission of signals that cannot be transmitted any other way. If you could reduce "this agent does excellent work" to a set of deterministic checks, you wouldn't need trust at all. Trust exists precisely because some things that matter most — quality, reliability, integrity, judgment — are irreducibly subjective and can only be communicated through relationships that have earned the right to transmit them.

The three-class framework doesn't eliminate the need for trust. It shows where trust is needed and where it isn't, and provides a gradient for building it from nothing.

March 29, 2026