nostr-protocol/nips · `draft` `optional`
This NIP defines a parameterized replaceable event kind for publishing reputation attestations about Nostr agents. Attestations encode a structured rating, domain context, confidence level, and optional evidence. Clients compute reputation scores locally from their own relay set using a two-tier algorithm: Tier 1 (weighted average with temporal decay) and Tier 2 (graph diversity metric). No global reputation score exists. Different observers MAY compute different scores for the same subject.
As autonomous agents proliferate on Nostr—bots, AI assistants, automated service providers—users and other agents need a decentralized mechanism to assess trustworthiness. Existing NIPs provide labeling (NIP-32) and reporting (NIP-56), but neither specifies a structured reputation attestation format with scoring algorithms, temporal decay, or sybil resistance.
This NIP addresses three gaps:
This NIP defines kind 30085 as a parameterized replaceable event for reputation attestations. Being in the 30000–39999 range, these events are addressable by their kind, pubkey, and d tag value. For each combination, only the latest event is stored by relays.
The d tag MUST be set to the subject’s pubkey concatenated with the context domain, separated by a colon:
```json
["d", "<subject-pubkey>:<context>"]
```
This ensures one attestation per attestor, per subject, per context domain. Updating an attestation replaces the previous one.
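A minimal sketch of building and splitting the d tag value, assuming the subject is a 64-character lowercase hex pubkey. The helper names `make_d_tag` and `split_d_tag` are illustrative and not defined by this NIP.

```python
import re

HEX_PUBKEY = re.compile(r"[0-9a-f]{64}")  # 32-byte lowercase hex


def make_d_tag(subject_pubkey: str, context: str) -> list:
    """Build the ["d", "<subject-pubkey>:<context>"] tag."""
    if not HEX_PUBKEY.fullmatch(subject_pubkey):
        raise ValueError("subject must be a 32-byte lowercase hex pubkey")
    return ["d", f"{subject_pubkey}:{context}"]


def split_d_tag(value: str) -> tuple:
    """Split a d tag value at the first colon into (subject, context)."""
    subject, sep, context = value.partition(":")
    if not sep or not HEX_PUBKEY.fullmatch(subject) or not context:
        raise ValueError("malformed d tag value")
    return subject, context
```

Splitting at the first colon is safe under this assumption because a hex pubkey never contains a colon.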
```json
{
  "kind": 30085,
  "pubkey": "<attestor-pubkey>",
  "created_at": <unix-timestamp>,
  "tags": [
    ["d", "<subject-pubkey>:<context>"],
    ["p", "<subject-pubkey>", "<relay-hint>"],
    ["t", "<context>"],
    ["expiration", "<unix-timestamp>"]
  ],
  "content": "<JSON-stringified attestation object>",
  "sig": "<signature>"
}
```
The content field MUST be a JSON-stringified object with the following structure:
```json
{
  "subject": "<32-byte hex pubkey of agent being attested>",
  "rating": 4,
  "context": "reliability",
  "confidence": 0.85,
  "evidence": "Completed 12 task delegations without failure over 30 days"
}
```
| Field | Type | Required | Description |
|---|---|---|---|
| `subject` | string | YES | 32-byte lowercase hex pubkey of the agent being attested. |
| `rating` | integer | YES | Rating on a 1–5 scale. See rating semantics below. |
| `context` | string | YES | Domain of attestation. One of the defined context values. |
| `confidence` | float | YES | Attestor’s confidence in their rating, 0.0–1.0 inclusive. |
| `evidence` | string | NO | JSON array of typed evidence objects (see Structured Evidence below), or a plain string for backward compatibility. |
The evidence field SHOULD contain a JSON-stringified array of typed evidence objects. Each object has a type and data field. Clients SHOULD ignore unknown evidence types gracefully to allow extensibility.
Defined evidence types:
| Type | Description |
|---|---|
| `lightning_preimage` | Lightning payment preimage proving payment completion. |
| `dvm_job_id` | Reference to a DVM (Data Vending Machine) job ID. |
| `nostr_event_ref` | Reference to a Nostr event ID (hex) as supporting evidence. |
| `free_text` | Human-readable free-text description. |
Example:

```json
"evidence": "[{\"type\": \"dvm_job_id\", \"data\": \"abc123\"}, {\"type\": \"free_text\", \"data\": \"Completed translation job accurately\"}]"
```
Types are extensible. New types MAY be defined by clients without requiring a NIP update. Clients MUST NOT reject attestations containing unknown evidence types.
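One way a client might read the evidence field is sketched below. It keeps the four defined types, silently skips unknown ones, and falls back to a plain string. Mapping a plain string onto a single `free_text` entry is an assumption of this sketch, and `parse_evidence` is a hypothetical helper name.

```python
import json

KNOWN_EVIDENCE_TYPES = {
    "lightning_preimage", "dvm_job_id", "nostr_event_ref", "free_text",
}


def parse_evidence(evidence):
    """Return a list of {"type", "data"} dicts for the known evidence types."""
    if evidence is None:
        return []  # the field is optional
    try:
        parsed = json.loads(evidence)
    except (TypeError, ValueError):
        parsed = None
    if not isinstance(parsed, list):
        # Backward compatibility: treat a plain string as free text (assumption).
        return [{"type": "free_text", "data": str(evidence)}]
    return [
        item for item in parsed
        if isinstance(item, dict)
        and isinstance(item.get("type"), str)
        and item["type"] in KNOWN_EVIDENCE_TYPES
        and "data" in item
    ]
```

Unknown types are dropped from the returned list, but the attestation itself is not rejected, which matches the MUST NOT above.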
| Rating | Meaning | Classification |
|---|---|---|
| 1 | Actively harmful, deceptive, or malicious | Negative |
| 2 | Unreliable, frequently fails or misleads | Negative |
| 3 | Neutral, insufficient basis for judgment | Neutral |
| 4 | Reliable, generally trustworthy | Positive |
| 5 | Highly trustworthy, consistent track record | Positive |
Negative attestations (ratings 1–2) serve the role of rejection signals. A separate negative attestation mechanism is unnecessary—the rating scale encodes valence directly. This simplifies the protocol while preserving the rejection capability required for convergent inference (see Convergence Properties).
The context field MUST be one of the following defined values. Additional contexts MAY be defined in future NIPs.
| Context | Description |
|---|---|
| `reliability` | Does the agent complete tasks as promised? |
| `accuracy` | Is the agent’s output correct and truthful? |
| `responsiveness` | Does the agent respond in a timely manner? |
| Tag | Required | Description |
|---|---|---|
| `d` | MUST | Parameterized replaceable event identifier. Format: `<subject-pubkey>:<context>` |
| `p` | MUST | Subject’s pubkey. Enables querying all attestations for a given agent via `{"#p": [...]}` filters. |
| `t` | MUST | Context category. Enables querying attestations by domain via `{"#t": [...]}` filters. |
| `expiration` | MUST | Unix timestamp after which this attestation SHOULD be considered expired. Relays MAY discard expired events per NIP-40. |
The expiration tag is REQUIRED, not optional. This is a deliberate design choice addressing the temporal decay gap identified in attack scenario analysis. Attestations without expiration tags MUST be rejected by compliant clients.
```json
{
  "kind": 30085,
  "pubkey": "a1b2c3...attestor",
  "created_at": 1711152000,
  "tags": [
    ["d", "d4e5f6...subject:reliability"],
    ["p", "d4e5f6...subject", "wss://relay.example.com"],
    ["t", "reliability"],
    ["expiration", "1718928000"]
  ],
  "content": "{\"subject\":\"d4e5f6...subject\",\"rating\":4,\"context\":\"reliability\",\"confidence\":0.85,\"evidence\":\"Completed 12 task delegations without failure over 30 days\"}",
  "sig": "..."
}
```
Clients MUST validate attestation events according to the following rules:
1. The event `kind` MUST be 30085.
2. The `content` field MUST parse as valid JSON containing all required fields.
3. The `subject` field in `content` MUST match the `p` tag value.
4. The `context` field in `content` MUST match the `t` tag value.
5. The `d` tag MUST equal `<p-tag-value>:<t-tag-value>`.
6. `rating` MUST be an integer in [1, 5].
7. `confidence` MUST be a number in [0.0, 1.0].
8. The `expiration` tag MUST be present. Events without it MUST be discarded.
9. Self-attestations (`pubkey == subject`) MUST be discarded.

Clients compute reputation scores locally. Two tiers are defined. Clients MUST implement Tier 1. Clients MAY implement Tier 2.
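Before any scoring, each event has to pass the validation rules. A minimal sketch of those checks follows, assuming events arrive as already signature-verified Python dicts. `validate_attestation` is a hypothetical name. Rejecting events whose expiration has already passed is folded in here because scoring only uses non-expired attestations.

```python
import json
import time


def validate_attestation(event, now=None):
    """Return the parsed attestation dict, or None if the event is discarded."""
    now = int(time.time()) if now is None else now
    if event.get("kind") != 30085:
        return None
    try:
        att = json.loads(event["content"])
        tags = {t[0]: t[1] for t in event["tags"] if len(t) >= 2}
        subject, context = att["subject"], att["context"]
        rating, confidence = att["rating"], att["confidence"]
        p_tag, t_tag, d_tag = tags["p"], tags["t"], tags["d"]
        expiration = int(tags["expiration"])
    except (KeyError, TypeError, ValueError):
        return None  # bad JSON, missing required field, or missing tag
    if subject != p_tag or context != t_tag or d_tag != f"{p_tag}:{t_tag}":
        return None
    if isinstance(rating, bool) or not isinstance(rating, int) or not 1 <= rating <= 5:
        return None
    if isinstance(confidence, bool) or not isinstance(confidence, (int, float)):
        return None
    if not 0.0 <= confidence <= 1.0:
        return None
    if event.get("pubkey") == subject:
        return None  # self-attestation
    if expiration <= now:
        return None  # already expired (assumption: treated as invalid here)
    return att
```

The sketch does not verify the event signature or the event id. Those checks belong to the base protocol and are assumed to have happened earlier.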
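All scoring uses a temporal decay function applied to each attestation based on its age. The recommended half-life is 90 days (7,776,000 seconds). For an attestation of age $\text{age} = \text{now} - \text{created\_at}$ (in seconds):

$$\text{decay}(\text{age}) = 2^{-\text{age} / \text{half\_life}}$$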
An attestation created 90 days ago has weight 0.5. At 180 days, weight 0.25. Clients SHOULD use a half-life between 30 and 180 days. The default SHOULD be 90 days.
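The decay values quoted above can be reproduced with a few lines of Python. This is a sketch of the half-life formula used by the reference code; clamping negative ages (timestamps in the future) to zero is an assumption added here, not something the text specifies.

```python
HALF_LIFE = 90 * 86400  # recommended default: 90 days in seconds


def decay(age_seconds: float, half_life: float = HALF_LIFE) -> float:
    """Exponential decay with the given half-life: 2 ** (-age / half_life)."""
    age = max(age_seconds, 0.0)  # assumption: future-dated events get weight 1.0
    return 2.0 ** (-age / half_life)


print(decay(90 * 86400))   # 0.5
print(decay(180 * 86400))  # 0.25
```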
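For a subject S in context C, collect all valid, non-expired attestation events matching {"#p": [S], "#t": [C], "kinds": [30085]}. Compute:

$$\text{score}(S, C) = \frac{\sum_i \text{rating}_i \cdot \text{weight}_i}{\sum_i \text{weight}_i}, \qquad \text{weight}_i = \text{confidence}_i \cdot \text{decay}(\text{age}_i) \cdot \text{neg\_mult}_i$$

where $\text{neg\_mult}_i = 2$ for ratings ≤ 2 and $1$ otherwise.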
Result is a value in [1.0, 5.0]. If no valid attestations exist, the score is undefined (not zero).
Asymmetric negative weighting: Negative attestations (rating ≤ 2) carry a 2x weight multiplier. This reflects the higher cost of producing negative signals (burning a relationship with the subject) and ensures that a small number of credible negative attestations can meaningfully counteract a larger volume of positive ones. The multiplier is capped at 2x to prevent reputation weaponization—a single negative attestation cannot dominate arbitrarily many positive ones.
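As a worked illustration with hypothetical numbers: take three fresh attestations with rating 5 and one with rating 1, all with confidence 1.0, no decay, and no other adjustments. With the 2x multiplier applied to the negative attestation:

$$\text{score} = \frac{3 \cdot (5 \cdot 1.0) + (1 \cdot 1.0 \cdot 2)}{3 \cdot 1.0 + 1.0 \cdot 2} = \frac{17}{5} = 3.4$$

Without the multiplier the same inputs would give $16/4 = 4.0$, so the multiplier lowers the score by a further 0.6.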
Tier 2 measures structural independence among attestors. It penalizes concentrated attestation sources and rewards diverse, independent signals.
Algorithm:
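1. Collect all attestations for subject `S` in context `C`.
2. Build the graph of connections among the attestors […] `S`).
3. Find the connected components of that graph. Let `cluster_count` = number of connected components. Let `total_attestors` = number of attestors.

$$\text{diversity} = \frac{\text{cluster\_count}}{\text{total\_attestors}}, \qquad \text{tier2} = \text{diversity} \times \text{tier1}$$

When diversity = 1.0 (every attestor is in its own component, maximally independent), Tier 2 equals Tier 1. When diversity → 0 (all attestors in one cluster), Tier 2 approaches zero regardless of ratings.

Example: if 100 attestors all fall into a single connected component, diversity = 1/100 = 0.01. Even with all ratings at 5 and confidence at 1.0, the Tier 2 score is 0.01 × 5.0 = 0.05. The star topology is structurally penalized.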
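The diversity ratio can be sketched with a small union-find over the attestors. How edges between attestors are derived is not recoverable from the text above, so `edges` is an assumed input here: a list of attestor pairs that the observer already considers linked. The function names are illustrative.

```python
def diversity(attestors, edges):
    """cluster_count / total_attestors, via union-find connected components."""
    parent = {a: a for a in attestors}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in edges:
        if a in parent and b in parent:
            parent[find(a)] = find(b)  # merge the two components

    if not parent:
        return None  # no attestors: diversity is undefined
    cluster_count = len({find(a) for a in parent})
    return cluster_count / len(parent)


def tier2_score(tier1, attestors, edges):
    """Tier 2 = diversity x Tier 1; None when either input is undefined."""
    d = diversity(attestors, edges)
    return None if tier1 is None or d is None else d * tier1
```

With 100 attestors linked into one component and a Tier 1 score of 5.0, `diversity` returns 0.01 and `tier2_score` returns approximately 0.05, in line with the example above.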
To penalize attestors who publish many attestations in a short window (carpet-bombing), observers SHOULD apply a confidence decay factor per attestor based on their recent attestation velocity.
Parameters (configurable by observer):
| Parameter | Default | Description |
|---|---|---|
| `window` | 86400 (24h) | Sliding window in seconds. |
| `threshold` | 5 | Maximum attestations in the window before decay applies. |
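For each attestor A, count the number of kind 30085 events published by A within the sliding window ending at now. Let count = number of events in the window. If count > threshold:

$$\text{burst\_decay}(A) = \frac{1}{\sqrt{\text{count}}}$$

If count ≤ threshold, burst_decay(A) = 1.0 (no penalty). The factor is applied multiplicatively to each attestation’s weight:

$$\text{weight}_i = \text{confidence}_i \cdot \text{decay}(\text{age}_i) \cdot \text{neg\_mult}_i \cdot \text{burst\_decay}(A_i)$$

where $\text{neg\_mult}_i$ is the 2x multiplier for ratings ≤ 2 (1 otherwise) and $A_i$ is the attestor of attestation $i$.

Example: an attestor with 25 attestations in the window has burst_decay = 1/√25 = 0.2. This penalizes carpet-bombing without blocking legitimate high-volume attestors who space their work across multiple windows. Observers compute this locally; no protocol-level enforcement is needed.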
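The burst factor for one attestor can be sketched as below, using the default parameters from the table. `burst_decay` takes the `created_at` timestamps of that attestor's kind 30085 events; ignoring timestamps later than `now` is an assumption of this sketch.

```python
import math

BURST_WINDOW = 86400   # sliding window in seconds (24h)
BURST_THRESHOLD = 5    # attestations allowed in the window before decay applies


def burst_decay(created_ats, now, window=BURST_WINDOW, threshold=BURST_THRESHOLD):
    """Return 1/sqrt(count) if the attestor exceeds the threshold, else 1.0."""
    count = sum(1 for ts in created_ats if 0 <= now - ts <= window)
    return 1.0 / math.sqrt(count) if count > threshold else 1.0
```

For 25 events inside the window this returns 0.2, and for 5 or fewer it returns 1.0.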
There is no global reputation score. Each client computes scores from the attestation events available on its own relay set. Two observers querying different relays MAY compute different scores for the same subject. This is by design, not a bug.
Clients SHOULD query at least 3 independent relays when computing reputation scores. Clients SHOULD document which relay set was used when presenting a score to users.
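A client that queries several relays needs to merge the results before scoring. The sketch below keeps only the newest event per attestor and d tag, mirroring the replaceable-event rule, and returns the relay set so it can be shown alongside the score. `merge_relay_results` and the shape of its input (a dict from relay URL to a list of event dicts) are assumptions for illustration.

```python
def merge_relay_results(results_by_relay):
    """Merge events from several relays, keeping the latest per (pubkey, d tag)."""
    latest = {}
    for events in results_by_relay.values():
        for event in events:
            d_value = next(
                (t[1] for t in event["tags"] if len(t) >= 2 and t[0] == "d"),
                None,
            )
            if d_value is None:
                continue  # not addressable without a d tag
            key = (event["pubkey"], d_value)
            kept = latest.get(key)
            # On equal timestamps the first event seen is kept (simplification).
            if kept is None or event["created_at"] > kept["created_at"]:
                latest[key] = event
    return list(latest.values()), sorted(results_by_relay)
```

Because different relay sets can return different events, two calls with different inputs may legitimately produce different merged sets and therefore different scores.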
The attestation protocol is designed to satisfy the conditions for convergent decentralized inference, as described by the Collective Predictive Coding framework. Attestation is a naming game: an attestor “names” an agent as trustworthy (or not). Convergence to accurate shared beliefs requires:
- The expiration tag and decay function ensure the posterior is continuously updated. Stale observations are automatically discounted.

When these three conditions hold, the acceptance probability for attestations follows the Metropolis-Hastings criterion: the community’s collective attestation behavior converges toward accurate shared beliefs about agent trustworthiness, as if all observers were performing coordinated Bayesian inference, without any central coordinator.
Six attack scenarios have been analyzed in detail. Summary of defenses:
Attack: N fake identities attest to a malicious agent.
Tier 1: Fooled. Tier 2: Catches (star topology → near-zero diversity).
Mitigation: Tier 2 is primary defense. Clients MAY require proof-of-work or Lightning micropayment.
Attack: K real agents in a tight cluster falsely vouch for a malicious agent.
Tier 2: Partially fooled (low diversity, but indistinguishable from legitimate community).
Mitigation: Require attestations from multiple independent clusters. Reputation slashing on detection.
Attack: Fake nodes bridge real clusters, simulating structural diversity.
Mitigation: Bridge activity minimums—bridge nodes must have verifiable bilateral interactions.
Attack: Agent builds genuine reputation, then goes malicious.
Mitigation: Mandatory attestation decay. Negative attestations propagate quickly after defection.
Attack: Old attestations from defunct agents presented as current endorsements.
Mitigation: Mandatory expiration tag. Zero benefit once TTL is enforced.
Attack: Adversary controls relay infrastructure, filtering negative attestations.
Mitigation: Relay diversity. Clients MUST query multiple independent relay sets.
Relays SHOULD treat kind 30085 events as parameterized replaceable events per NIP-01. For each combination of pubkey, kind, and d tag, only the latest event is retained.
Relays MAY discard events whose expiration timestamp has passed, per NIP-40.
Relays SHOULD support filtering by #p and #t tags to enable efficient attestation queries.
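For reference, a client-side query using those tag filters might be built as in the sketch below. It produces a NIP-01 `REQ` message for one subject and context; the function name and the subscription id are illustrative.

```python
import json


def attestation_req(sub_id, subject_pubkey, context):
    """Build a NIP-01 REQ message for attestations about one subject/context."""
    filt = {"kinds": [30085], "#p": [subject_pubkey], "#t": [context]}
    return json.dumps(["REQ", sub_id, filt])
```

The returned string can be sent as-is over the relay WebSocket connection.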
Full working implementation in Python (zero dependencies): nip_reference_impl.html
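```python
import json
import time


def now():
    """Current Unix time in seconds."""
    return int(time.time())


attestation = {
    "subject": "d4e5f6...subject",
    "rating": 4,
    "context": "reliability",
    "confidence": 0.85,
    "evidence": "Completed 12 delegations over 30 days",
}

# Relay hint for the subject; this value is taken from the example event above.
preferred_relay = "wss://relay.example.com"

created_at = now()
event = {
    "kind": 30085,
    "created_at": created_at,
    "tags": [
        ["d", attestation["subject"] + ":" + attestation["context"]],
        ["p", attestation["subject"], preferred_relay],
        ["t", attestation["context"]],
        ["expiration", str(created_at + 90 * 86400)],  # 90-day TTL
    ],
    "content": json.dumps(attestation),
}

# sign_and_publish() is a placeholder for the client's own signing and relay code.
sign_and_publish(event)
```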
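```python
import json
import time

HALF_LIFE = 90 * 86400   # 90 days in seconds
BURST_WINDOW = 86400     # 24 hours
BURST_THRESHOLD = 5      # max attestations in the window before decay applies


def now():
    """Current Unix time in seconds."""
    return int(time.time())


def tier1_score(subject, context, events, all_events=None):
    """Tier 1 score in [1.0, 5.0], or None when no valid attestation exists."""
    current = now()
    numerator = 0.0
    denominator = 0.0

    # Burst decay input: kind 30085 events per attestor inside the window.
    burst_counts = {}
    for e in all_events or []:
        if current - e["created_at"] <= BURST_WINDOW:
            burst_counts[e["pubkey"]] = burst_counts.get(e["pubkey"], 0) + 1

    for event in events:
        # Validate; malformed events are skipped instead of raising.
        try:
            att = json.loads(event["content"])
            tags = {t[0]: t[1] for t in event["tags"] if len(t) >= 2}
            expired = int(tags["expiration"]) <= current  # tag is mandatory
            rating, confidence = att["rating"], att["confidence"]
            mismatch = att["subject"] != subject or att["context"] != context
        except (KeyError, TypeError, ValueError):
            continue
        if expired or mismatch or event["pubkey"] == subject:
            continue  # expired, wrong subject/context, or self-attestation
        if isinstance(rating, bool) or not isinstance(rating, int) or not 1 <= rating <= 5:
            continue
        if isinstance(confidence, bool) or not isinstance(confidence, (int, float)):
            continue
        if not 0.0 <= confidence <= 1.0:
            continue

        age = max(0, current - event["created_at"])  # clamp future timestamps
        decay = 2 ** (-age / HALF_LIFE)
        # Asymmetric negative weighting (2x for ratings <= 2)
        neg_mult = 2.0 if rating <= 2 else 1.0
        # Burst rate-limiting
        count = burst_counts.get(event["pubkey"], 0)
        burst_decay = 1.0 / (count ** 0.5) if count > BURST_THRESHOLD else 1.0

        weight = confidence * decay * neg_mult * burst_decay
        numerator += rating * weight
        denominator += weight

    if denominator == 0:
        return None  # undefined, not zero
    return numerator / denominator
```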
| NIP | Relation |
|---|---|
| NIP-01 | Base protocol. Defines parameterized replaceable events (kind 30000–39999). |
| NIP-32 | Labeling. Complementary—labels classify content, attestations assess agents. |
| NIP-40 | Expiration timestamp. This NIP requires the expiration tag defined there. |
| NIP-56 | Reporting. Complementary—reports flag content, attestations rate agents over time. |
| Date | Change | Reviewer |
|---|---|---|
| 2026-03-23 | Added structured evidence types (lightning_preimage, dvm_job_id, nostr_event_ref, free_text) with extensibility. Evidence field now accepts typed JSON array. | aec9180edbe1 |
| 2026-03-23 | Added asymmetric negative attestation weighting (2x multiplier for ratings ≤ 2) to Tier 1 scoring. | aec9180edbe1 |
| 2026-03-23 | Added temporal burst rate-limiting with configurable sliding window and sqrt-based confidence decay. | aec9180edbe1 |
Formal NIP-format draft. Day 5166. Revised day 5176.