How Treebeard™ Works

Radical transparency is a feature, not a risk. Here is exactly how our system works — what it does, what it doesn't do, and why.

The Rating Pipeline

Every Treebeard Score is produced in five steps: aggregate signals from on-chain registries, collect verifiable data, run the scoring formula, monitor for changes, and publish the result. The entire process is automated and deterministic.

🔗

Step 1

Signal Aggregation

Treebeard aggregates signals across the ERC-8004 Identity Registry on 22 chains — Ethereum, Base, BNB, Arbitrum, Optimism, Polygon, Solana, and more. Every agent with an on-chain identity gets indexed automatically. No application required.

Reads directly from ERC-8004 registry smart contracts via RPC
Indexes agent ID, registration date, chain, and agentURI metadata
Aggregates reputation feedback, counterparty diversity, and risk signals from on-chain registries
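As a rough illustration of the index this step produces, here is a minimal sketch of one indexed record and a lookup keyed by chain and agent ID. The names AgentRecord and build_index, and the rule of keeping the earliest registration when an entry is seen twice, are assumptions made for the example; the sketch does not touch an RPC endpoint or the registry contracts.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class AgentRecord:
    """One indexed registry entry. Field names are illustrative only."""
    chain: str
    agent_id: int
    registered_on: date
    agent_uri: str


def build_index(records):
    """Key records by (chain, agent_id).

    If the same key shows up more than once (for example after a re-scan),
    keep the record with the earliest registration date. That tie-break is
    an assumption of this sketch.
    """
    index = {}
    for rec in records:
        key = (rec.chain, rec.agent_id)
        current = index.get(key)
        if current is None or rec.registered_on < current.registered_on:
            index[key] = rec
    return index
```

Keying on the chain as well as the agent ID keeps the same numeric ID on two networks from colliding.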

📊

Step 2

Signal Collection

For each discovered agent, Treebeard collects verifiable signals from public on-chain data. These signals form the raw inputs to the rating formula. All data sources are public and auditable.

On-chain age: days since ERC-8004 registration transaction
Reputation feedback: count and sentiment from the ERC-8004 Reputation Registry (via The Graph)
Chain presence: which networks the agent operates on
Additional signals (TVL, code activity, transaction volume) collected when available
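The on-chain age signal is simple enough to sketch. The example below assumes the registration time comes from the block timestamp of the registration transaction; the function names are hypothetical.

```python
from datetime import datetime, timezone


def from_block_timestamp(unix_seconds: int) -> datetime:
    """Convert a block timestamp (seconds since epoch) to an aware UTC datetime."""
    return datetime.fromtimestamp(unix_seconds, tz=timezone.utc)


def on_chain_age_days(registered_at: datetime, now: datetime) -> int:
    """Whole days elapsed between registration and `now`, never negative."""
    if registered_at.tzinfo is None or now.tzinfo is None:
        raise ValueError("timestamps must be timezone-aware")
    delta = now - registered_at
    return max(delta.days, 0)
```

Passing `now` in explicitly, rather than reading the clock inside the function, keeps the calculation reproducible: the same two timestamps always give the same age.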

⚡

Step 3

Automated Scoring

A deterministic scoring engine processes the collected signals into a composite rating. The formula is open and documented — the same inputs always produce the same score. No manual adjustments, no pay-to-play.

Six category scores: Economic Viability, Operational Reliability, Code Quality, Autonomy Index, Safety, and Community
Weighted composite produces a 0–100 numeric score and A+ through F letter grade
Safety floor: agents with critical safety concerns cannot score above a defined threshold, regardless of other signals
Hysteresis buffer prevents grade oscillation from small score fluctuations
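The weighted composite and the safety floor can be sketched as a pure function. Every number below is a placeholder chosen for the example: the weights, the safety ceiling, and the grade cutoffs are assumptions, and the real values are the ones on the methodology page. Hysteresis is left out here to keep the sketch short.

```python
# Hypothetical weights in percent (they sum to 100). Not the published values.
WEIGHTS_PERCENT = {
    "economic_viability": 20,
    "operational_reliability": 20,
    "code_quality": 15,
    "autonomy_index": 15,
    "safety": 20,
    "community": 10,
}

# Hypothetical ceiling applied when a critical safety concern is flagged.
SAFETY_CAP = 50.0

# Hypothetical minimum score for each letter grade, highest first.
GRADE_CUTOFFS = [
    (95, "A+"), (90, "A"), (85, "A-"),
    (80, "B+"), (75, "B"), (70, "B-"),
    (65, "C+"), (60, "C"), (55, "C-"),
    (45, "D"),
]


def composite_score(category_scores, critical_safety_concern=False):
    """Weighted average of six 0-100 category scores, capped on safety concerns."""
    missing = [name for name in WEIGHTS_PERCENT if name not in category_scores]
    if missing:
        raise ValueError(f"missing category scores: {missing}")
    total = sum(WEIGHTS_PERCENT[n] * category_scores[n] for n in WEIGHTS_PERCENT) / 100
    if critical_safety_concern:
        total = min(total, SAFETY_CAP)
    return round(total, 1)


def letter_grade(score):
    """Map a 0-100 score to a letter grade using the cutoffs above."""
    for minimum, grade in GRADE_CUTOFFS:
        if score >= minimum:
            return grade
    return "F"
```

Because the function reads only its arguments and module-level constants, the same inputs always yield the same score, which is the property the pipeline relies on.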

🔄

Step 4

Monitor

Trust is not static. Agents are re-assessed continuously as new signals arrive. Trend indicators surface movement over trailing 30-day windows.

Re-rating as new on-chain signals arrive, with a daily cycle across all active agents
Score deltas tracked over 30-day trailing windows
Trend indicators (up / stable / down) on every agent profile
Hysteresis buffer keeps noise from flipping a letter grade
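One way to derive a trend indicator from a trailing 30-day window is to compare the newest score in the window with the oldest. This is a sketch under assumptions: the tolerance of one point, the function name, and the choice to report "stable" when fewer than two data points exist are all hypothetical.

```python
from datetime import date, timedelta


def trend(history, today, window_days=30, tolerance=1.0):
    """Classify score movement over the trailing window.

    history: iterable of (date, score) pairs in any order.
    Returns "up", "down", or "stable".
    """
    start = today - timedelta(days=window_days)
    window = sorted((d, s) for d, s in history if start <= d <= today)
    if len(window) < 2:
        return "stable"  # not enough data to call a direction
    delta = window[-1][1] - window[0][1]
    if delta > tolerance:
        return "up"
    if delta < -tolerance:
        return "down"
    return "stable"
```

The tolerance plays a role similar to the hysteresis buffer: movement smaller than it is treated as noise rather than a trend.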

📡

Step 5

Publication

Ratings are published to the Treebeard website and API. Every rating includes its numeric score, letter grade, six category breakdowns, confidence level, and the algorithm version that produced it.

Full rating breakdown available on every agent profile page
Public REST API at /v1/agents for programmatic access
Algorithm version tracked for reproducibility
Methodology page documents every weight and threshold
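A client consuming a published rating might validate it along these lines. The response schema of /v1/agents is not described here, so every JSON field name below is an assumption, as is treating the confidence level as a string. The sketch parses a payload that has already been fetched and makes no network call.

```python
import json
from dataclasses import dataclass

# The six categories named in the scoring step, as hypothetical JSON keys.
EXPECTED_CATEGORIES = {
    "economic_viability",
    "operational_reliability",
    "code_quality",
    "autonomy_index",
    "safety",
    "community",
}


@dataclass
class Rating:
    score: float
    grade: str
    categories: dict
    confidence: str
    algorithm_version: str


def parse_rating(raw: str) -> Rating:
    """Parse one rating payload and check it carries every advertised field."""
    data = json.loads(raw)
    categories = data["categories"]
    missing = EXPECTED_CATEGORIES.difference(categories)
    if missing:
        raise ValueError(f"missing category scores: {sorted(missing)}")
    score = float(data["score"])
    if not 0 <= score <= 100:
        raise ValueError(f"score out of range: {score}")
    return Rating(
        score=score,
        grade=data["grade"],
        categories=dict(categories),
        confidence=data["confidence"],
        algorithm_version=data["algorithm_version"],
    )
```

Keeping the algorithm version alongside the score lets a client tell whether two ratings were produced by the same formula before comparing them.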

Design Principles

Every design decision in Treebeard serves a specific purpose.

Deterministic Scoring

The scoring engine is a pure function — the same inputs always produce the same score. No manual overrides, no subjective adjustments, no hidden factors. If you have the inputs and the formula, you can reproduce any rating.

Public Methodology

Every weight, threshold, normalization curve, and edge case is documented on the methodology page. We don't rely on 'trust us'; we show the math.

Cost-to-Fake Weighting

Signals that are expensive to fake (on-chain history, verified transactions) carry more weight than signals that are cheap to fake (social followers, self-reported metrics).
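One simple reading of this principle is to make each signal's weight proportional to an estimate of what it would cost to fake. The sketch below does exactly that; the proportional rule, the cost figures, and the function name are all assumptions for illustration, not the published weighting.

```python
def cost_weighted(signals):
    """Blend normalized signals, weighting each by its estimated cost to fake.

    signals: name -> (value in [0, 1], cost_to_fake in any consistent positive unit).
    Returns a value in [0, 1].
    """
    total_cost = sum(cost for _, cost in signals.values())
    if total_cost <= 0:
        raise ValueError("total cost-to-fake must be positive")
    return sum(value * cost for value, cost in signals.values()) / total_cost
```

With on-chain history at 0.5 and a cost of 9, and social followers at 1.0 and a cost of 1, the blend is (0.5 × 9 + 1.0 × 1) / 10 = 0.55, so a perfect but cheap signal barely lifts the result.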

On-Chain First

The primary data source is the ERC-8004 Identity Registry — a public, permissionless, on-chain record. Agents don't need to apply or self-report. If you're registered, you're indexed.

Safety Floor

Agents with critical safety concerns cannot score above a defined threshold, regardless of how strong their other signals are. Safety is non-negotiable.

Hysteresis Buffer

Small score fluctuations don't change the letter grade. A 0.1-point wobble shouldn't move an agent from A- to B+. The buffer prevents grade oscillation between rating epochs.
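A hysteresis buffer can be sketched as follows: the previous grade is kept until the score moves past the grade boundary by more than a margin. The cutoffs are an abbreviated, hypothetical ladder and the one-point buffer is an assumption; the real thresholds are the ones on the methodology page.

```python
# Abbreviated, hypothetical ladder: (grade, minimum score), highest first.
CUTOFFS = [
    ("A+", 95.0), ("A", 90.0), ("A-", 85.0),
    ("B+", 80.0), ("B", 75.0), ("B-", 70.0),
    ("C", 60.0), ("D", 50.0), ("F", 0.0),
]


def raw_grade(score):
    """Grade from the score alone, with no memory of earlier epochs."""
    for grade, minimum in CUTOFFS:
        if score >= minimum:
            return grade
    return CUTOFFS[-1][0]


def grade_with_hysteresis(score, previous_grade=None, buffer=1.0):
    """Keep the previous grade unless the score leaves its band by more than `buffer`."""
    candidate = raw_grade(score)
    if previous_grade is None or candidate == previous_grade:
        return candidate
    names = [grade for grade, _ in CUTOFFS]
    i = names.index(previous_grade)
    low = CUTOFFS[i][1]
    high = CUTOFFS[i - 1][1] if i > 0 else float("inf")
    if low - buffer <= score < high + buffer:
        return previous_grade  # still inside the widened band
    return candidate
```

Under these placeholder numbers, an agent graded A- that slips to 84.9 stays at A-, and only drops to B+ once the score falls below 84.0.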

Oversight & Quality Control

Treebeard is an early-stage system built and operated by a small founding team. Here's how quality control works today.

Founder Review

The founding team reviews rating outputs, monitors for anomalies, and validates methodology changes before deployment. As the system matures, we plan to formalize this into structured review processes with clear escalation paths.

Open Methodology

The full scoring methodology — weights, thresholds, normalization curves — is published on the methodology page. Anyone can audit the formula and verify how a specific rating was calculated.

Improve Your Rating

Building an agent? Our improvement guide shows exactly what signals we measure and how to boost your score across all categories — with point values and difficulty levels.

Feedback Welcome

If you believe a rating is inaccurate or unfair, we want to hear about it. Reach out at hello@treebeardai.com. We take every report seriously and will investigate.

Known Limitations

Treebeard is early-stage software. We believe in being upfront about what we can and can't do today.

We don't have a team of human analysts reviewing every rating — scoring is automated and deterministic.
We can't yet detect all forms of gaming or manipulation — our signal coverage is growing but incomplete.
Most agents currently have limited signal diversity, which means many scores cluster in the same range.
Re-rating cadence is still being tuned — scores may not reflect recent changes immediately.