How to Improve Your AI Agent Rating
What actually moves a Treebeard score, in what order, and what does not. Most D-grade agents are silent, not broken.
An agent ships to mainnet. The team registers it on ERC-8004, publishes the contract address, and waits. A week later a counterparty looks them up on Treebeard and sees a D. The team assumes the rating is wrong.
It is not wrong. It is reading the absence of signal as the absence of signal. That is the bug to fix.
The three changes that move most ratings fastest
Across the population we rate, three gaps recur. Closing them typically moves a D to a C-minus inside one re-rate cycle.
- Add a verified X / GitHub / domain handle to your on-chain metadata. An identity-bound agent is rated differently from an anonymous one. The verification step is the cheapest score lift available.
- Publish a structured function description with concrete capabilities. Not a slogan. The capabilities the agent actually exposes, in plain terms. Code Quality and Autonomy both read this.
- Get one third-party reference. A code audit, an integration partner, an attestation, an ERC-8004 endorsement. One external signal lifts you out of the cohort that has none.
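As a rough illustration of the second item, a structured function description could be modeled and serialized along the lines below. The field names (`action`, `inputs`, `limits`, `human_in_the_loop`) and the example agent are hypothetical, chosen only to show the difference between concrete capabilities and a slogan. They are not a schema defined by Treebeard or ERC-8004.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class Capability:
    # Hypothetical fields: what the agent does, on which inputs, within which limits.
    action: str
    inputs: list[str]
    limits: str


@dataclass
class FunctionDescription:
    # Hypothetical top-level shape for an agent's published description.
    name: str
    summary: str
    capabilities: list[Capability] = field(default_factory=list)
    human_in_the_loop: bool = False


def to_metadata_json(desc: FunctionDescription) -> str:
    """Serialize the description to JSON so it can be published as metadata."""
    return json.dumps(asdict(desc), indent=2, sort_keys=True)


# Illustrative agent; every value here is made up.
example = FunctionDescription(
    name="example-rebalancer",
    summary="Rebalances a two-asset vault when drift exceeds a set threshold.",
    capabilities=[
        Capability(
            action="rebalance",
            inputs=["vault address", "target weights"],
            limits="at most one rebalance per hour",
        )
    ],
    human_in_the_loop=False,
)

if __name__ == "__main__":
    print(to_metadata_json(example))
```

The point of the structure is that each capability names an action, its inputs, and its limits, which is the kind of plain-terms detail a reader can check against what the agent actually does.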
What moves each of the six signal categories
Economic Viability
TVL routed through the agent. Activity volume across recent windows. Age and continuity of operation. Feedback count from ERC-8004 reputation events when present. Empty contracts read as empty contracts.
Operational Reliability
Uptime of any declared endpoint. Failure rate on jobs. Time-to-recovery after incidents. The signal here is boring on purpose. An agent that shows up every day for ninety days outranks an agent that shipped a viral demo and went dark.
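The three measurements named above can be tracked internally with very little code. The sketch below is one way a team might compute them from its own incident log. The `Incident` record and the hour-based units are assumptions made for illustration, and the formulas are the textbook definitions, not Treebeard's scoring logic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Incident:
    # Hypothetical record: hours since the start of the observation window.
    start_hour: float
    recovered_hour: float


def uptime_fraction(window_hours: float, incidents: list[Incident]) -> float:
    """Share of the window during which the endpoint was not in an incident."""
    downtime = sum(i.recovered_hour - i.start_hour for i in incidents)
    return 1.0 - downtime / window_hours


def failure_rate(jobs_failed: int, jobs_total: int) -> float:
    """Failed jobs divided by all jobs; 0.0 when no jobs ran."""
    return jobs_failed / jobs_total if jobs_total else 0.0


def mean_time_to_recovery(incidents: list[Incident]) -> float:
    """Average incident duration in hours; 0.0 when there were no incidents."""
    if not incidents:
        return 0.0
    total = sum(i.recovered_hour - i.start_hour for i in incidents)
    return total / len(incidents)
```

Tracking these over a rolling ninety-day window (2,160 hours) gives a team its own view of the continuity the category describes.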
Code Quality
Public repository with non-trivial commit history. Tests. Audit artifacts. Dependency hygiene. A README that explains what the agent does without marketing copy. Closed-source agents can still score here, but they need a credible audit trail to substitute for visible code.
Autonomy Index
Decisions made without human-in-the-loop. Range of conditions the agent handles unattended. Coverage of edge cases declared in the function description. This is the category most teams overstate. If a human approves every transaction, the agent is automation, not autonomy.
Safety
Permission scope, kill-switch presence, rate limits, behavior under adversarial input. The category is gated by a floor: drop below threshold and the composite is capped, no matter how high other signals climb. A perfect Autonomy score does not unlock a grade past the safety cap.
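The floor-and-cap behavior can be sketched as follows. This is a minimal illustration, assuming an equal-weight mean over category scores on a 0-100 scale. The threshold, the cap value, and the weighting are all hypothetical and are not Treebeard's published numbers.

```python
# Illustrative assumptions only; the real threshold and cap are not stated here.
SAFETY_FLOOR = 40.0      # hypothetical minimum Safety score
CAPPED_COMPOSITE = 55.0  # hypothetical ceiling applied when Safety is below the floor


def composite_score(categories: dict[str, float]) -> float:
    """Equal-weight mean of category scores, capped when Safety is below the floor."""
    raw = sum(categories.values()) / len(categories)
    if categories["safety"] < SAFETY_FLOOR:
        # No other category can lift the composite past the cap.
        return min(raw, CAPPED_COMPOSITE)
    return raw
```

With these made-up numbers, an agent scoring 90 in four categories, 100 in Autonomy, and 30 in Safety has a raw mean of about 81.7 but a composite of 55.0. Raising Safety to the floor removes the cap; raising Autonomy further does nothing.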
Community
Independent attestations, third-party integrations, ecosystem citations, ERC-8004 feedback events from non-aligned counterparties. Self-attested reviews and bot-amplified threads do not move this. The source-conflict discount catches them.
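One way to picture a source-conflict discount is as a multiplier applied to each attestation based on who issued it. The sketch below assumes a hypothetical `Attestation` record, a caller-supplied set of sources aligned with the rated entity, and a discount factor of zero for those sources. None of these are confirmed details of how Treebeard implements the discount.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Attestation:
    # Hypothetical record: who issued the attestation and its base weight.
    source: str
    weight: float


def community_signal(
    attestations: list[Attestation],
    aligned_sources: set[str],
    aligned_factor: float = 0.0,  # assumption: aligned sources contribute nothing
) -> float:
    """Sum attestation weights, discounting those from sources aligned with the agent."""
    total = 0.0
    for a in attestations:
        factor = aligned_factor if a.source in aligned_sources else 1.0
        total += a.weight * factor
    return total
```

Under this sketch, ten posts from the team's own accounts add nothing, while a single attestation from an independent auditor counts in full.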
Time-decay is real. Plan around it.
Treebeard applies a continuous decay so signals from twelve months ago weigh less than signals from last week. An agent that ships consistently compounds. An agent that ships once and stops drifts downward even with no negative events on the books.
If you ship a major release, a new integration, or pass an audit, those events are worth the most in the first re-rate cycle and decay from there. Plan release cadence accordingly.
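To make the effect of decay concrete, the sketch below uses exponential decay with a fixed half-life. The exponential form and the 90-day half-life are assumptions chosen for illustration; the text above says only that decay is continuous, not which curve or rate Treebeard uses.

```python
HALF_LIFE_DAYS = 90.0  # hypothetical half-life, not a published Treebeard parameter


def decayed_weight(age_days: float, half_life_days: float = HALF_LIFE_DAYS) -> float:
    """Weight of a signal that is age_days old; 1.0 when fresh, 0.5 after one half-life."""
    return 0.5 ** (age_days / half_life_days)


def decayed_total(events: list[tuple[float, float]]) -> float:
    """Sum of (base_value, age_days) events, each scaled by its decayed weight."""
    return sum(value * decayed_weight(age) for value, age in events)
```

Under these assumptions a week-old event keeps roughly 95% of its weight and a year-old event keeps roughly 6%, which is why a steady cadence of smaller releases can outweigh one large release followed by silence.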
What does not move your score
- Paying us anything. Treebeard takes no payment from rated entities. There is no listing fee, no expedite fee, no sponsored placement. /independence documents the structural commitments.
- Holding our token. We do not have one and will not.
- Aggressive social posting. The Community signal weights independent attestations, not self-reach. The source-conflict discount specifically dampens signals that originate from the rated entity.
- Filing a dispute through marketing channels. Disputes go through /methodology/improve and review by the Ent Review Panel. Substantive issues get substantive responses. Vibes do not.
The order to do the work in
- Week one. Verify identity (handle, domain, repo). Publish a structured function description. Push your contract address into the ERC-8004 identity registry if you have not already.
- Weeks two to four. Get one external reference. An audit is the highest-weight version. An attestation from a counterparty you have already worked with is the next best.
- Months two to three. Build operational continuity. Ninety days of uptime, regular release cadence, observable failures handled in public.
- Ongoing. Defend against decay. Ship, document, get cited.
Most teams want to skip to step three. The math does not let you. Steps one and two unlock the ceiling that step three then fills in.
Look up where you are right now
Find your agent in the directory and read your category breakdown:
- /agents : search by name, slug, or contract address
- /methodology/improve : file a correction or appeal
- /methodology : full signal framework
- /learn/what-is-an-ai-agent-rating : what the rating itself means