Treebeard Learn

What Is an AI Agent Rating?

Treebeard Research·April 28, 2026·7 min read
Direct answer

An AI agent rating is a numeric or letter-grade summary of whether an autonomous agent can be trusted to act as a counterparty. A useful rating is composite (built from multiple signals), continuous (updates as the agent's behavior changes), independent (no token, no payment from rated agents), and reproducible (methodology published, scores derivable from public data). A rating that fails any of those four properties is a number you have to take on faith.

Why agent ratings exist

For most of software history, you didn't need to rate a piece of code. Code did what its author specified. Trust came from auditing the specification. Once the spec was right, the code was right.

That model breaks the moment a piece of software starts deciding under uncertainty on its own behalf. AI agents now do that. They execute trades, sign transactions, call other agents, manage budgets, route requests. The behavior is not in the spec because the model that produces the behavior is probabilistic. Auditing the code tells you about ten percent of what you need to know.

The other ninety percent comes from observing the agent in operation. How often does it respond when called? Does its actual behavior match its claimed function? Are its dependencies secure? Has it been integrated by other systems with a stake in its quality? Is it cryptographically verifiable as the entity it claims to be? These questions are not answerable from a code review. They are answerable from a continuous trust signal that aggregates evidence over time.

That signal is what an agent rating is. The rating exists because the discipline of evaluating autonomous software needs a name and a number. Counterparties looking at thousands of agents need a single index they can use to make decisions, the same way lenders use credit scores and investors use sovereign credit ratings. The rating is not a substitute for due diligence. It's the first cut.

What goes into a rating

Treebeard's rating is a composite of seven signal categories, each weighted by agent type, then passed through a safety floor and adjusted by time-decay and source-conflict discount factors. Briefly:

The seven categories

  1. Identity verification. Does the agent have a portable, cryptographically verifiable identity (typically ERC-8004) that resolves consistently?
  2. Operational reliability. Does the agent actually respond when called? Uptime, response latency, error rates.
  3. Code quality. Auditable, deterministic where claimed, reviewed by parties other than the developer.
  4. Autonomy index. What scope of action is the agent permitted, and are those boundaries enforced cryptographically?
  5. Safety guardrails. Refusal patterns, rollback capabilities, escalation paths for ambiguous cases.
  6. Community and ecosystem. Independent third-party validation, integrations, attestations, feedback events from non-self sources.
  7. Security posture. Key management, dependency hygiene, incident history.

Each signal produces a 0-100 score from public data sources. Each score is weighted by agent type (a trading agent weights operational reliability higher than a creative content agent does). The full weight profiles are at /methodology/methodology.
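To make the aggregation concrete, here is a minimal Python sketch. The category keys mirror the list above, but the agent types, weight values, and function names are illustrative assumptions, not the published weight profiles.

```python
# Minimal sketch of the type-weighted composite. The agent types and the
# weight values are illustrative assumptions; the published weight
# profiles on the methodology pages are authoritative.

CATEGORIES = [
    "identity", "reliability", "code_quality", "autonomy",
    "guardrails", "ecosystem", "security",
]

# Hypothetical weight profiles per agent type (each sums to 1.0).
# A trading agent weights operational reliability more heavily than a
# creative content agent does, matching the example in the text.
WEIGHT_PROFILES = {
    "trading": {
        "identity": 0.20, "reliability": 0.25, "code_quality": 0.10,
        "autonomy": 0.10, "guardrails": 0.15, "ecosystem": 0.05,
        "security": 0.15,
    },
    "content": {
        "identity": 0.15, "reliability": 0.10, "code_quality": 0.15,
        "autonomy": 0.10, "guardrails": 0.20, "ecosystem": 0.15,
        "security": 0.15,
    },
}

def weighted_composite(scores: dict[str, float], agent_type: str) -> float:
    """Collapse seven 0-100 category scores into one 0-100 composite."""
    weights = WEIGHT_PROFILES[agent_type]
    return sum(weights[c] * scores[c] for c in CATEGORIES)
```

Called as weighted_composite(scores, "trading"), the same seven scores produce a different composite than they would for a content agent, which is the point of type-weighted profiles.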

The safety floor and the two corrections

After the weighted composite is computed, two more steps run.

The safety floor caps the composite at D if any binary safety check fails (missing operational data, unverified identity, failed code audit). This prevents adversarial signal stacking, where an agent compensates for a critical weakness by inflating other categories. You don't average your way past a structural failure.
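As a sketch of the mechanism (the check names and the numeric ceiling are illustrative), the floor is a cap applied after the weighted sum rather than another weighted term:

```python
# Sketch of the safety floor. If any binary safety check fails, the
# composite is capped inside the D band. The 69 ceiling is illustrative.
D_BAND_CEILING = 69

def apply_safety_floor(composite: float, checks: dict[str, bool]) -> float:
    """checks maps check names (e.g. 'identity_verified') to pass/fail."""
    if not all(checks.values()):
        return min(composite, D_BAND_CEILING)
    return composite
```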

Time decay and source-conflict discount apply to every signal before aggregation. An audit signal earned in February is treated as weaker evidence than an audit signal earned this morning. A reputation signal from a source with structural conflicts (token holdings, marketplace cuts) is discounted relative to a signal from an independent source. The combined effect: the rating reflects current state, not yesterday's state, and weights signals by the credibility of their source.
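A minimal sketch of both corrections applied to a single raw signal before aggregation. The half-life and discount factor are illustrative assumptions; the methodology pages define the actual curves.

```python
from datetime import datetime, timezone

HALF_LIFE_DAYS = 90      # illustrative: evidence loses half its weight in ~90 days
CONFLICT_DISCOUNT = 0.5  # illustrative: conflicted sources count at half weight

def adjusted_signal(score: float, observed_at: datetime,
                    source_conflicted: bool) -> float:
    """Apply exponential time decay, then a flat source-conflict discount.

    observed_at must be timezone-aware (UTC).
    """
    age_days = (datetime.now(timezone.utc) - observed_at).total_seconds() / 86400
    decay = 0.5 ** (age_days / HALF_LIFE_DAYS)
    discount = CONFLICT_DISCOUNT if source_conflicted else 1.0
    return score * decay * discount
```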

These two corrections are what Treebeard's methodology adds that other rating providers do not. The math is in the methodology pages and the Q2 2026 State of Agent Quality report.

What separates a credible rating from an opaque one

Several agent rating providers exist as of April 2026. Not all of them produce ratings that survive scrutiny. The four properties below are necessary for a rating to be useful in a decision that involves real exposure.

1. Composite, not single-source

A rating built on one registry, one chain, or one type of signal is brittle. Saturating that one source becomes the path of least resistance for any adversary. Composite ratings spread the attack surface across signal types that don't correlate.

2. Continuous, not snapshot

Agents change. A rating that updates quarterly or on demand misses the changes. The post-audit silent rebuild attack works specifically because static ratings don't catch redeployments. Continuous re-rating closes that gap.

3. Independent, not conflicted

A rating provider with a token, a marketplace cut, or a chain affiliation faces the same structural conflict that broke the bond ratings industry in 2008. The fix is structural, not procedural. No token. No payment from rated entities. No marketplace tie. If the rater profits when ratings go up, the ratings are not credible.

4. Reproducible, not opaque

A rating you can't audit is a number you have to take on faith. The methodology must be published in full. The weights must be visible. A reader with access to the same public signals must be able to derive the same score. Faith is the wrong contract for counterparty risk.

How to read a Treebeard rating

A Treebeard agent profile shows four things you should always check together.

  • The letter grade and numeric score. The headline. A+ through F. 0 through 100. The score determines the grade band.
  • The category breakdown. The seven category scores, each 0-100. Read this before integrating. A C-grade composite that averages high autonomy and low security looks the same as a C-grade composite that averages medium scores across the board. The categories tell you the actual risk distribution.
  • The confidence tier. Low, medium, or high. Confidence reflects how much signal the rating is built on. A high-confidence C is a more reliable rating than a low-confidence A backed by thin data.
  • The trend indicator. Is this rating improving, holding, or declining? A C-tier agent on an upward trend may be more interesting to integrate than a B-tier agent on a downward trend.

The composite is the headline. The categories are the actual story. Anyone integrating an agent on the strength of its rating should read the category breakdown, not just the letter.
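For readers who want the score-to-grade relationship spelled out, here is a sketch with illustrative cut points. The actual band boundaries come from the methodology, not from this example.

```python
# Illustrative mapping from a 0-100 composite to a grade band.
# Cut points are assumptions for the sketch, not the published bands.
GRADE_BANDS = [
    (97, "A+"), (93, "A"), (90, "A-"),
    (87, "B+"), (83, "B"), (80, "B-"),
    (77, "C+"), (73, "C"), (70, "C-"),
    (60, "D"),
]

def grade(composite: float) -> str:
    """Return the highest band whose lower bound the composite clears."""
    for cutoff, letter in GRADE_BANDS:
        if composite >= cutoff:
            return letter
    return "F"
```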

The limits of any agent rating

An honest framing of what a rating cannot tell you:

It cannot replace your own due diligence on high-stakes integrations. A B-rated agent is not pre-approved for any specific use. The rating tells you what was true in aggregate. The specifics of what your integration depends on may not be covered.

It cannot move faster than the agent itself. An agent that retrains daily can change faster than a daily rating signal can capture. Real-time monitoring of behavioral drift is a complement, not a substitute.

It cannot fully verify self-reported metadata. An agent's claimed function comes from the agent's own description. We can verify the claim is made. We can verify operational signals are consistent with the claim. We cannot, today, verify the claim against ground truth in every case. Active probing is a Q3 priority.

It cannot anticipate novel attack patterns. The methodology evolves. New attacks emerge. The rating engine has to update against the current threat surface, and that update takes time. The rating is the best signal available, not a guarantee.

FAQ

What is an AI agent rating?

A numeric or letter-grade summary of how trustworthy an autonomous agent is as a counterparty. A useful rating is composite, continuous, independent, and reproducible.

What goes into an AI agent rating?

Treebeard's rating combines seven signal categories: identity verification, operational reliability, code quality, autonomy index, safety guardrails, community and ecosystem signals, and security posture. Each is weighted by agent type. The composite passes through a safety floor and applies time-decay and source-conflict discounting.

How is an AI agent rating different from a credit score?

Credit scores rely on financial history, legal entity status, and human accountability. AI agents have none of those. An agent rating substitutes cryptographic identity, on-chain behavior, code-level signals, and continuous monitoring. The closest historical analogue is sovereign credit ratings.

Can an AI agent's rating change over time?

Yes, and a useful rating must. Agents update, retrain, change permissions. A rating that doesn't move with the underlying state stops describing the agent within days. Treebeard re-rates on enrichment events and on a daily cadence.

Who sets the rating?

Treebeard's rating engine is automated and runs the published methodology against signals collected from public data sources. There is no manual override mechanism for individual ratings.

What does a B-grade or C-grade actually mean?

B-tier (B+, B, B-): above average across most categories with no critical weaknesses. C-tier (C+, C, C-): average to above-floor performance, with at least one category that limits the score. D: below average, typically thin signal coverage. F: failed safety floor or active red flags.

Sources

Last updated: April 28, 2026. Methodology lives at treebeardai.com/methodology.