An independent rating agency for autonomous software tools

Sagentum assesses MCP servers — the tools AI agents call in autonomous loops — across 8 dimensions specifically designed for production agent use. Every score is evidence-based, every methodology decision is published, and the assessment process is structurally independent of the servers being assessed.

Why this doesn't exist yet

Every existing MCP registry has a structural conflict of interest. The platforms that host MCP servers profit from those servers being used — which means their quality signals, however well-intentioned, are shaped by the same financial relationships that compromise credit rating agencies and audit firms.

Smithery — the most complete developer platform for MCP — concluded it could not credibly assess server quality while also profiting from hosting those servers. So it dropped quality scoring and pivoted to infrastructure. The platform best positioned to solve the quality problem could not do it, for structural reasons.

The official MCP Registry is governed by committee across Anthropic, GitHub, Microsoft, and the Linux Foundation. Opinionated quality scoring creates political problems it cannot absorb. It is infrastructure plumbing — a canonical index other tools build on — not an assessor.

Sagentum was built to be the thing none of them can be: an independent assessor with no financial relationship with any server it scores.

What the assessment actually covers

MCP servers are not generic APIs. They are tools that AI agents call in autonomous loops, often without human oversight between calls. The failure modes are different:

  • An agent calling a server with an undocumented destructive side effect may irreversibly delete data.
  • An agent that cannot parse a server's error response may retry indefinitely, cascading failures across a workflow.
  • A server whose tool descriptions are written for humans rather than for models can lead the model to select the wrong tool, or to use the right tool incorrectly.
  • A server leaking credentials in response headers compromises the agent's entire environment.

The 8-dimension standard was built to surface exactly these failure modes. The methodology is published in full at /methodology — not as a summary, but as the complete scoring criteria, formula, and evidence requirements used for every assessment.
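To illustrate why publishing the formula matters, here is a minimal sketch of a dimension-weighted score. The dimension names and weights below are hypothetical, invented for the example; the real criteria, weights, and formula are the ones published at /methodology. The point is only that a published formula lets anyone recompute a score from the per-dimension evidence.

```python
# Hypothetical sketch of a dimension-weighted score over 8 dimensions.
# Dimension names and weights are illustrative only; the authoritative
# criteria and formula are published at /methodology.

WEIGHTS = {
    "documentation": 0.15,
    "error_handling": 0.15,
    "side_effect_safety": 0.20,
    "tool_descriptions": 0.10,
    "credential_hygiene": 0.15,
    "protocol_compliance": 0.10,
    "reliability": 0.10,
    "observability": 0.05,
}  # weights sum to 1.0

def weighted_score(dimension_scores: dict[str, float]) -> float:
    """Combine per-dimension scores (each 0-100) into one 0-100 score."""
    if set(dimension_scores) != set(WEIGHTS):
        raise ValueError("scores must cover exactly the assessed dimensions")
    return round(sum(WEIGHTS[d] * s for d, s in dimension_scores.items()), 1)

# A server scoring 80 on every dimension gets 80.0 overall.
print(weighted_score({d: 80.0 for d in WEIGHTS}))  # 80.0
```

Because the weights are fixed and public, any reader who disputes a score can recompute it from the same evidence, which is exactly what the dispute process relies on.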

The independence model

Independence is not a marketing claim at Sagentum. It is a documented structural constraint:

Sagentum does not host MCP servers.
Sagentum does not charge for placement that affects scores.
Sagentum does not accept investment from MCP server hosts, MCP registries, or AI model providers.
Vendor certification fees pay for the assessment process — documentation review, live testing, formal report — regardless of outcome: vendors pay whether they pass or fail. The fee buys the work of assessment, not a favourable result.

The capital structure constraint — no investment from conflicted parties — is decided permanently, not case by case. Rating agencies lose their independence precisely when they begin making exceptions to this rule under financial pressure. The constraint is documented here, not promised verbally.

Cross-model neutrality

Sagentum's assessment criteria make no assumption about which AI model calls the server. The live test suite is model-agnostic — it calls MCP servers directly via the protocol, not through any particular model's API. A score means the same thing whether you are building on Claude, GPT-4o, Gemini, or any other model.

This is the specific defence against assessment being internalised by model providers. A trust score built by Anthropic is, by definition, biased toward servers that work well with Claude. A score built by OpenAI is biased toward the GPT ecosystem. Cross-model neutrality is something no model provider can credibly offer about their own scoring system. It is Sagentum's structural moat against that threat.

Who runs Sagentum

Sagentum is a solo project, currently operating as a sole trader based in the UK. The assessment pipeline is automated — static analysis, live endpoint testing, content generation — with human review for every record before publication.

The pipeline source is not public, but the methodology, scoring rubric, prompt structure, and evidence requirements are fully published. Any developer can read the criteria and verify whether a published assessment correctly applies them. The dispute process exists precisely for cases where it does not.

Sagentum will incorporate as a UK Ltd company before charging any business for a formal certification assessment. The legal entity formation is a credibility and liability decision, not a growth milestone — enterprises and professional developers expect to pay a legal entity, not an individual.

The long-term thesis

As the MCP ecosystem matures, the problem shifts from discovery to evaluation. When 20,000 servers exist and 50 of them claim to execute Python in a sandbox, the question is not which ones exist — it is which ones are safe to run in an autonomous agent loop.

The analogy is to financial ratings agencies before structured finance, or SOC2 before enterprise SaaS procurement. The standard did not exist until someone built it. Once it existed, enterprise buyers required it. Sagentum is the early attempt at that standard for agentic AI tool selection.

The most valuable thing being built here is not the scores, the UI, or the pipeline. It is the time-indexed record of tool behaviour. A competitor starting in Year 3 cannot go back and assess how servers behaved in Year 1. The longitudinal database is the irreplaceable asset — and it is built by solving current problems well.

Contact and disputes

Score disputes require specific counter-evidence — a quote from documentation that contradicts the assessment, or a test result showing different behaviour. Valid disputes are reviewed within 48 hours and resolved with a public changelog note. The dispute process is at /methodology.

Server developers who want to opt out of live endpoint testing can email testing-opt-out@sagentum.com. Opting out does not subtract points; instead, the server receives a Documentation + Static Analysis assessment, which carries a lower maximum achievable score.

For everything else: hello@sagentum.com