Sagentum assesses MCP servers — the tools AI agents call in autonomous loops — across 8 dimensions specifically designed for production agent use. Every score is evidence-based, every methodology decision is published, and the assessment process is structurally independent of the servers being assessed.
Every existing MCP registry has a structural conflict of interest. The platforms that host MCP servers profit from those servers being used — which means their quality signals, however well-intentioned, are shaped by the same financial relationships that compromise credit rating agencies and audit firms.
Smithery, the most feature-complete MCP developer platform, concluded they couldn't credibly assess server quality while also profiting from hosting those servers. So they dropped quality scoring and pivoted to infrastructure. The platform best positioned to solve the quality problem couldn't do it, for structural reasons.
The official MCP Registry is governed by committee across Anthropic, GitHub, Microsoft, and the Linux Foundation. Opinionated quality scoring creates political problems it cannot absorb. It is infrastructure plumbing — a canonical index other tools build on — not an assessor.
Sagentum was built to be the thing none of them can be: an independent assessor with no financial relationship with any server it scores.
MCP servers are not generic APIs. They are tools that AI agents call in autonomous loops, often without human oversight between calls. The failure modes are different from those of APIs built for human-driven applications.
The 8-dimension standard was built to surface exactly these failure modes. The methodology is published in full at /methodology — not as a summary, but as the complete scoring criteria, formula, and evidence requirements used for every assessment.
Independence is not a marketing claim at Sagentum. It is a documented structural constraint.
The capital structure constraint — no investment from conflicted parties — is decided permanently, not case by case. Rating agencies lose their independence precisely when they begin making exceptions to this rule under financial pressure. The constraint is documented here, not promised verbally.
Sagentum's assessment criteria make no assumption about which AI model calls the server. The live test suite is model-agnostic — it calls MCP servers directly via the protocol, not through any particular model's API. A score means the same thing whether you are building on Claude, GPT-4o, Gemini, or any other model.
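To make that concrete, here is a minimal sketch of what a protocol-level probe looks like: a raw JSON-RPC 2.0 exchange over the MCP stdio transport, with no model provider SDK anywhere in the loop. The server command, client name, and protocol version string are illustrative, not Sagentum's actual test harness.

```python
import json
import subprocess

# Launch the server under test. The command is hypothetical; any MCP server
# speaking the stdio transport is driven the same way.
proc = subprocess.Popen(
    ["example-mcp-server"],
    stdin=subprocess.PIPE,
    stdout=subprocess.PIPE,
    text=True,
)

def rpc(method: str, params: dict, msg_id: int) -> dict:
    """Send one JSON-RPC 2.0 request and read one newline-delimited response."""
    proc.stdin.write(json.dumps({
        "jsonrpc": "2.0", "id": msg_id, "method": method, "params": params,
    }) + "\n")
    proc.stdin.flush()
    return json.loads(proc.stdout.readline())

# MCP handshake: an initialize request followed by the initialized notification.
rpc("initialize", {
    "protocolVersion": "2025-03-26",  # illustrative protocol revision
    "capabilities": {},
    "clientInfo": {"name": "probe", "version": "0.1"},
}, 1)
proc.stdin.write(json.dumps(
    {"jsonrpc": "2.0", "method": "notifications/initialized"}) + "\n")
proc.stdin.flush()

# Enumerate tools: the same request, and the same answer, no matter which
# model would eventually drive this server.
tools = rpc("tools/list", {}, 2)
print([tool["name"] for tool in tools["result"]["tools"]])
```

Because the probe speaks the wire protocol itself, every assertion the test suite makes is about server behaviour, never about any one model's tool-calling conventions.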
This is the specific defence against assessment being internalised by model providers. A trust score built by Anthropic is, by definition, biased toward servers that work well with Claude. A score built by OpenAI is biased toward the GPT ecosystem. Cross-model neutrality is something no model provider can credibly offer about their own scoring system. It is Sagentum's structural moat against that threat.
Sagentum is a solo project, currently operating as a sole trader based in the UK. The assessment pipeline is automated — static analysis, live endpoint testing, content generation — with human review for every record before publication.
The pipeline source is not public, but the methodology, scoring rubric, prompt structure, and evidence requirements are fully published. Any developer can read the criteria and verify whether a published assessment correctly applies them. The dispute process exists precisely for cases where it does not.
Sagentum will incorporate as a UK Ltd company before charging any business for a formal certification assessment. The legal entity formation is a credibility and liability decision, not a growth milestone — enterprises and professional developers expect to pay a legal entity, not an individual.
As the MCP ecosystem matures, the problem shifts from discovery to evaluation. When 20,000 servers exist and 50 of them claim to execute Python in a sandbox, the question is not which ones exist — it is which ones are safe to run in an autonomous agent loop.
The analogy is to credit rating agencies before structured finance, or SOC 2 before enterprise SaaS procurement. The standard did not exist until someone built it. Once it existed, enterprise buyers required it. Sagentum is the early attempt at that standard for agentic AI tool selection.
The most valuable thing being built here is not the scores, the UI, or the pipeline. It is the time-indexed record of tool behaviour. A competitor starting in Year 3 cannot go back and assess how servers behaved in Year 1. The longitudinal database is the irreplaceable asset — and it is built by solving current problems well.
Score disputes require specific counter-evidence — a quote from documentation that contradicts the assessment, or a test result showing different behaviour. Valid disputes are reviewed within 48 hours and resolved with a public changelog note. The dispute process is at /methodology.
Server developers who want to opt out of live endpoint testing can email testing-opt-out@sagentum.com. Opting out is not treated as a penalty: the server instead receives a Documentation + Static Analysis assessment, which carries a lower score ceiling because live behaviour cannot be verified.
For everything else: hello@sagentum.com