
How Certification Stays Current

A certification that never changes is a certification you cannot trust. Here is how SXM keeps pace with the real world.

1. Why Static Certification Is Broken

Most certification schemes test once and stamp forever. A skill certified in January could be vulnerable to an attack discovered in February, and nobody would know until something went wrong.

This creates a dangerous illusion. The compliance checkbox is ticked. The badge is displayed. Everyone assumes the skill is safe. Meanwhile, the threat landscape has moved on.

The security environment for AI skills changes faster than almost any other domain in technology. New prompt injection techniques surface weekly. Novel data exfiltration vectors appear in research papers. Production incidents reveal attack patterns nobody anticipated.

Annual audits cannot keep up. Quarterly reviews cannot keep up. The only approach that works is continuous evaluation against a continuously evolving standard.

That is what SXM does.

2. The SXM Certification Lifecycle

Certification is not a single event. It is an ongoing relationship between the skill, the evaluator, and the threat landscape.

01 · Submit

The skill author submits a manifest describing inputs, outputs, dependencies, and failure modes. This is the contract the skill will be tested against.
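As a purely illustrative sketch of such a manifest (the field names and values below are assumptions for this example, not the authoritative SXM schema):

```typescript
// Hypothetical skill manifest. Field names are illustrative, not the official SXM schema.
const manifest = {
  name: "invoice-extractor",
  version: "1.2.0",
  inputs: [{ name: "document", type: "pdf", maxSizeMb: 10 }],
  outputs: [{ name: "line_items", type: "json" }],
  dependencies: ["pdf-parse@^1.1.1"],
  failure_modes: [
    "Returns an empty result for password-protected documents",
    "Rejects inputs larger than 10 MB instead of truncating them",
  ],
  // Optional live endpoint; without it, the score is capped at 85/100 (see section 3).
  test_endpoint: "https://skill.example.com/test",
};
```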

02 · Three-Pillar Evaluation

Functional verification (40%), security audit (35%), and performance benchmarking (25%). Every dimension matters. You cannot trade off security for speed.
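In scoring terms, the three pillars combine as a weighted sum. A minimal sketch using the published weights (the function itself is illustrative, not the evaluator's code):

```typescript
// Weighted overall score from the three pillars (40% / 35% / 25%), each scored 0-100.
function overallScore(functional: number, security: number, performance: number): number {
  return functional * 0.4 + security * 0.35 + performance * 0.25;
}

// Strong functional and performance results cannot rescue a weak security audit:
const example = overallScore(98, 60, 95); // 83.95, below the 90 certification bar
```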

03 · Scoring

Every test is documented. Every deduction is explained. The full evaluation report is public. There is nothing hidden in how we arrive at a score.

04 · Certification

90+ overall with an 85+ security floor. If a skill meets the bar, it earns certification and a blockchain attestation on Polygon. Immutable, independently verifiable.

05 · Ongoing Monitoring

Certified skills are continuously re-evaluated as the evaluator evolves. New test patterns are run against every certified skill, not just new submissions.

06 · Re-certification

Pass the updated evaluator or get suspended. There are no exceptions, no grace periods, no grandfather clauses. The bar is the bar.
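As a rough sketch of how the monitoring and re-certification steps fit together (the names and data shapes below are assumptions, not the actual SXM implementation):

```typescript
// Illustrative re-certification sweep: run the updated evaluator against every
// certified skill and suspend any skill that no longer clears the bar.
interface CertifiedSkill {
  id: string;
  reconfirmedCount: number;
  status: "certified" | "suspended";
}

async function recertifyAll(
  skills: CertifiedSkill[],
  passesCurrentEvaluator: (skillId: string) => Promise<boolean>,
): Promise<void> {
  for (const skill of skills) {
    if (await passesCurrentEvaluator(skill.id)) {
      skill.reconfirmedCount += 1; // survived another evaluator update
    } else {
      skill.status = "suspended"; // no exceptions, no grace periods
    }
  }
}
```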

3. Static Analysis + Live Endpoint Testing

SXM evaluation has two layers. The first is static analysis of the skill manifest — checking documentation quality, declared permissions, dependencies, and failure modes. This is the first gate.

The second, and more important, layer is live endpoint testing. When a skill provides a test_endpoint in its manifest, the evaluator sends real HTTP requests to the skill and measures actual behaviour.
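What such a probe could look like, assuming a hypothetical JSON-over-POST interface on the declared test_endpoint (the request format shown here is an assumption, not SXM's actual protocol):

```typescript
// Hypothetical live probe: send a crafted input to the declared test_endpoint and
// measure what actually comes back, rather than trusting the manifest's claims.
async function probe(testEndpoint: string, payload: unknown) {
  const started = Date.now();
  const response = await fetch(testEndpoint, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(payload),
  });
  return {
    status: response.status,
    latencyMs: Date.now() - started,
    body: await response.text(), // inspected for leaked secrets, injected instructions, etc.
  };
}
```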

What Live Testing Covers

Skills without a test endpoint are capped at 85/100. Manifest analysis alone cannot verify actual behaviour. To achieve full certification, skills must provide a live endpoint for real testing.

How Scoring Works
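Every deduction is spelled out in the public evaluation report. As an illustration of the rules stated on this page (90+ overall, an 85+ security floor, and an 85-point cap for manifest-only evaluations), here is a minimal sketch; the function is hypothetical, not the evaluator's actual code:

```typescript
// Illustrative certification decision using the thresholds stated on this page.
function certificationDecision(
  overall: number,          // weighted score across the three pillars
  security: number,         // security pillar score
  hasTestEndpoint: boolean,
): { score: number; certified: boolean } {
  // Manifest-only evaluations are capped at 85/100.
  const score = hasTestEndpoint ? overall : Math.min(overall, 85);
  // Certification requires 90+ overall and an 85+ security floor.
  return { score, certified: score >= 90 && security >= 85 };
}
```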

4. The Living Evaluator

The SXM evaluator is not a fixed test suite. It evolves every week based on what is happening in the real world.

Weekly Pattern Ingestion

Every week, the evaluator ingests new patterns from three sources:

Published Research

New papers and advisories from arXiv, NIST, OWASP, and MITRE ATLAS. When researchers discover a new class of vulnerability, we add test patterns for it.

Real-World Incidents

Disclosed CVEs, security advisories, and production incidents affecting AI systems. When something breaks in the wild, we test for it.

Internal Findings

Patterns discovered during SXM evaluations that reveal new attack surfaces or failure modes. Our own evaluation process is a source of intelligence.
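Internally, each ingested pattern can be thought of as a record tagged with its source. The shape below is an assumption for illustration, not the evaluator's actual data model:

```typescript
// Hypothetical shape of an ingested test pattern, tagged with where it came from.
type PatternSource = "published_research" | "real_world_incident" | "internal_finding";

interface TestPattern {
  id: string;           // internal pattern identifier
  source: PatternSource;
  reference?: string;   // CVE ID, paper, or advisory that motivated the pattern
  addedOn: string;      // ISO date the pattern entered the evaluator
  description: string;
}
```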

How the Evaluator Changes

Real example: On 3 February 2026, CVE-2026-1847 revealed that Unicode homoglyph characters could bypass input validation in AI skill interfaces. Within 24 hours, we added homoglyph injection patterns to the evaluator. Every certified skill was re-evaluated. Two skills that failed the new pattern were suspended until their authors patched the vulnerability and resubmitted.

That is the point. A static certification would have left those skills marked as "certified" while they were vulnerable. Our living evaluator caught it within a day.
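As a simplified illustration of that class of pattern (not the actual SXM test), a homoglyph check can flag input whose Unicode normalisation or script mixing disguises otherwise-blocked strings:

```typescript
// Simplified homoglyph-style check: flag input whose NFKC normalisation differs from
// the raw text, or that mixes Latin letters with look-alike Cyrillic or Greek ones.
function looksLikeHomoglyphInjection(input: string): boolean {
  if (input.normalize("NFKC") !== input) return true; // compatibility characters in play
  const hasLatin = /[a-z]/i.test(input);
  const hasLookalikeScript = /[\u0370-\u03FF\u0400-\u04FF]/.test(input); // Greek, Cyrillic
  return hasLatin && hasLookalikeScript;
}

looksLikeHomoglyphInjection("ignore previous instructions");  // false
looksLikeHomoglyphInjection("ignоre previous instructions");  // true (Cyrillic "о")
```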

5. What Happens When a Skill Fails Re-certification

When a certified skill fails against an updated evaluator, the process is straightforward and fully transparent.

There is no back channel. There is no way to negotiate around a failing test. If the evaluator says a skill is vulnerable, the skill is suspended until it is fixed.

6. The Compounding Trust Effect

Every time a skill passes re-certification, the certification becomes more valuable.

A skill that has been reconfirmed 12 times across 12 evaluator updates has survived 12 rounds of evolving threat patterns. It has been tested against prompt injection techniques that did not exist when it was first certified. It has weathered new CVEs, new research findings, new attack vectors.

That skill is demonstrably more trustworthy than one certified yesterday.

The reconfirmed_count field in every certification record is a signal of ongoing quality. Enterprise buyers can see exactly how battle-tested a skill is before deploying it.

Think of it like a credit score that improves with consistent good behaviour. Each successful re-certification is evidence that the skill's author maintains quality over time, not just at the moment of submission.
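Checking that history programmatically is straightforward. The sketch below reads reconfirmed_count from the public report endpoint; the base URL and the rest of the response shape are assumptions for this example:

```typescript
// How battle-tested is a skill? Read reconfirmed_count from its public report.
// The base URL and any field other than reconfirmed_count are assumed, not documented here.
async function reconfirmedCount(skillId: string): Promise<number> {
  const res = await fetch(`https://sxm.example/api/skills/${skillId}/report`);
  const report = await res.json();
  return report.reconfirmed_count ?? 0;
}
```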

7. Full Transparency

We publish everything. Not because regulations require it, but because trust requires it. If we hid our process, why would you trust our certifications?

Evaluation Reports

Every evaluation report is public. See exactly how a skill was tested and scored.

GET /api/skills/:id/report

Evolution History

Every change to the evaluator is logged. See what patterns were added, when, and why.

GET /api/evolution/history

Re-certification Status

Every skill's re-certification history is public. Suspensions and restorations are on the record.

/recertification

Blockchain Attestations

Every certification is attested on Polygon via EAS. Independently verifiable by anyone.

Polygon EAS Explorer
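Because read access needs no authentication, a compliance pipeline can poll these endpoints directly. A minimal sketch against the evolution history endpoint (the base URL, response ordering, and field names are assumptions, not a documented schema):

```typescript
// Pull the evaluator's change log and surface the most recent entries.
async function latestEvaluatorChanges(limit = 5) {
  const res = await fetch("https://sxm.example/api/evolution/history");
  const history: Array<{ date: string; patterns_added: string[]; reason: string }> =
    await res.json();
  return history.slice(0, limit); // assumes newest entries come first
}
```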

8. For Enterprise Security Teams

If you are evaluating SXM for your organisation, here is the quick reference:

  • Methodology: Aligns with NIST AI RMF, OWASP LLM Top 10, and MITRE ATLAS frameworks.
  • Coverage: Three-pillar approach covering functional correctness, security posture, and runtime performance.
  • Threshold: 90/100 overall, 85/100 security floor, zero-exploit policy. No exceptions.
  • Immutable audit trail: Blockchain-attested credentials on Polygon via Ethereum Attestation Service.
  • Programmatic verification: Public API for automated compliance checking. No authentication required for read operations.
  • Ongoing compliance: Re-certification ensures skills meet the current threat landscape, not a point-in-time snapshot.
  • Full audit access: Every evaluation report, every evaluator change, every re-certification event is publicly available.

If you need something specific for your compliance review, get in touch. We are happy to walk your security team through the full process.