The published research, intelligence feeds, and real-world data that drive the SXM living evaluator. Nothing here is hand-waving.
The foundational paper behind the SXM trust framework, submitted for peer review at Emergent Scholarship.
This paper establishes the theoretical and practical foundation for everything SXM does: why AI skills need independent certification, how the three-pillar framework works, and why continuous re-evaluation is non-negotiable.
The SXM evaluator evolves weekly by ingesting research from three distinct streams: published research, intelligence feeds, and real-world data. Each stream contributes real test patterns that are run against every certified skill.
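To make that flow concrete, here is a minimal, purely illustrative sketch of how ingested test patterns could be represented and re-run against certified skills. Every name in it (`TestPattern`, `CertifiedSkill`, `reevaluate`, the stream labels) is hypothetical and not part of the SXM codebase; the stream labels simply mirror the three streams named above.

```typescript
// Illustrative sketch only: each research stream yields test patterns,
// and every certified skill is re-run against the current pattern set.
// All type names and fields below are assumptions, not SXM's real API.

type Pillar = "Security" | "Functional" | "Performance";

interface TestPattern {
  id: string;
  pillar: Pillar;
  sourceStream: "published-research" | "intelligence-feed" | "real-world-data";
  run: (skill: CertifiedSkill) => Promise<boolean>; // true = skill passes
}

interface CertifiedSkill {
  name: string;
  invoke: (input: string) => Promise<string>;
}

// Re-evaluate every certified skill against every current pattern.
async function reevaluate(skills: CertifiedSkill[], patterns: TestPattern[]) {
  const failures: { skill: string; pattern: string }[] = [];
  for (const skill of skills) {
    for (const pattern of patterns) {
      if (!(await pattern.run(skill))) {
        failures.push({ skill: skill.name, pattern: pattern.id });
      }
    }
  }
  return failures;
}
```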
Curated research drops that have directly informed evaluator updates. Each entry links to the source and maps to a certification pillar.
| Date | Title | Relevance to SXM | Pillar |
|---|---|---|---|
| 10 Feb 2026 | Anthropic Safety Lead Resigns: "The World Is In Peril" | Internal safety culture at major AI providers directly affects the risk profile of skills built on their models. Validates the need for independent, external certification. | Security |
| 10 Feb 2026 | Google: The Quantum Era Is Coming. Are We Ready to Secure It? | Post-quantum cryptographic readiness informs long-term security patterns. Skills handling sensitive data must prepare for quantum-era threats. | Security |
| 10 Feb 2026 | Agent Skills: Open Source Skills Marketplace for AI Coding Agents | Growing open-source skills ecosystem increases the surface area for uncertified skills. Reinforces the market need for independent verification. | Functional |
| 10 Feb 2026 | SemiAnalysis: Claude Code Is the Inflection Point | Agentic coding tools are becoming primary development interfaces. Performance and reliability benchmarks for these tools feed directly into our performance pillar. | Performance |
| 11 Feb 2026 | When the AI Goes Dark: Enterprise Resilience for Agentic AI | Enterprise resilience patterns inform our failure-mode testing. Skills must degrade gracefully when upstream AI services become unavailable. | Functional |
| 11 Feb 2026 | Social Workers' AI Tool Makes "Gibberish" Transcripts of Children's Accounts | Real-world AI errors in high-stakes settings demonstrate why functional verification must include adversarial and edge-case inputs, not just happy-path testing. | Functional |
| 12 Feb 2026 | AI Strategies Are Kind of Destined to Fail | Enterprise AI deployment failures often stem from unverified capabilities. Independent certification reduces the risk of deploying skills that do not perform as claimed. | Performance |
| 12 Feb 2026 | Amazon Bans Claude Code, Microsoft Asks Engineers to Test It | Divergent enterprise policies on AI tools highlight the need for a neutral trust layer. SXM certification provides a common standard regardless of internal vendor politics. | Security |
| 12 Feb 2026 | ChatGPT Is in Classrooms: How Should Educators Assess Student Learning? | AI assessment in education parallels AI skill assessment. Evaluation methodology research informs how we design robust, cheat-resistant test patterns. | Functional |
Live feed from the evaluator's evolution history. Every change is logged and available via `GET /api/evolution/history`.
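A minimal sketch of how a client might poll that endpoint. Only the `/api/evolution/history` path comes from the text above; the host, response shape, and field names are assumptions for illustration.

```typescript
// Hypothetical client for the evolution history feed.
// Assumed response shape: an array of change entries.

interface EvolutionEntry {
  date: string;        // assumed ISO date of the change
  description: string; // assumed human-readable summary of the update
  pillar?: string;     // assumed pillar tag: Security, Functional, or Performance
}

async function fetchEvolutionHistory(baseUrl: string): Promise<EvolutionEntry[]> {
  const response = await fetch(`${baseUrl}/api/evolution/history`);
  if (!response.ok) {
    throw new Error(`History request failed: ${response.status}`);
  }
  return (await response.json()) as EvolutionEntry[];
}

// Example usage with a placeholder host:
// const history = await fetchEvolutionHistory("https://example.invalid");
// history.forEach((entry) => console.log(entry.date, entry.description));
```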
Found a vulnerability pattern we should test for? Discovered a new attack vector? We want to hear from you.
research@scientiaexmachina.co

When we receive a new vulnerability pattern or attack vector:
We prioritise protecting users over publishing findings. Patterns go into the evaluator first. Details come second.