How to get your AI skill certified, display your badge, and maintain certification over time.
Write a JSON manifest describing inputs, outputs, dependencies, and failure modes.
Three-pillar scoring: functional (40%), security (35%), performance (25%). Fully automated.
Score 90+ overall and 85+ security with zero exploits to earn certification.
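For example, pillar scores of 92 (functional), 88 (security), and 90 (performance) combine to 0.40 × 92 + 0.35 × 88 + 0.25 × 90 = 90.1 overall, which clears the bar because the overall score is at least 90, the security score is at least 85, and no exploits were found. The individual scores here are purely illustrative.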
Review your report, fix the issues, and submit again. Most skills fail on the first attempt.
Your manifest tells the evaluator what your skill does; richer manifests score higher in functional evaluation. A complete example follows the field descriptions below.
What your skill accepts. Be specific.
What your skill returns. Name each output field. The evaluator validates these against actual responses.
External packages. These get audited for known CVEs during security evaluation.
How your skill can fail. Declaring failure modes shows maturity and helps test error handling.
Plain English description. Used to generate functional test scenarios.
Network, file system, or other permissions needed. Narrower scopes score better in security.
How you handle user data. "No data stored" is the gold standard.
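Putting the fields above together, here is a sketch of what a complete manifest might look like. The field names and schema shown here are illustrative assumptions drawn from the descriptions above, not the definitive format.

```json
{
  "name": "invoice-parser",
  "description": "Extracts line items, totals, and vendor details from PDF invoices.",
  "inputs": {
    "invoice_pdf": "Base64-encoded PDF, max 10 MB"
  },
  "outputs": {
    "vendor_name": "string",
    "line_items": "array of {description, quantity, unit_price}",
    "total": "number, in the invoice currency"
  },
  "dependencies": ["pdfminer.six==20231228", "pydantic>=2.0"],
  "failure_modes": [
    "Scanned invoices without a text layer return a NEEDS_OCR error",
    "Invoices over 10 MB are rejected with PAYLOAD_TOO_LARGE"
  ],
  "permissions": {
    "network": "none",
    "filesystem": "read-only temp directory"
  },
  "data_handling": "No data stored; invoices are processed in memory and discarded."
}
```

Note how the permissions are scoped as narrowly as possible and data_handling states that nothing is stored, both of which the security pillar rewards.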
Once certified, show it. Three options with copy-paste code.
Here is what the badge looks like when embedded: [certified badge preview]
The badge updates automatically. If certification is suspended, the badge reflects that immediately.
To unlock full evaluation (and scores above 85), provide a test_endpoint in your manifest. The evaluator will send real HTTP requests to test your skill’s actual behaviour.
Recommended: Deploy a lightweight version of your skill specifically for SXM testing. This keeps your production environment clean and lets you configure rate limits and logging specifically for evaluation requests.
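As a sketch, the relevant manifest entries might look like the following. The URL and timeout value are placeholders, and test_timeout_ms is described just below.

```json
{
  "test_endpoint": "https://skill-eval.example.com/v1/run",
  "test_timeout_ms": 5000
}
```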
You can also set test_timeout_ms to cap how long the evaluator waits for each test request.
Certification is not a one-off event. It is an ongoing relationship. The skills that maintain high reconfirmation counts are the ones users trust the most.