Eval · SkillIndex

A credit score for your agents.

Your prompt's probably fine. Let me double-check. A 0–100 score for any prompt, skill, or agent. Seven-dimension breakdown. Execution test. Category percentile. Three minutes.

Real transformation · 3 minutes

From vague prompt to certified skill.

Watch how Eval turns a weak, ad-hoc prompt into a scored, structured skill a whole team can run.

Before

SkillIndex · 34

Ad-hoc prompt

“Write a cold outreach email for our SaaS product to VP-level buyers.”

No role or voice defined
Skips intake — Claude has to guess the ICP
No deliverable format — gets a single paragraph, not a sequence
Cannot be rerun by a teammate for a different account

After

SkillIndex · 89 · Certified

Structured skill

“You are a senior B2B SaaS SDR. Phase 1: intake on company, buyer persona, pain point, competitor context. Phase 2: build the hook using SPIN. Phase 3: produce a 4-email sequence with subject lines, CTAs, and follow-up timing. Phase 4: deploy checklist. Phase 5: measure reply rate and adjust.”

Clear SDR role + tone of voice
Intake walks Claude through ICP, pain, context
4-email sequence with structured format + send timing
Any teammate can run it + get consistent quality

Rewriting is $20 extra on top of the $10 Eval. Takes 2 minutes. Run one on your own skill →

Start an evaluation

Drop your skill.

Sample SkillIndex badge

/ 100

Easy Carl Certified

cold-call-script

Sales · 73rd percentile · Consistency 0.91 · 3/3 scenarios passed

What we measure

Seven dimensions, one score.

Structure

Valid YAML, 5-phase format, placeholder tokens, length in band

Triggering

Trigger-word coverage, overlap avoidance, activation accuracy

Specificity

Named frameworks, numeric benchmarks, industry terminology density

Completeness

Intake depth, decision trees, variants, deployment checklist

Deliverable

Produces an artifact, multi-variant templates, deployment-ready

Measurability

KPIs, numeric targets, reporting cadence, optimization triggers

Safety

Harmful-pattern scan, legal disclaimers, PII guidance, compliance

Total

100

Weighted composite
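As a rough illustration of how seven dimension scores could roll up into one 0-100 number, here is a minimal sketch. The weights below are hypothetical (this page does not publish the real ones), and the function name `skill_index` and the 0-to-1 input scale are assumptions made for the example.

```python
# Hypothetical weights, chosen only so they sum to 100.
# The actual SkillIndex weighting is not stated on this page.
WEIGHTS = {
    "structure": 15,
    "triggering": 15,
    "specificity": 15,
    "completeness": 15,
    "deliverable": 15,
    "measurability": 15,
    "safety": 10,
}


def skill_index(dimension_scores: dict[str, float]) -> int:
    """Combine per-dimension scores (each in [0, 1]) into a 0-100 composite."""
    missing = WEIGHTS.keys() - dimension_scores.keys()
    if missing:
        raise ValueError(f"missing dimensions: {sorted(missing)}")

    total = 0.0
    for name, weight in WEIGHTS.items():
        score = dimension_scores[name]
        if not 0.0 <= score <= 1.0:
            raise ValueError(f"{name} score must be between 0 and 1, got {score}")
        total += weight * score  # each dimension contributes up to its weight
    return round(total)


if __name__ == "__main__":
    example = {name: 0.5 for name in WEIGHTS}  # made-up scores for illustration
    print(skill_index(example))
```

Because the weights sum to 100, a skill that maxes every dimension lands on exactly 100, and each dimension's share of the total is easy to read off.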

How it works

Four steps. Three minutes.

Drop in your prompt

Paste the text or upload the file. SKILL.md, system prompt, agent definition — whatever you've got. Encrypted at rest. I never share it.

Pay $10, sit back

Grading takes about 3 minutes. I'll email you when it's done.

Read the scorecard

SkillIndex score, 7-dimension breakdown, execution test results, category percentile. Receipts for every score — quoted lines from your own skill.

Fix it, or pay me to

Apply the specific recommendations yourself. Or add $20 and I'll send the rewritten version in two minutes.

What makes Eval different

Built for skill creators. Not LLM engineers.

Activation simulation

We test your skill against 20 synthetic user messages and show you which ones would trigger it. Reveals exactly what to add to your frontmatter description.
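To make the idea concrete, here is a minimal sketch of an activation check. It uses plain keyword overlap as a stand-in, which is an assumption: the page does not say how Eval decides whether a message triggers a skill. The function names, stopword list, and `min_overlap` threshold are all hypothetical.

```python
import re

# Small illustrative stopword list (hypothetical, not from Eval).
STOPWORDS = {"a", "an", "the", "to", "of", "for", "and", "or", "me", "my", "with", "this", "our"}


def content_words(text: str) -> set[str]:
    """Lowercase the text, split on non-alphanumerics, and drop stopwords."""
    return set(re.findall(r"[a-z0-9]+", text.lower())) - STOPWORDS


def simulate_activation(description: str, messages: list[str], min_overlap: int = 2) -> dict[str, bool]:
    """Return, per message, whether it shares enough words with the description to trigger."""
    trigger_words = content_words(description)
    return {
        message: len(content_words(message) & trigger_words) >= min_overlap
        for message in messages
    }


if __name__ == "__main__":
    # Made-up description and messages for illustration.
    description = "Write cold outreach email sequences for B2B SaaS sales to VP-level buyers"
    messages = [
        "Draft a cold email to a VP of Sales",
        "Help me with outreach",
        "Summarize this meeting transcript",
    ]
    for message, fired in simulate_activation(description, messages).items():
        print(f"{'TRIGGERS' if fired else 'misses  '}  {message}")
```

A message that misses, like the second one here, points at wording the description may need to cover.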

Skill DNA analysis

We extract the frameworks your skill references and compare them to top-performing skills in the same category. "You cite SPIN but not MEDDIC — 87% of top B2B sales skills use both."

Benchmark percentile

Every submission is ranked against its category cohort, drawn from our library of 600+ reference skills. Know exactly where you stand.
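A category percentile can be sketched as "what share of the cohort scored lower than you." The exact definition Eval uses is not stated here, so treat the strictly-below convention, the function name `percentile_rank`, and the cohort numbers as assumptions for illustration.

```python
from bisect import bisect_left


def percentile_rank(score: float, cohort_scores: list[float]) -> int:
    """Percentage of cohort scores strictly below `score`, rounded to a whole number."""
    if not cohort_scores:
        raise ValueError("cohort_scores must not be empty")
    ordered = sorted(cohort_scores)
    below = bisect_left(ordered, score)  # count of scores strictly less than `score`
    return round(100 * below / len(ordered))


if __name__ == "__main__":
    # Made-up cohort of ten reference scores, not real Eval data.
    cohort = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
    print(percentile_rank(75, cohort))
```

Ties count as "not below" under this convention, so two skills with the same score get the same percentile.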

Staleness detection

Flags outdated tools and tactics. "Your email marketing skill references Mailchimp but not Kit, Beehiiv, or Klaviyo, all dominant from 2025 onward."

Pricing

Pay once, or go unlimited.

Per skill

Eval

$10

One-time evaluation

Start Eval

✓ Full SkillIndex report
✓ Execution test + percentile
✓ Shareable URL + PDF
✓ +$20 for rewritten version

Know what your prompts are worth.

Three minutes. Ten dollars.

Grade a prompt — $10

Already deployed? → Take the full AI Readiness Assessment