Eval · SkillIndex

A credit score for your agents.

Your prompt's probably fine. Let me double-check. A 0–100 score for any prompt, skill, or agent. Seven-dimension breakdown. Execution test. Category percentile. Three minutes.

Real transformation · 3 minutes

From vague prompt to certified skill.

Watch how Eval turns a weak, ad-hoc prompt into a scored, structured skill a whole team can run.

Before

SkillIndex · 34

Ad-hoc prompt

“Write a cold outreach email for our SaaS product to VP-level buyers.”
  • No role or voice defined
  • Skips intake — Claude has to guess the ICP
  • No deliverable format — gets a single paragraph, not a sequence
  • Cannot be rerun by a teammate for a different account

After

SkillIndex · 89 · Certified

Structured skill

“You are a senior B2B SaaS SDR. Phase 1: intake on company, buyer persona, pain point, competitor context. Phase 2: build the hook using SPIN. Phase 3: produce a 4-email sequence with subject lines, CTAs, and follow-up timing. Phase 4: deploy checklist. Phase 5: measure reply rate and adjust.”
  • Clear SDR role + tone of voice
  • Intake walks Claude through ICP, pain, context
  • 4-email sequence with structured format + send timing
  • Any teammate can run it and get consistent quality

Rewriting is $20 extra on top of the $10 Eval. Takes 2 minutes. Run one on your own skill →

Start an evaluation

Drop your skill.

Up to 200,000 characters.

Accepts SKILL.md, system prompts, CLAUDE.md, or any text-based agent definition.

Secure payment via Stripe

Sample SkillIndex badge

87 / 100

Easy Carl Certified

cold-call-script

Sales · 73rd percentile · Consistency 0.91 · 3/3 scenarios passed

What we measure

Seven dimensions, one score.

Structure · 15 · Valid YAML, 5-phase format, placeholder tokens, length in band
Triggering · 15 · Trigger-word coverage, overlap avoidance, activation accuracy
Specificity · 20 · Named frameworks, numeric benchmarks, industry terminology density
Completeness · 15 · Intake depth, decision trees, variants, deployment checklist
Deliverable · 15 · Produces an artifact, multi-variant templates, deployment-ready
Measurability · 10 · KPIs, numeric targets, reporting cadence, optimization triggers
Safety · 10 · Harmful-pattern scan, legal disclaimers, PII guidance, compliance
Total · 100 · Weighted composite
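
A rough sketch of how a weighted composite like this could be computed. The seven weights are the ones listed above and the 80+ badge threshold comes from the FAQ. The per-dimension fractions, the function names, and the rounding rule are assumptions for illustration, not Eval's actual grader.

```python
# Weights are taken from the seven dimensions listed on this page.
WEIGHTS = {
    "Structure": 15,
    "Triggering": 15,
    "Specificity": 20,
    "Completeness": 15,
    "Deliverable": 15,
    "Measurability": 10,
    "Safety": 10,
}


def skill_index(fractions: dict[str, float]) -> int:
    """Combine per-dimension fractions (0.0 to 1.0) into a 0-100 score.

    The fraction inputs are hypothetical: they stand for how much of each
    dimension's weight a skill earned.
    """
    missing = set(WEIGHTS) - set(fractions)
    if missing:
        raise ValueError(f"missing dimensions: {sorted(missing)}")
    total = 0.0
    for name, weight in WEIGHTS.items():
        frac = min(max(fractions[name], 0.0), 1.0)  # clamp to [0, 1]
        total += weight * frac
    return round(total)


def is_certified(score: int) -> bool:
    # The FAQ says 80+ earns the badge.
    return score >= 80
```

With full marks on Structure, Completeness, and Safety, 0.9 on Specificity, 0.8 on Triggering and Deliverable, and 0.7 on Measurability, this sketch returns 89.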

How it works

Four steps. Three minutes.

01

Drop in your prompt

Paste the text or upload the file. SKILL.md, system prompt, agent definition — whatever you've got. Encrypted at rest. I never share it.

02

Pay $10, sit back

Grading takes about 3 minutes. I'll email you when it's done.

03

Read the scorecard

SkillIndex score, 7-dimension breakdown, execution test results, category percentile. Receipts for every score — quoted lines from your own skill.

04

Fix it, or pay me to

Apply the specific recommendations yourself. Or pay +$20 and I'll send the rewritten version in two minutes.

What makes Eval different

Built for skill creators. Not LLM engineers.

Activation simulation

We test your skill against 20 synthetic user messages and show you which ones would trigger it. Reveals exactly what to add to your frontmatter description.
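
To make the idea concrete, here is a toy version of an activation check. It assumes a skill fires when a user message shares enough words with the frontmatter description. That keyword-overlap rule, the `min_overlap` value, and every function name are hypothetical stand-ins; real activation is decided by the model, not by a word count.

```python
import re


def tokens(text: str) -> set[str]:
    """Lowercase word tokens, skipping words of three letters or fewer."""
    return {w for w in re.findall(r"[a-z0-9]+", text.lower()) if len(w) > 3}


def would_trigger(description: str, message: str, min_overlap: int = 2) -> bool:
    """Crude stand-in for activation: enough shared words counts as a match."""
    return len(tokens(description) & tokens(message)) >= min_overlap


def activation_report(description: str, messages: list[str]) -> dict[str, bool]:
    """Map each synthetic message to whether it would trigger the skill."""
    return {m: would_trigger(description, m) for m in messages}
```

Messages that come back `False` but should have fired point at words missing from the description.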

Skill DNA analysis

We extract the frameworks your skill references and compare to top-performing skills in the same category. "You cite SPIN but not MEDDIC — 87% of top B2B sales skills use both."
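
A minimal sketch of that comparison, assuming frameworks are spotted by exact, case-sensitive name. The `KNOWN_FRAMEWORKS` list and the helper names are illustrative only; how Eval actually extracts frameworks is not described on this page.

```python
import re

# Hypothetical list for illustration; a real one would be much longer.
KNOWN_FRAMEWORKS = ["SPIN", "MEDDIC", "BANT", "AIDA"]


def frameworks_cited(skill_text: str) -> set[str]:
    """Return the known framework names that appear as whole words."""
    return {
        name
        for name in KNOWN_FRAMEWORKS
        if re.search(rf"\b{re.escape(name)}\b", skill_text)
    }


def missing_frameworks(skill_text: str, cohort_common: set[str]) -> set[str]:
    """Frameworks common in the category cohort that this skill never names."""
    return cohort_common - frameworks_cited(skill_text)
```

Run against the "After" skill above with a cohort set of `{"SPIN", "MEDDIC"}`, this returns `{"MEDDIC"}`.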

Benchmark percentile

Every submission is ranked against its category cohort, drawn from our library of 600+ reference skills. Know exactly where you stand.
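
One common way to compute a percentile like this is the share of cohort scores strictly below yours. The page does not say which definition Eval uses, so treat the formula and the function name below as assumptions.

```python
from bisect import bisect_left


def percentile_rank(score: float, cohort_scores: list[float]) -> int:
    """Percent of cohort scores strictly below `score`, floored to an integer."""
    if not cohort_scores:
        raise ValueError("cohort is empty")
    ordered = sorted(cohort_scores)
    below = bisect_left(ordered, score)  # count of scores less than `score`
    return (100 * below) // len(ordered)
```

For a cohort of `[50, 60, 70, 80, 90]`, a score of 87 beats four of five entries, which this sketch reports as the 80th percentile.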

Staleness detection

Flags outdated tools and tactics. "Your email marketing skill references Mailchimp but not Kit, Beehiiv, or Klaviyo — all dominant in 2025+."
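
A small sketch of how a flag like that could be produced. The watch list below holds only the one example from the quote above; its structure and the function names are hypothetical.

```python
import re

# Hypothetical watch list: a tool mapped to newer alternatives worth naming.
WATCH_LIST = {
    "Mailchimp": ["Kit", "Beehiiv", "Klaviyo"],
}


def mentions(text: str, name: str) -> bool:
    """True if `name` appears as a whole word, ignoring case."""
    return re.search(rf"\b{re.escape(name)}\b", text, flags=re.IGNORECASE) is not None


def staleness_flags(skill_text: str) -> list[str]:
    """One flag per watched tool that is cited without its newer alternatives."""
    flags = []
    for tool, alternatives in WATCH_LIST.items():
        if not mentions(skill_text, tool):
            continue
        missing = [alt for alt in alternatives if not mentions(skill_text, alt)]
        if missing:
            flags.append(f"References {tool} but not {', '.join(missing)}")
    return flags
```

Whole-word matching matters here: without it, "Kit" would match inside "toolkit".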

Pricing

Pay once, or go unlimited.

Per skill

Eval

$10

One-time evaluation

Start Eval
  • Full SkillIndex report
  • Execution test + percentile
  • Shareable URL + PDF
  • +$20 for rewritten version

Most popular

Monthly

Eval Pro

$99/mo

For skill creators and teams

  • Unlimited evaluations
  • Version tracking + diffs
  • A/B test framework
  • Regression alerts
  • Team dashboard, up to 25 skills
  • 30% off rewrites and Agent Packages

Enterprise

Eval Team

$299/mo

100 skills, unlimited seats

Contact Sales
  • Everything in Pro
  • Up to 100 skills tracked
  • SSO + audit log
  • Custom rubrics (HIPAA, SOX, etc.)
  • API access for CI/CD
  • Quarterly portfolio review

FAQ

What's a SkillIndex score?

A 0–100 score for how well your prompt actually works — not how long it is. 80+ gets the Easy Carl Certified badge. I grade against 620+ reference skills across 35 categories.

What counts as a 'skill'?

Anything text-based that tells an AI what to do. SKILL.md, system prompts, Claude Code CLAUDE.md files, agent definitions, one-off prompts you're tired of tweaking. If it's text, I can grade it.

How does the execution test work?

I generate 3 synthetic intakes that match your skill's domain, run the skill against each, and grade the outputs. Running the same inputs multiple times also tells me how consistent it is.
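
Here is one way a loop like that could be wired up. It is a sketch under stated assumptions: `run_and_grade` is a hypothetical callable that runs the skill on one intake and returns a 0-100 grade, the pass mark of 70 is invented, and consistency is defined here as 1 minus the average score spread per intake. The real formula behind the badge's consistency number is not given on this page.

```python
from statistics import mean
from typing import Callable


def execution_test(
    run_and_grade: Callable[[str], float],  # hypothetical: one run, one 0-100 grade
    intakes: list[str],
    repeats: int = 3,
    pass_mark: float = 70.0,  # assumed threshold, not from this page
) -> dict:
    """Run each intake several times; report pass count and consistency."""
    spreads = []
    passed = 0
    for intake in intakes:
        scores = [run_and_grade(intake) for _ in range(repeats)]
        # Spread of repeated runs on the same intake, scaled to 0-1.
        spreads.append((max(scores) - min(scores)) / 100)
        if mean(scores) >= pass_mark:
            passed += 1
    return {
        "passed": f"{passed}/{len(intakes)}",
        "consistency": round(1 - mean(spreads), 2),
    }
```

A grader that returns the same score on every repeat gives a consistency of 1.0; the wider the swings between repeats, the lower the number.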

What's Eval Pro?

$99/month. Unlimited grades, version tracking so you see when a rewrite actually improves things, A/B tests, regression alerts, team dashboard. For people who grade a lot.

Are my skills private?

Yes. Encrypted at rest. Never public unless you share the report URL. Reports expire after 90 days unless you're a Member or Pro. I don't train on your stuff. I don't sell it.

Can I evaluate an agent, not just one skill?

Yes. Upload the whole agent folder. I score each skill, flag overlaps and gaps, and give you a team-level SkillIndex for the agent overall.
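
As an illustration of the overlap check, the sketch below compares trigger-word sets pairwise and averages the per-skill scores. The input shape, the Jaccard threshold of 0.5, and the plain mean for the agent-level number are all assumptions; the page does not say how the team-level SkillIndex is calculated.

```python
from itertools import combinations
from statistics import mean


def jaccard(a: set[str], b: set[str]) -> float:
    """Shared items divided by total distinct items (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def agent_report(skills: dict[str, dict], overlap_threshold: float = 0.5) -> dict:
    """`skills` maps name -> {"score": int, "triggers": set[str]} (hypothetical shape)."""
    overlaps = [
        (x, y)
        for x, y in combinations(sorted(skills), 2)
        if jaccard(skills[x]["triggers"], skills[y]["triggers"]) >= overlap_threshold
    ]
    return {
        "agent_index": round(mean(s["score"] for s in skills.values())),
        "overlaps": overlaps,
    }
```

Two skills that share most of their trigger words show up as an overlapping pair, which is a hint that they may compete for the same user messages.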

Know what your prompts are worth.

Three minutes. Ten dollars.

Grade a prompt — $10

Your prompt's probably fine.

Let me double-check for ten bucks. Drop in a prompt, skill, or agent. Get a 0–100 score in three minutes. Seven dimensions. Specific fixes.