🔗 View on GitHub: anthropics/skills/skill-creator
🚀 Quick Installation
Create new skills, modify and improve existing skills, and measure skill performance.
Claude Code Plugin
/plugin marketplace add anthropics/skills
/plugin install example-skills@anthropic-agent-skillsWhat It Does
A comprehensive skill for creating, improving, and evaluating Agent Skills. Use this when you want to:
- ✨ Create a skill from scratch
- 🔧 Edit or optimize an existing skill
- 📊 Run evals to test a skill
- 📈 Benchmark skill performance with variance analysis
- 🎯 Optimize a skill's description for better triggering accuracy
The Skill Creation Process
- Decide — What should the skill do and how should it do it?
- Draft — Write the initial SKILL.md
- Test — Create test prompts and run Claude with the skill
- Evaluate — Review results qualitatively and quantitatively
- Iterate — Rewrite based on feedback
- Scale — Expand the test set and try again at larger scale
Key Features
Skill Structure
skill-name/
├── SKILL.md (required)
│ ├── YAML frontmatter (name, description required)
│ └── Markdown instructions
└── Bundled Resources (optional)
├── scripts/ - Executable code for deterministic tasks
├── references/ - Docs loaded into context as needed
└── assets/ - Templates, icons, fontsEvaluation System
- Parallel Testing — Run with-skill vs baseline simultaneously
- Quantitative Metrics — Token usage, timing, pass rates
- Qualitative Review — Browser-based eval viewer for human feedback
- Benchmark Analysis — Statistical comparison with variance analysis
Description Optimization
- Generate 20 realistic eval queries (should-trigger vs should-not-trigger)
- Run automated optimization loop (up to 5 iterations)
- Select best description by test score (not train score)
Best Practices
- ✅ Keep SKILL.md under 500 lines
- ✅ Use imperative form in instructions
- ✅ Explain the "why" behind instructions
- ✅ Include realistic examples
- ✅ Make descriptions "pushy" to combat undertriggering
- ❌ Avoid heavy-handed MUSTs and rigid structures
Example Workflow
"I want to make a skill for X"
↓
Interview → Draft → Test → Evaluate → Iterate → Package