1. What a skill actually is #
A Claude Code skill is a packaged capability — a folder on disk that bundles a single SKILL.md file (the instructions), optional helper scripts, and optional reference material. Claude reads the description field of every available skill at startup. When your prompt matches one semantically, Claude invokes the skill, which dumps the full SKILL.md into context and lets Claude follow its rules.
That's it. No plugins. No runtime. No compilation. A skill is a Markdown file with a useful description.
Skills live in ~/.claude/skills/<skill-name>/ (user-level) or <project>/.claude/skills/<skill-name>/ (project-level). Each directory must contain a SKILL.md; anything else (shell scripts, reference docs, templates) is optional and called by the skill's instructions.
2. Skills vs subagents vs MCP #
Here's the opinion most of the internet will fight me on: most people reach for MCP servers when they should be writing a skill. The three mechanisms exist for different problems. Pick by what you're actually doing, not what's newest.
Skill
- You have expertise about how to do a thing
- Claude already has the tools; you just need to teach it the workflow
- Cheap to write, cheap to iterate
- Examples: "how to review a PR", "how to write a cold email", "how to debug a caching issue"
Subagent
- You need parallel or isolated work
- Fresh context window so the main conversation stays clean
- Research that would otherwise blow up the transcript
- Examples: "explore the codebase and report", "run three competing approaches in parallel"
MCP Server
- You need Claude to reach a system it can't already access
- New tool endpoints: a database, a SaaS API, an internal service
- Real code, real deployment, real auth
- Examples: query Postgres, call a Stripe endpoint, read your Notion workspace
SKILL.md, set the description to trigger on "review this PR", done in 5 minutes. An MCP server for the same thing is 5 days of work you didn't need to do.
A concrete rule
Ask these in order. The first yes picks the mechanism:
- Does Claude need to call a tool that doesn't exist yet (new API, new data source)? → MCP server.
- Does the work need a fresh context window or parallel execution? → Subagent.
- Are you teaching Claude how to do a thing with tools it already has? → Skill.
If you're answering "sort of" to all three, you probably want a skill that invokes a subagent when it needs parallel work. Compose, don't conflate.
3. Anatomy of a SKILL.md #
---
name: deploy-service
description: Deploy a Node or Python service to a VPS via rsync
and systemd. Use when the user asks to "deploy", "ship to prod",
"push to server", or mentions a specific service by name.
Do NOT use for frontend static deploys, CI-managed deploys, or
first-time server setup.
---
# Deploy Service
Quick guide for deploying application code to the project's VPS.
## Pre-flight
1. Run tests: `npm test` or `pytest`.
2. Verify no uncommitted changes: `git status --porcelain` returns empty.
3. Confirm target: ask the user if unclear which service.
## Sync
```bash
rsync -avz \
--exclude venv/ --exclude __pycache__/ --exclude .git/ \
./ user@vps:/opt/<service>/
```
## Restart
```bash
ssh user@vps "sudo systemctl restart <service>"
```
## Verify
Hit the health endpoint. Report the HTTP code and the first
50 lines of recent logs if it's not 200.
## Rules
- Always sync before restart, never the other way.
- Never skip pre-flight even if the user says "just deploy".
- If the service doesn't come back healthy in 30s, roll back.
That's a complete, working skill. The structure has three required pieces:
| Part | Purpose |
|---|---|
| Frontmatter | name and description. The description is what Claude matches against user intent. |
| Instructions | The actual workflow. Steps, commands, rules. |
| Guardrails | A "Rules" section that encodes what not to do. Often the most valuable part. |
4. The description field is the whole game #
Claude decides whether to invoke a skill based on one string: the description. Writing good descriptions is the single highest-leverage move you can make.
Good descriptions
- Lead with the primary use case in plain English.
- Include trigger vocabulary — words or phrases a user would actually type.
- State exclusions with "Do NOT use for...". Exclusions prevent wrong-invocation more than inclusions cause right-invocation.
- Mention distinguishing features when similar skills exist.
# Good: specific, with exclusions
description: Deploy a Node or Python service to a VPS via rsync
and systemd. Use when the user asks to "deploy", "ship to prod",
"push to server", or mentions a specific service by name.
Do NOT use for frontend static deploys, CI-managed deploys, or
first-time server setup.
Bad descriptions
# Bad: vague
description: Helps with deployment.
# Bad: name-drops without explaining when
description: The deploy skill.
# Bad: no trigger vocabulary
description: Handles shipping code to production environments
using modern DevOps practices.
None of these give Claude enough to decide. The first is too broad (triggers on every "deploy" mention, including CI). The second is recursive. The third sounds like a resume line.
The exclusion pattern
Skills that overlap in topic need explicit exclusions or Claude picks the wrong one. When you have three skills in the same area:
# skill: deploy-service
description: Deploy application code to the project VPS via rsync
and systemd. Do NOT use for logs/health checks (see
vps-ops), first-time server setup (see vps-bootstrap), or
frontend static deploys (see frontend-deploy).
# skill: vps-ops
description: VPS operations — logs, health checks, cleanup,
rollback of a deployed service. Do NOT use for deployments
themselves (see deploy-service).
# skill: frontend-deploy
description: Deploy static HTML/JS bundles to CDN. Do NOT use for
Node/Python services (see deploy-service).
The exclusions do 80% of the disambiguation work. Add them from day one, not after you notice Claude invoking the wrong skill.
5. Scoping: one skill or three #
New skill-writers usually err in one of two directions:
- Too big: one skill tries to do everything about a domain. Deploy, rollback, logs, health, bootstrap, migrations — all in one file. Claude loads 400 lines for every deploy question, most of it irrelevant.
- Too small: twelve skills for twelve commands. Claude has to remember twelve descriptions to pick the right one; the matching logic frays.
The right unit is a workflow a human would think of as one task. "Deploy a service" is one workflow. "Run ops on a deployed service" is another. "Migrate a database schema" is a third. Each gets its own skill. Within each, full detail.
A scoping heuristic
Write the skill's description first. If you find yourself writing "handles X, Y, and Z", split it. If you find yourself writing "handles only X in the case of Y when Z", merge it with a sibling.
6. Helper files & scripts #
A skill directory can hold anything. When your SKILL.md points to a helper file, Claude reads it when needed.
~/.claude/skills/deploy-service/
├── SKILL.md # instructions (required)
├── preflight.sh # reusable check script
├── rollback.sh # recovery script
├── templates/
│ └── systemd.service # template unit file
└── reference/
└── NGINX_PATTERNS.md # extra context the skill can pull in
Why this matters: skill context is a scarce resource. Keeping SKILL.md terse and pushing verbose details into reference files means the skill loads cheaply but can pull in depth when needed. Write the skill to say "for the full nginx pattern list, read reference/NGINX_PATTERNS.md only when the user asks about routing".
When to include a helper script
- The command has multiple arguments that are error-prone to type.
- The same script is used across skills.
- You want to version-control the exact command across machines.
When not to: if the command is one line, just put it in SKILL.md. Extracting it to a script you need to source or bash buys you nothing.
7. Worked example #
Turning a recurring problem into a skill, start to finish.
The problem
Every time you investigate a production incident, you copy-paste the same four commands: check service status, tail logs, hit the health endpoint, look at recent commits. You explain the process to Claude every time. That's a skill.
Step 1 — name it
Good skill names are lowercase, hyphenated, and verb-led or noun-led consistently:
incident-triage
Step 2 — write the description
Make it specific, with exclusions:
description: Triage a suspected production incident — service
status, recent logs, health endpoint, recent commits. Use when
the user says "something is broken", "service is down", "users
are reporting errors", or asks to "investigate" an outage.
Do NOT use for causing a deployment, for writing post-mortems
(see postmortem-writer), or for long-running root-cause analysis
(use debugging skill instead).
Step 3 — write the workflow
# Incident Triage
Fast triage of a suspected production incident. Goal: is it still
happening, and what's the blast radius. Not: root cause.
## Step 1 — Confirm the symptom
Ask: "What are you seeing? Which service? Since when?"
Do not start commands until you have a target service.
## Step 2 — Service status
```bash
ssh vps "sudo systemctl status <service>"
```
Report the unit state (active / failed / restarting) and last 5
log lines from the status block.
## Step 3 — Recent logs
```bash
ssh vps "sudo journalctl -u <service> -n 100 --no-pager"
```
Scan for ERROR, CRITICAL, or panic traces. Summarize in 3 lines.
## Step 4 — Health endpoint
```bash
curl -s -o /dev/null -w "%{http_code}\n" https://<host>/health
```
## Step 5 — Recent commits
```bash
git log --oneline -20 origin/main
```
Flag any merge in the last 24h that touched the service.
## Step 6 — Report
Return a 4-line summary: what's broken, since when (from logs),
what changed recently (from git log), recommended next action.
## Rules
- Triage only. If the user wants to deploy a fix, hand off to
deploy-service.
- Never restart a service without explicit user approval.
- If any step produces >20 lines of output, compress before reporting.
Step 4 — test it
Open a fresh conversation and type "something's wrong with the api, users are reporting errors". Claude should invoke incident-triage and start at Step 1. If it doesn't, the description is underspecified.
Step 5 — iterate
The first version of every skill is wrong in small ways. Watch it run for a week. Add exclusions when it over-triggers. Add steps when it misses obvious checks. Prune steps that always produce noise. A skill settles after 3-5 iterations; after that, it's invisible infrastructure.
8. When skills fail #
Overlapping triggers
Two skills with similar descriptions both match the user's intent. Claude picks one deterministically but not always the one you want.
Fix: add mutual exclusions. Each skill's description should name the other by id and say "do NOT use for that case".
The cold-start bloat problem
Claude reads every skill's description at startup. Sixty skills is fine. Three hundred skills with sloppy descriptions is not — the "which skill to invoke" decision gets noisy. Keep descriptions tight and resist skill sprawl.
The low hit-rate trap
You write a skill, then notice a week later Claude never invokes it even when it should. Cause is almost always one of:
- Description doesn't contain vocabulary the user actually uses.
- A broader skill is absorbing the trigger.
- The skill name is too generic (
helper,assist) and Claude routes past it.
Fix: sit with a fresh user message that should invoke your skill. Read your description as if you'd never seen the skill before. Does it obviously match? If not, rewrite.
The silent drift problem
Code and infrastructure change; skills don't, unless you edit them. A skill that hard-codes a command that has since been renamed will fail silently — Claude runs the command, gets an error, improvises around it, and the skill becomes lore nobody updates.
Fix: when a skill produces an error twice, treat it as a bug report on the skill, not a one-off. Edit SKILL.md.
9. The skill-creator loop #
The best way to write skills is to let Claude help you write them. The pattern:
- Notice that you just explained the same workflow to Claude for the third time.
- Ask Claude: "Write this as a skill. Frontmatter with name and description. The description should include triggers I'd naturally use and exclusions for adjacent skills."
- Claude drafts the skill.
- You read it carefully — especially the description — and edit until you'd bet money on it triggering correctly.
- Drop it into
~/.claude/skills/<name>/SKILL.md. - Test on a fresh conversation.
A meta-skill called skill-creator — one skill whose whole job is to write other skills — makes this loop even shorter. Its description triggers on phrases like "make this a skill" or "write a skill for this". Its body knows the frontmatter format, the exclusion pattern, and the naming conventions. Hand it a workflow, get back a working skill file.
10. Examples gallery #
Five representative skill patterns, each a sketch of what ends up in SKILL.md.
Infrastructure: vps-deploy
Deploy application code to a VPS via rsync and systemd. Covers pre-flight checks, sync, restart, and health verification. Rollback is a separate skill (
vps-rollback).
Ops: cache-debug
Debug a multi-layer cache returning stale data. Pattern: identify the layer, verify TTL, check invalidation logic, confirm source-of-truth. Includes a checklist for Redis + in-memory + CDN caches.
Meta: skill-creator
Write a new Claude Code skill. Triggers on "make this a skill", "write a skill for this". Produces a directory with
SKILL.mdincluding proper frontmatter, description with triggers and exclusions, and a body with steps plus a Rules section.
Domain workflow: code-review
Review a pull request against security, performance, and style checklists. Does not write tests (
test-writer), does not perform deep security audits (security-review).
Research: web-research
Answer a factual question using web search with source verification. Triggers on "research", "look up", "find out about". Rules: cite sources, flag single-source claims, timeout after 5 queries.
Notice what's common across all five: narrow scope, explicit exclusions, a verb-led name, and a description that includes the vocabulary a user would actually type.
11. Rules & pitfalls #
- Start with the description. Write it before the body. If you can't write a good description, the skill's scope is wrong.
- Always include exclusions. Even before they seem needed.
- One workflow per skill. Resist merging adjacent workflows into one big skill.
- Keep
SKILL.mdterse. Push verbose reference material into companion files the skill reads on demand. - Add a Rules section. What not to do matters as much as what to do.
- Version-control skills. Especially the ones Claude invokes on production-adjacent systems. Skills are code.
- Test on a fresh conversation after every material edit. Carry-over context can mask triggering failures.
- Don't build an MCP server for something a skill could do with the tools Claude already has.
- When a skill errors twice, treat it as a bug in the skill, not a bad run.
- Delete skills you stop using. Dead skills dilute the triggering decision for live ones.