Claude Code Skills: An Anatomy

What a skill actually is, when to reach for one versus an MCP server or a subagent, and how to write skills that Claude will actually invoke. Lived-in lessons from shipping 80+ of them.

Contents
  1. What a skill actually is
  2. Skills vs subagents vs MCP
  3. Anatomy of a SKILL.md
  4. The description field is the whole game
  5. Scoping: one skill or three
  6. Helper files & scripts
  7. Worked example
  8. When skills fail
  9. The skill-creator loop
  10. Examples gallery
  11. Rules & pitfalls

1. What a skill actually is

A Claude Code skill is a packaged capability: a folder on disk that bundles a single SKILL.md file (the instructions), optional helper scripts, and optional reference material. Claude reads the description field of every available skill at startup. When your prompt matches one semantically, Claude invokes the skill, which loads the full SKILL.md into context so Claude can follow its instructions.

That's it. No plugins. No runtime. No compilation. A skill is a Markdown file with a useful description.

The whole mental model: a skill is "a chunk of expertise I would otherwise paste into every conversation, now stored once on disk with an auto-invocation trigger." Everything else about skills follows from this.

Skills live in ~/.claude/skills/<skill-name>/ (user-level) or <project>/.claude/skills/<skill-name>/ (project-level). Each directory must contain a SKILL.md; anything else (shell scripts, reference docs, templates) is optional and is referenced from the skill's instructions.
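
To see which skills are installed at each level, a small shell function can walk those directories. This is a minimal sketch that assumes only the layout described above; the function name `list_skills` is made up for illustration.

```bash
# list_skills ROOT...
# Print the name of every immediate subdirectory of each ROOT that
# contains a SKILL.md. Roots that do not exist are skipped.
list_skills() {
  local root dir
  for root in "$@"; do
    [ -d "$root" ] || continue
    for dir in "$root"/*/; do
      # Only directories holding a SKILL.md count as skills.
      [ -f "${dir}SKILL.md" ] && basename "$dir"
    done
  done
  return 0
}
```

For example, `list_skills ~/.claude/skills ./.claude/skills` would list user-level skills first, then project-level ones.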

2. Skills vs subagents vs MCP

Here's the opinion most of the internet will fight me on: most people reach for MCP servers when they should be writing a skill. The three mechanisms exist for different problems. Pick by what you're actually doing, not what's newest.

Skill

  • You have expertise about how to do a thing
  • Claude already has the tools; you just need to teach it the workflow
  • Cheap to write, cheap to iterate
  • Examples: "how to review a PR", "how to write a cold email", "how to debug a caching issue"

Subagent

  • You need parallel or isolated work
  • Fresh context window so the main conversation stays clean
  • Research that would otherwise blow up the transcript
  • Examples: "explore the codebase and report", "run three competing approaches in parallel"

MCP Server

  • You need Claude to reach a system it can't already access
  • New tool endpoints: a database, a SaaS API, an internal service
  • Real code, real deployment, real auth
  • Examples: query Postgres, call a Stripe endpoint, read your Notion workspace

The most common mistake: building an MCP server for something that should be a skill. "I want Claude to follow my team's code review process" is not an MCP problem, because Claude can already read files and write comments. It's a skill problem: write down the review process in SKILL.md, set the description to trigger on "review this PR", and you are done in 5 minutes. An MCP server for the same thing is 5 days of work you didn't need to do.

A concrete rule

Ask these in order. The first yes picks the mechanism:

  1. Does Claude need to call a tool that doesn't exist yet (new API, new data source)? → MCP server.
  2. Does the work need a fresh context window or parallel execution? → Subagent.
  3. Are you teaching Claude how to do a thing with tools it already has? → Skill.

If you're answering "sort of" to all three, you probably want a skill that invokes a subagent when it needs parallel work. Compose, don't conflate.
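
The "first yes wins" ordering of the three questions can be written down as a tiny function. This is only an illustration of the ordering, not a tool that ships with Claude Code; `pick_mechanism` and its yes/no arguments are hypothetical.

```bash
# pick_mechanism NEEDS_NEW_TOOL NEEDS_ISOLATION TEACHES_WORKFLOW
# Each argument is "yes" or "no". Questions are checked in the order
# given in the text, and the first "yes" decides the mechanism.
pick_mechanism() {
  local needs_new_tool=$1 needs_isolation=$2 teaches_workflow=$3
  if [ "$needs_new_tool" = yes ]; then
    echo "mcp-server"
  elif [ "$needs_isolation" = yes ]; then
    echo "subagent"
  elif [ "$teaches_workflow" = yes ]; then
    echo "skill"
  else
    echo "none"
  fi
}
```

Note that `pick_mechanism yes yes yes` still answers `mcp-server`: the order of the questions matters more than how many of them are true.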

3. Anatomy of a SKILL.md

---
name: deploy-service
description: Deploy a Node or Python service to a VPS via rsync
  and systemd. Use when the user asks to "deploy", "ship to prod",
  "push to server", or mentions a specific service by name.
  Do NOT use for frontend static deploys, CI-managed deploys, or
  first-time server setup.
---

# Deploy Service

Quick guide for deploying application code to the project's VPS.

## Pre-flight

1. Run tests: `npm test` or `pytest`.
2. Verify no uncommitted changes: `git status --porcelain` returns empty.
3. Confirm target: ask the user if unclear which service.

## Sync

```bash
rsync -avz \
  --exclude venv/ --exclude __pycache__/ --exclude .git/ \
  ./ user@vps:/opt/<service>/
```

## Restart

```bash
ssh user@vps "sudo systemctl restart <service>"
```

## Verify

Hit the health endpoint. Report the HTTP code and the first
50 lines of recent logs if it's not 200.

## Rules

- Always sync before restart, never the other way.
- Never skip pre-flight even if the user says "just deploy".
- If the service doesn't come back healthy in 30s, roll back.

That's a complete, working skill. The structure has three required pieces:

| Part | Purpose |
| --- | --- |
| Frontmatter | name and description. The description is what Claude matches against user intent. |
| Instructions | The actual workflow. Steps, commands, rules. |
| Guardrails | A "Rules" section that encodes what not to do. Often the most valuable part. |
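
The three pieces above are easy to check mechanically. The sketch below assumes the frontmatter is delimited by lines containing exactly `---` and that the guardrails live under a `## Rules` heading, as in the example skill; `check_skill_structure` is a hypothetical helper, not part of Claude Code.

```bash
# check_skill_structure FILE
# Returns 0 if FILE opens with a frontmatter block containing name: and
# description:, and has a "## Rules" heading. Otherwise prints each
# missing piece and returns 1.
check_skill_structure() {
  local file=$1 frontmatter status=0
  frontmatter=$(awk '
    NR == 1 && $0 != "---" { exit }   # no frontmatter at all
    NR > 1 && $0 == "---"  { exit }   # closing delimiter reached
    NR > 1                 { print }
  ' "$file")
  grep -q '^name:' <<<"$frontmatter" || { echo "missing: name"; status=1; }
  grep -q '^description:' <<<"$frontmatter" || { echo "missing: description"; status=1; }
  grep -q '^## Rules' "$file" || { echo "missing: Rules section"; status=1; }
  return "$status"
}
```

It does not parse YAML, so it only catches missing pieces, not malformed ones.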

4. The description field is the whole game

Claude decides whether to invoke a skill based on one string: the description. Writing good descriptions is the single highest-leverage move you can make.

Good descriptions

# Good: specific, with exclusions
description: Deploy a Node or Python service to a VPS via rsync
  and systemd. Use when the user asks to "deploy", "ship to prod",
  "push to server", or mentions a specific service by name.
  Do NOT use for frontend static deploys, CI-managed deploys, or
  first-time server setup.

Bad descriptions

# Bad: vague
description: Helps with deployment.

# Bad: name-drops without explaining when
description: The deploy skill.

# Bad: no trigger vocabulary
description: Handles shipping code to production environments
  using modern DevOps practices.

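None of these gives Claude enough to decide. The first is too broad (it triggers on every "deploy" mention, including CI). The second is circular: it names the skill without saying when to use it. The third reads like a résumé line and contains none of the words a user would actually type.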

Rule of thumb: if you cannot tell, reading only the description, which user messages should invoke this skill and which should not, the description isn't done yet.
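
Part of that rule of thumb can be approximated with a lint. The sketch below encodes two heuristics taken from the good example: the description quotes at least one trigger phrase, and it contains a "Do NOT use" clause. These are heuristics rather than rules Claude enforces, and `lint_description` is a hypothetical name.

```bash
# lint_description TEXT
# Print one warning per heuristic the description fails.
# Prints nothing if both heuristics pass.
lint_description() {
  local text
  # Join wrapped lines so a clause split across lines still matches.
  text=$(tr -s '\n ' '  ' <<<"$1")
  grep -qE '"[^"]+"' <<<"$text" \
    || echo "warn: no quoted trigger phrase"
  grep -qi 'do not use' <<<"$text" \
    || echo "warn: no exclusion clause"
  return 0
}
```

Run against "Helps with deployment." it prints both warnings; run against the deploy-service description it prints nothing.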

The exclusion pattern

Skills that overlap in topic need explicit exclusions or Claude picks the wrong one. When you have three skills in the same area:

# skill: deploy-service
description: Deploy application code to the project VPS via rsync
  and systemd. Do NOT use for logs/health checks (see
  vps-ops), first-time server setup (see vps-bootstrap), or
  frontend static deploys (see frontend-deploy).

# skill: vps-ops
description: VPS operations — logs, health checks, cleanup,
  rollback of a deployed service. Do NOT use for deployments
  themselves (see deploy-service).

# skill: frontend-deploy
description: Deploy static HTML/JS bundles to CDN. Do NOT use for
  Node/Python services (see deploy-service).

The exclusions do 80% of the disambiguation work. Add them from day one, not after you notice Claude invoking the wrong skill.
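
Exclusions that name a sibling skill only help if that sibling exists. The following sketch scans each skill's frontmatter for "(see <name>)" references and reports the ones that point at no skill directory. It assumes the "(see ...)" phrasing used in the examples above; `dangling_references` is a hypothetical helper.

```bash
# dangling_references SKILLS_ROOT
# For every skill under SKILLS_ROOT, print "<skill> -> <name>" for each
# "(see <name>)" in its frontmatter that matches no skill directory.
dangling_references() {
  local root=$1 file skill refs ref
  for file in "$root"/*/SKILL.md; do
    [ -f "$file" ] || continue
    skill=$(basename "$(dirname "$file")")
    # Extract the frontmatter, join it onto one line so a reference
    # wrapped across two lines still matches, then pull out the names.
    refs=$(awk 'NR == 1 && $0 != "---" { exit }
                NR > 1 && $0 == "---"  { exit }
                NR > 1                 { print }' "$file" \
      | tr -s '\n ' '  ' \
      | grep -oE '\(see [a-z0-9-]+\)' \
      | sed -E 's/^\(see (.+)\)$/\1/' || true)
    for ref in $refs; do
      [ -d "$root/$ref" ] || echo "$skill -> $ref"
    done
  done
  return 0
}
```

With only the three skills shown above installed, this would report `deploy-service -> vps-bootstrap`, since that description points at a skill that is not in the set.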

5. Scoping: one skill or three

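New skill-writers usually err in one of two directions: they scope a skill so broadly that it covers several workflows, or so narrowly that one workflow ends up fragmented across several skills.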

The right unit is a workflow a human would think of as one task. "Deploy a service" is one workflow. "Run ops on a deployed service" is another. "Migrate a database schema" is a third. Each gets its own skill. Within each, full detail.

A scoping heuristic

Write the skill's description first. If you find yourself writing "handles X, Y, and Z", split it. If you find yourself writing "handles only X in the case of Y when Z", merge it with a sibling.

6. Helper files & scripts

A skill directory can hold anything. When your SKILL.md points to a helper file, Claude reads it when needed.

~/.claude/skills/deploy-service/
├── SKILL.md              # instructions (required)
├── preflight.sh          # reusable check script
├── rollback.sh           # recovery script
├── templates/
│   └── systemd.service   # template unit file
└── reference/
    └── NGINX_PATTERNS.md # extra context the skill can pull in

Why this matters: skill context is a scarce resource. Keeping SKILL.md terse and pushing verbose details into reference files means the skill loads cheaply but can pull in depth when needed. Write the skill to say "for the full nginx pattern list, read reference/NGINX_PATTERNS.md only when the user asks about routing".

When to include a helper script

When not to: if the command is one line, just put it in SKILL.md. Extracting it into a script that then has to be sourced or run with bash buys you nothing.
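
By contrast, the "healthy within 30s" rule from the deploy-service skill stops being a one-liner once it includes polling and a timeout, which arguably makes it a candidate for a helper file. This is a sketch that assumes curl is available on the machine running the check; `wait_healthy` is a hypothetical function name.

```bash
# wait_healthy URL [TIMEOUT_SECONDS]
# Poll URL about once per second until it answers HTTP 200 (return 0)
# or TIMEOUT_SECONDS have passed (return 1). Default timeout: 30.
wait_healthy() {
  local url=$1 timeout=${2:-30} code
  # SECONDS is bash's built-in counter of seconds since shell start.
  local deadline=$((SECONDS + timeout))
  while [ "$SECONDS" -lt "$deadline" ]; do
    # A failed connection makes curl exit non-zero; treat that as
    # "not healthy yet" rather than as a fatal error.
    code=$(curl -s -o /dev/null -w '%{http_code}' --max-time 2 "$url" || true)
    [ "$code" = 200 ] && return 0
    sleep 1
  done
  return 1
}
```

SKILL.md could then say something like "source the helper and run `wait_healthy https://<host>/health || ./rollback.sh`", keeping the instructions short while the loop lives in a file.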

7. Worked example

Turning a recurring problem into a skill, start to finish.

The problem

Every time you investigate a production incident, you copy-paste the same four commands: check service status, tail logs, hit the health endpoint, look at recent commits. You explain the process to Claude every time. That's a skill.

Step 1 — name it

Good skill names are lowercase and hyphenated, and consistently either verb-led or noun-led across your whole skill set:

incident-triage

Step 2 — write the description

Make it specific, with exclusions:

description: Triage a suspected production incident — service
  status, recent logs, health endpoint, recent commits. Use when
  the user says "something is broken", "service is down", "users
  are reporting errors", or asks to "investigate" an outage.
  Do NOT use for running a deployment, for writing post-mortems
  (see postmortem-writer), or for long-running root-cause analysis
  (use debugging skill instead).

Step 3 — write the workflow

# Incident Triage

Fast triage of a suspected production incident. Goal: is it still
happening, and what's the blast radius. Not: root cause.

## Step 1 — Confirm the symptom

Ask: "What are you seeing? Which service? Since when?"
Do not start commands until you have a target service.

## Step 2 — Service status

```bash
ssh vps "sudo systemctl status <service>"
```
Report the unit state (active / failed / restarting) and last 5
log lines from the status block.

## Step 3 — Recent logs

```bash
ssh vps "sudo journalctl -u <service> -n 100 --no-pager"
```
Scan for ERROR, CRITICAL, or panic traces. Summarize in 3 lines.

## Step 4 — Health endpoint

```bash
curl -s -o /dev/null -w "%{http_code}\n" https://<host>/health
```

## Step 5 — Recent commits

```bash
git log -20 --format='%h %cr %s' origin/main
```
Flag any merge in the last 24h that touched the service.

## Step 6 — Report

Return a 4-line summary: what's broken, since when (from logs),
what changed recently (from git log), recommended next action.

## Rules

- Triage only. If the user wants to deploy a fix, hand off to
  deploy-service.
- Never restart a service without explicit user approval.
- If any step produces >20 lines of output, compress before reporting.

Step 4 — test it

Open a fresh conversation and type "something's wrong with the api, users are reporting errors". Claude should invoke incident-triage and start at Step 1. If it doesn't, the description is underspecified.

Step 5 — iterate

The first version of every skill is wrong in small ways. Watch it run for a week. Add exclusions when it over-triggers. Add steps when it misses obvious checks. Prune steps that always produce noise. A skill settles after 3-5 iterations; after that, it's invisible infrastructure.

8. When skills fail

Overlapping triggers

Two skills with similar descriptions both match the user's intent. Claude picks one, but not always the one you want.

Fix: add mutual exclusions. Each skill's description should refer to the other by name and say "do NOT use for that case".

The cold-start bloat problem

Claude reads every skill's description at startup. Sixty skills is fine. Three hundred skills with sloppy descriptions is not — the "which skill to invoke" decision gets noisy. Keep descriptions tight and resist skill sprawl.
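
A rough way to keep an eye on sprawl is to count skills and the words in their frontmatter. The sketch below uses frontmatter word count as a stand-in for what gets read at startup, which is an assumption on my part; `description_budget` is a hypothetical helper.

```bash
# description_budget SKILLS_ROOT
# Print "<N> skills, <W> frontmatter words" for the skills under
# SKILLS_ROOT. Words are counted across the whole frontmatter block,
# including the "name:" and "description:" keys.
description_budget() {
  local root=$1 file n count=0 words=0
  for file in "$root"/*/SKILL.md; do
    [ -f "$file" ] || continue
    n=$(awk 'NR == 1 && $0 != "---" { exit }
             NR > 1 && $0 == "---"  { exit }
             NR > 1                 { print }' "$file" | wc -w)
    count=$((count + 1))
    words=$((words + n))
  done
  echo "$count skills, $words frontmatter words"
}
```

Tracking the number over time is more useful than any single reading: a jump usually means a new batch of skills with loose descriptions.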

The low hit-rate trap

You write a skill, then notice a week later that Claude never invokes it even when it should. The cause is almost always the description: it does not contain the words a user would actually type.

Fix: sit with a fresh user message that should invoke your skill. Read your description as if you'd never seen the skill before. Does it obviously match? If not, rewrite.

The silent drift problem

Code and infrastructure change; skills don't, unless you edit them. A skill that hard-codes a command that has since been renamed will fail silently — Claude runs the command, gets an error, improvises around it, and the skill becomes lore nobody updates.

Fix: when a skill produces an error twice, treat it as a bug report on the skill, not a one-off. Edit SKILL.md.
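
File age is a crude but cheap proxy for drift. The sketch below lists SKILL.md files that have not been modified in a given number of days; the 90-day default is an arbitrary assumption, and `stale_skills` is a hypothetical name. An old skill is not necessarily a broken one, so treat the output as a review list.

```bash
# stale_skills SKILLS_ROOT [DAYS]
# Print the path of every SKILL.md under SKILLS_ROOT whose last
# modification is more than DAYS days ago (default 90), sorted.
stale_skills() {
  local root=$1 days=${2:-90}
  find "$root" -name SKILL.md -type f -mtime +"$days" | sort
}
```

For example, `stale_skills ~/.claude/skills 180` would show skills untouched for roughly half a year.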

9. The skill-creator loop

The best way to write skills is to let Claude help you write them. The pattern:

  1. Notice that you just explained the same workflow to Claude for the third time.
  2. Ask Claude: "Write this as a skill. Frontmatter with name and description. The description should include triggers I'd naturally use and exclusions for adjacent skills."
  3. Claude drafts the skill.
  4. You read it carefully — especially the description — and edit until you'd bet money on it triggering correctly.
  5. Drop it into ~/.claude/skills/<name>/SKILL.md.
  6. Test on a fresh conversation.

A meta-skill called skill-creator — one skill whose whole job is to write other skills — makes this loop even shorter. Its description triggers on phrases like "make this a skill" or "write a skill for this". Its body knows the frontmatter format, the exclusion pattern, and the naming conventions. Hand it a workflow, get back a working skill file.
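
The "drop it into place" step can also be scripted for the cases where you start from a blank file. This is a minimal scaffolding sketch under stated assumptions: the skeleton layout (Steps and Rules headings with TODO placeholders) is my own choice, and `new_skill` is a hypothetical helper. It does no YAML escaping, so a description containing a colon followed by a space would need quoting by hand.

```bash
# new_skill ROOT NAME DESCRIPTION
# Create ROOT/NAME/SKILL.md with frontmatter and an empty skeleton.
# Rejects names that are not lowercase-hyphenated and never overwrites
# an existing SKILL.md.
new_skill() {
  local root=$1 name=$2 description=$3 dir
  if ! [[ $name =~ ^[a-z0-9]+(-[a-z0-9]+)*$ ]]; then
    echo "invalid skill name: $name" >&2
    return 1
  fi
  dir=$root/$name
  if [ -e "$dir/SKILL.md" ]; then
    echo "refusing to overwrite $dir/SKILL.md" >&2
    return 1
  fi
  mkdir -p "$dir"
  # One argument per output line; empty strings become blank lines.
  printf '%s\n' \
    '---' "name: $name" "description: $description" '---' '' \
    "# $name" '' '## Steps' '' '1. TODO' '' '## Rules' '' '- TODO' \
    > "$dir/SKILL.md"
}
```

A call such as `new_skill ~/.claude/skills incident-triage "Triage a suspected production incident."` leaves a file you can then fill in or hand to Claude to complete.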

10. Examples gallery

Five representative skill patterns, each a sketch of what ends up in SKILL.md.

Infrastructure: vps-deploy

Deploy application code to a VPS via rsync and systemd. Covers pre-flight checks, sync, restart, and health verification. Rollback is a separate skill (vps-rollback).

Ops: cache-debug

Debug a multi-layer cache returning stale data. Pattern: identify the layer, verify TTL, check invalidation logic, confirm source-of-truth. Includes a checklist for Redis + in-memory + CDN caches.

Meta: skill-creator

Write a new Claude Code skill. Triggers on "make this a skill", "write a skill for this". Produces a directory with SKILL.md including proper frontmatter, description with triggers and exclusions, and a body with steps plus a Rules section.

Domain workflow: code-review

Review a pull request against security, performance, and style checklists. Does not write tests (test-writer), does not perform deep security audits (security-review).

Research: web-research

Answer a factual question using web search with source verification. Triggers on "research", "look up", "find out about". Rules: cite sources, flag single-source claims, stop after 5 queries.

Notice what's common across all five: narrow scope, a short hyphenated name, and a description that includes the vocabulary a user would actually type. Where a neighboring skill exists, the description excludes it by name.

11. Rules & pitfalls

The operator's stance: a good skill is invisible. You ask Claude to do something; it does the right thing; you forget the skill exists. A bad skill is loud: it fires when it shouldn't, or stays silent when it should. The goal isn't skills; it's workflows that run themselves.