SKILL.md open standard · progressive disclosure · Cursor / Claude Code · remote Mac validation
Who is this for? You already use Agent mode in Cursor, Claude Code, or Gemini CLI, yet you keep pasting the same instructions for deploy, PR creation, and security audits—burning context and inflating token bills. Bottom line: package those flows as an Agent Skill (a folder with YAML-frontmatter SKILL.md) so the Agent loads them on demand and reuses them across sessions; when scripts call xcodebuild or notarytool, validate output on a rented VNC remote Mac. In this article: Skill vs Rule → file spec → three-level loading → create and migrate → best practices → 2026 ecosystem → practical cases → FAQ.
The trajectory of AI coding agents is straightforward: chat assistant → task runner → agent with domain-specific playbooks. Raw prompts fail in three predictable ways. You re-describe complex procedures every session. Unrelated rules sit in context even when you are editing a single config file. Team knowledge lives in Slack threads instead of version-controlled files.
Agent Skills turn “how we do X here” into a reusable module: load on demand, spend fewer tokens, share across repos and editors. After Anthropic pushed the open standard at agentskills.io in late 2025, Cursor, Claude Code, OpenAI Codex, and Gemini CLI converged on the same SKILL.md directory layout. One Git folder can travel with your team regardless of which editor someone prefers today.
One-line definition: a Skill is an operator manual written for the Agent, not a human README. It tells the model when to act and in what order—so you stop re-teaching the same workflow from scratch.
Repeated labor: hand-typing deploy or PR prompts costs roughly 2k–8k tokens per session in typical traces.
Context pollution: stuffing 500 lines of conventions into Rules means every file edit carries weight that has nothing to do with the task.
No version control: verbal runbooks cannot go through PR review; onboarding becomes “copy this chat log.”
Platform silos: the same release checklist lived separately in Cursor and Claude Code until the shared Skill format let teams commit one source of truth.
| Dimension | Rule | Skill |
|---|---|---|
| Load timing | Always injected | On demand / when relevant |
| Best for | Standing conventions (naming, style) | Multi-step workflows (deploy, audit) |
| Context cost | Fixed overhead every turn | Dynamic; only active Skills expand context |
| Trigger | Automatic system prompt | Agent routing + /skill-name |
| Analogy | New-hire handbook | Specialist runbook for one job |
What else can Skills do? Define custom / commands (for example /deploy), bundle multi-step flows, inject domain knowledge, ship Bash/Python/Node scripts under scripts/, and coordinate with Hooks and MCP servers. Marketplace plugins inside products like OpenClaw solve a similar problem inside one runtime; this article focuses on the cross-editor SKILL.md file format you can commit to Git.
The canonical layout (folder name equals skill name):
.cursor/skills/
└── deploy-app/
├── SKILL.md # required
├── scripts/ # optional: executable helpers
│ ├── deploy.sh
│ └── validate.py
├── references/ # optional: docs loaded on demand
└── assets/ # optional: templates and static filesCompatible paths include .agents/skills/ (Claude Code, Codex, Gemini CLI), ~/.cursor/skills/ (user-wide), and ~/.agents/skills/ (cross-tool global). Monorepos may nest skills under package roots, for example apps/ios/.cursor/skills/notarize/SKILL.md, so routing stays scoped.
--- name: deploy-app description: >- Use when the user needs to deploy the app to staging or production. Triggers: deploy, release, ship, environment switch. paths: - "apps/web/**" disable-model-invocation: false --- # Deploy application ## Steps 1. Run `scripts/validate.py` to verify required env vars 2. Execute `scripts/deploy.sh <environment>` 3. Confirm the health check endpoint returns HTTP 200
| Field | Required | Notes |
|---|---|---|
| name | Yes | Lowercase with hyphens; must match folder name |
| description | Yes | Routing key: write trigger conditions, not a summary blurb |
| paths | No | Glob patterns limiting where the Skill applies |
| disable-model-invocation | No | When true, only manual /skill-name loads the Skill |
Agent Skills use progressive disclosure—documented on agentskills.io and in Cursor’s own Skill docs—so the model does not ingest every playbook at session start.
Discovery (session boot): the Agent reads only each Skill’s name and description to decide relevance.
Activation (match): the full SKILL.md body loads when the task fits the routing key.
On demand (execution): files under references/ load when steps cite them; scripts/ run locally and only their stdout/stderr returns to context—the script source itself does not consume tokens.
Trigger modes: automatic routing (default), manual /skill-name, or attaching context with @skill-name. For destructive operations, set disable-model-invocation: true so a human explicitly invokes the Skill before the Agent reads instructions.
Quotable facts: community registries listed more than 31,000 public Skills by early 2026; stuffing them all into Rules could inflate a single conversation context by an order of magnitude. Progressive loading lets the Agent know the shelf exists while opening only the book it needs. Teams report 2k–8k tokens saved per deploy-style session once repeat prompts move into Skills with script-backed verification steps.
Fastest path: in Cursor Agent chat, run /create-skill, describe the workflow, and let the Agent scaffold the folder plus SKILL.md.
Manual path: create .cursor/skills/your-skill-name/SKILL.md, fill frontmatter, and write numbered steps the model can follow without guessing.
Scripts: move repeatable shell commands into scripts/ and reference them with relative paths in the markdown body.
Verify discovery: open Cursor Settings → Rules, or start a fresh Agent session, and confirm the Skill appears in the available list.
Migrate legacy rules: on Cursor 2.4+, run /migrate-to-skills to convert dynamic rules and slash commands into Skill folders.
Windows-first developers can author Skills locally, but when scripts/ calls bash, xcodebuild, notarytool, or security find-identity, you need real macOS to validate exit codes and logs. Recommended flow: push the repo to a rented remote Mac → connect over VNC → trigger the Skill in Cursor or Terminal → confirm script output. Keychain unlock prompts, notarization staples, and TCC permission dialogs require a graphical VNC session—SSH alone cannot click “Always Allow.”
1. Treat description as a routing key, not marketing copy. Weak: “This skill contains deploy instructions.” Strong: “Use when the user mentions deploy, release, staging, or production environment switches.”
2. Progressive disclosure in authoring: keep SKILL.md under roughly 500 lines; push deep reference material to references/; push deterministic commands to scripts/.
3. Single responsibility: one Skill per domain. Split “deploy + security audit + write tests” into three Skills the Agent can compose.
4. Explain why, not only what: write “Run validate.py before deploy to catch missing env vars that would crash the service at boot,” not merely “Run validate.py.”
5. Gather → Act → Verify: structure steps so the Agent collects prerequisites, executes, then checks outcomes—enabling sane recovery when something fails mid-flow.
6. Consistent terminology and explicit failure policy: use forward slashes in paths; state whether a script failure should retry, roll back, or halt the workflow.
| Category | Example Skills | Typical use |
|---|---|---|
| Developer productivity | Prompt Lookup, Skill Installer, Ralph coding loop | Better prompts, install other Skills, TDD-style autonomous iteration |
| Frontend / Web | React Best Practices, Web Design Audit, Next.js cache optimizer | Performance and accessibility reviews (several maintained by Vercel) |
| Workflow | PR Skill, TDD Skill, Skill Writer | Commit → push → open PR, test-driven flows, meta-Skill authoring |
| Creative | Remotion video editor | Describe video edits in React and render clips |
Ecosystem notes for 2026: agentskills.io documents the portable format across vendors. Cursor Marketplace bundles Rules, Skills, and MCP configs for one-click install. In monorepos, nested paths like .cursor/skills/shipping/deploy-staging/SKILL.md isolate scope without polluting unrelated packages.
Skills vs MCP: MCP connects tools (databases, browsers, GitHub APIs). Skills orchestrate when and in what order to use them. A PR Skill might run git status, then call a GitHub MCP tool to open the pull request—complementary layers, not substitutes.
Second stat pack for slides: Cursor stable Skill support from 2.4+; public registry growth past 31k Skills; typical token savings 2k–8k per repeated workflow once prompts move out of chat history and into versioned files.
Scenario: every release repeats commit → push → create PR → fill the template. Skill instructions: stage changes, push a feature branch, run gh pr create, populate the description from your team template. One /pr invocation replaces a five-minute paste ritual.
Scenario: iterate until tests pass. Skill logic: run the test suite; on failure, let the Agent patch code and rerun; loop until green. Pair with Hooks to wake the Agent when CI fails overnight—the Skill supplies the playbook, Hooks supply the event.
Scenario: a Windows developer maintains a Skill whose scripts call xcodebuild archive and notarytool submit. Local Cursor can edit Markdown; real acceptance happens on a VNCMac remote Mac: VNC in, trigger the Skill, let the Agent execute scripts, click Keychain “Always Allow” when prompted. This is the same “SSH is not enough” pattern as notarization staple checks on physical Apple hardware.
| Acceptance item | Local Windows | Remote Mac + VNC |
|---|---|---|
| Edit SKILL.md | Yes | Yes |
| Run bash deploy scripts | Partial (WSL gaps) | Full macOS environment |
| xcodebuild / notarization | No | Required |
| Keychain / TCC dialogs | No | VNC click-through |
Skills are structured guidance, not hard-coded automation. The model still chooses actions; clearer descriptions and explicit verify steps improve consistency. Review community Skills—especially scripts/—before you run them on production repos.
Test with real tasks that match the trigger phrases in description. If auto-routing misses, invoke manually with /skill-name. Still failing? Add synonyms to the first paragraph of the description—that field is the routing index.
Put generic workflows (commit hygiene, test loops) in ~/.cursor/skills/. Put repo-specific paths, internal APIs, and deploy targets in .cursor/skills/ at the repository root and commit to Git so PRs can review Skill changes like code.
Similar idea (reusable capability modules), different distribution. OpenClaw Skills install through ClawHub and run inside its Gateway plugin model. Cursor Agent Skills follow SKILL.md plus agentskills.io and live beside your repo. Many teams use both: OpenClaw for always-on messaging agents, Cursor Skills for daily coding workflows.
Agent Skills turn tribal knowledge—the steps only senior engineers remember—into versioned, reviewable SKILL.md playbooks. Write trigger conditions in description, structure the body as Gather-Act-Verify, and push deterministic work into scripts/ so progressive loading actually saves tokens. Done well, the Agent stops feeling like a new hire every Monday.
The ceiling is environmental: Skills that compile Apple platforms, notarize binaries, or touch the Keychain cannot be fully validated on Windows. SSH into a remote Mac still leaves system dialogs unreachable. Buying a Mac mini makes sense when Skills run 24/7 in production; while you are still building the library or only ship iOS/macOS releases a few times a month, renting a VNC remote Mac splits authoring on your daily machine from acceptance on real Apple hardware—often cheaper than hardware you idle between releases.
A Skill only compounds value once the environment runs it truthfully. Open remote Mac plans below when you need a VNC node to validate macOS-specific Skill scripts before you merge.