Level 05 · 65 → 85
Prompting AI agents
The actual craft of working with coding agents — Claude Code, Cursor, v0, Lovable, Bolt. How to give context, scope tasks, write specs, read diffs, debug iteratively, manage long sessions, and recognize when the agent has gone off the rails. The bad-prompt-vs-good-prompt skill, taught throughout.
The agent landscape — what each one is for
"AI coding agent" is now a category, not a product. The tools differ in where they sit, what they can touch, and how autonomous they are.
| Tool | Surface | Sweet spot |
|---|---|---|
| Claude Code | Terminal / CLI | End-to-end agentic coding, file ops, running shells. The most powerful for real codebases. |
| Cursor | IDE (VS Code fork) | You stay in editor; agent has full project context. Inline edits, agent mode for bigger tasks. |
| Windsurf | IDE (VS Code fork) | Similar surface to Cursor. Both are fine; pick one and learn it. |
| GitHub Copilot | IDE plugin | Inline completions and chat. Lightest touch — least autonomous, often that's right. |
| v0 (Vercel) | Web — generates UI | "I want a settings page that looks like X." Outputs React + Tailwind components you copy. |
| Lovable | Web — generates apps | Whole-app generation from a prompt. Good for rapid prototypes. |
| Bolt | Web — generates apps | Browser-based full-stack generator with live preview. |
| Replit Agent | Web — cloud IDE | Replit's environment + agent. Good for "I have an idea and no laptop." |
The categories that matter:
- Terminal / CLI agents
- Claude Code is the lead. They live in your shell, can run commands, edit files anywhere. Best for working in a real codebase you already have.
- IDE-resident agents
- Cursor, Windsurf, Copilot. The agent has your project's full context — open files, related code, conventions. Best for steady, day-in-day-out development.
- Generators (browser-based)
- v0, Lovable, Bolt, Replit Agent. Best for greenfield projects, prototypes, or specific UI you can copy out. Less ideal once you have a serious codebase to integrate with.
Context is the work
Here's the most important sentence in this level: the quality of an agent's output is almost entirely a function of the quality of the context you give it. Models are smart. They are not psychic.
"Context" means everything the agent can see when it answers — the files in your project, the prompt you wrote, the conversation so far, the README, the schema, the conventions. The agent's whole understanding is what's in its window. Anything outside it might as well not exist.
Three categories of context you control:
- Project context
- The codebase itself. Structure, naming, framework choices, design system. CLI agents read files freely; IDE agents see open tabs and project-wide search. The cleaner your project is organized, the better the agent's reads.
- Conversational context
- What you've said in this session. The agent remembers, but only up to a budget — past a certain length it'll start losing earlier turns.
- Out-of-band context
- Things the agent can't see: the PRD in Notion, the discussion in Slack, the bug report in Linear, the user's screenshot. You have to bring these into the conversation explicitly.
"Build the profile page at `/u/[username]`. It should show: avatar, display name, bio (max 160 chars), join date, and the user's last 10 public tasks. Match the styling of the existing project page (`app/p/[slug]/page.tsx`). Use the `User` model already in `prisma/schema.prisma`; add a `bio` field if it's missing. Don't add new dependencies."
Scoping a task — small enough to land cleanly
The unit of work for an agent is a scoped task — small enough that you can describe its end state in two or three sentences, evaluate it in five minutes, and revert it cleanly if it's wrong.
Tasks that are too big:
- "Build the auth system." (Multi-day. Many sub-decisions. Will go off the rails.)
- "Refactor the codebase to use a state library." (Touches everything. High blast radius if anything's wrong.)
- "Make the app feel snappy." (Subjective. No success criterion.)
The same work, scoped:
- "Add a magic-link send endpoint at `POST /api/auth/send-link`. Use Resend. Rate-limit to 3 per email per hour."
- "Replace the manual `useState` in `app/board/page.tsx` with a `useTasksStore` Zustand store. Don't touch other files."
- "The board page takes 800ms to first paint on slow 3G. Suspect: the `tasks` query loads 10× more data than it shows. Audit the query, suggest a leaner one."
Writing a spec the agent can act on
For tasks past trivial, write a tiny inline spec. The structure that works:
Goal
One sentence. What, not how. "Users can mark a task as complete from the board view."
Constraints
What the agent can't change. "Don't touch the API. Use the existing `updateTask` hook. Match the pattern in `TaskCard.tsx`."
Acceptance
How you'll know it's done. "Clicking the checkbox toggles the status. The toggle is optimistic; if the API errors, it reverts. The card has a strikethrough when done."
Out of scope
What you don't want — pre-empts the agent's tendency to overshoot. "Don't add animations. Don't refactor anything else."
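Put together, the spec for the task above fits in a few lines (the file and hook names are the ones used in this section's examples, not requirements):

```
Goal: Users can mark a task as complete from the board view.
Constraints: Don't touch the API. Use the existing updateTask hook.
  Match the pattern in TaskCard.tsx.
Acceptance: Clicking the checkbox toggles the status. The toggle is
  optimistic; if the API errors, it reverts. Completed cards get a
  strikethrough.
Out of scope: No animations. No refactoring of anything else.
```

Four lines of structure is usually enough; the point is that each heading forces a decision you'd otherwise leave to the agent.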
Reading the diff — the actual job
The agent produced a change. Now your job — and it is a job, not a glance — is to read it. Reading a diff is the single most important skill in agent-driven development. If you don't read what the agent wrote, you don't understand your own software.
What to look for, in order:
- Did it touch only the files you expected? If your prompt was about the auth flow and it edited `tailwind.config.ts`, ask why before you accept.
- Did it use existing code where possible? Or did it duplicate a helper that already exists in `/lib`? Agents have a tendency to write fresh code instead of reaching for what's there.
- Are the names consistent with the project? If everything in your project is camelCase and the new file uses snake_case, that's a smell.
- Are there assumptions you don't agree with? The agent had to make tiny choices — naming, error-handling style, where state lives. Disagree out loud.
- Are there things that are subtly wrong but compile? A function that returns the wrong thing on edge cases. A check that's slightly off. A missing await.
"In `app/api/tasks/route.ts` line 23, you wrote `where: { id: taskId }` — but multiple workspaces could have tasks with the same ID. Should this also filter by `workspaceId`? Confirm before I merge."
The debug loop with an agent
The code doesn't work. What now?
Reproduce it
Make the bug happen consistently. "It sometimes breaks" is unfixable. "Click create-task with no title; the page goes blank" is fixable.
Capture the evidence
The exact error, the network response, the console output, the screenshot. Paste these into the agent — don't paraphrase.
State the expected vs. actual behavior
"Expected: form shows a 'title required' error inline. Actual: page goes blank. Console shows TypeError: undefined is not a function."
Tell the agent what you've already tried
Stops it from suggesting the same fix you just ruled out.
Ask for the smallest fix that addresses the root cause
Not "refactor this whole component." A surgical change.
Verify before celebrating
Re-run the steps that produced the bug. Confirm it's actually fixed, not just changed.
"Clicking create-task with an empty title throws `TypeError: Cannot read properties of undefined (reading 'trim')` at `TaskForm.tsx:47`. Expected: an inline validation error. I've already tried wrapping line 47 in a null check; that just hides the symptom. Find the root cause and fix it cleanly."
Managing long sessions
Agents have a context budget. Past some length, the conversation starts to lose its earliest turns. The PRD you pasted in turn 3 is forgotten by turn 30. The convention you established at the start has drifted.
Tactics for keeping long sessions coherent:
- Persist the spec to disk. Put your PRD, architecture decisions, and conventions in `/docs` in the repo. CLI agents will read them on demand. You won't have to re-paste.
- Use a "memory file." A `CLAUDE.md`, `.cursorrules`, or `AGENTS.md` at the project root that the agent reads at session start. Put your conventions there: naming, framework versions, "always use X library, never Y."
- Compact and restart. When a session gets long, summarize it ("here's what we built, here's what's left") and start fresh. The agent gets more lucid.
- Commit frequently. If a session goes wrong, you can roll back to a known-good point — only possible if you've been committing small chunks.
- Branch per feature. Same reason — keeps each session's changes isolated and revertable.
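A minimal memory file, using conventions that have already come up in this level (the file name and every rule below are illustrative, not prescriptive):

```markdown
# CLAUDE.md — read this before making changes

## Stack
- Next.js 14.2 (App Router), TypeScript, Prisma 5.8
- State: Zustand stores — never add another state library

## Conventions
- camelCase for variables; components are PascalCase .tsx
- Reuse helpers in /lib before writing new ones
- Don't add new dependencies without asking

## Process
- Branch per feature; commit in small chunks
- Specs and PRDs live in /docs — read the relevant one before starting
```

Keep it short: the agent re-reads this every session, so every line should earn its place.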
When to trust the agent — and when to step in
The agent is a brilliant junior engineer with no memory and infinite confidence. Trust accordingly.
Trust agents with:
- Boilerplate (forms, CRUD endpoints, repetitive UI).
- Unfamiliar APIs (agents read docs faster than you can).
- Translations of clear specs into clean code.
- Refactors with strong tests covering the affected code.
- Glue code, scaffolding, naming conventions, formatting.
Step in for:
- Architectural decisions. Where does state live? What's the API shape? These compound; a bad call here is expensive later.
- Anything security-sensitive. Auth flows, permission checks, secret handling. Agents will write code that compiles and is dangerous.
- Performance-sensitive paths. The agent doesn't know which endpoint is hit a million times per day.
- Domain logic. Pricing rules, billing, weird edge cases specific to your business.
- Anything you don't understand the output of. If you can't read the diff, don't merge it.
Handling hallucinated APIs
Agents sometimes invent functions, libraries, or API endpoints that don't exist. Confidently. The output looks right; the code won't run; the documentation it cites is fictional.
How to catch and prevent it:
- Verify imports exist. If the agent imports `@radix-ui/react-magic`, check that package is real before `npm install` fails on it.
- Cross-check API method names against docs. Especially for libraries with multiple major versions — agents conflate v1 and v3 syntax routinely.
- Pin versions in the prompt. "We're on Prisma 5.8 and Next.js 14.2" cuts down on mismatched syntax.
- Run the code. The fastest hallucination filter is the type checker and the dev server. If it doesn't run, something is wrong.
- When the agent insists, ask for a citation. "Show me the docs page where this method is documented." If it can't, the method is suspect.
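A session opener that bakes in the version pins and the anti-hallucination rule (the versions are the ones from the example above; substitute your own):

```
Context for this session: we're on Next.js 14.2 (App Router) and
Prisma 5.8. If you're not sure a method exists in these versions,
say so and point me at the docs instead of guessing.
```

The second sentence matters as much as the pins — it gives the agent explicit permission to admit uncertainty.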
PRD → tickets → prompts
The full pipeline for building a feature with an agent, from idea to merged code: PRD → tickets → prompts → diff → PR.
The artifacts at each stage:
- PRD (one page)
- Problem · Approach · Acceptance · Non-goals. Lives in `/docs` in the repo. Anyone — including the agent — can read it.
- Tickets (small)
- Each one a deliverable: title, description, acceptance criteria, files likely to touch. The agent helps you generate these from the PRD.
- Prompts (per ticket)
- The actual ask. Goal + constraints + acceptance + out-of-scope. Often a sentence or two with the ticket as the source of truth.
- Diff (the output)
- The agent's change. You read it.
- PR (the package)
- The diff with description, screenshots, and self-review. You (or a teammate) approves and merges.
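An example ticket in that shape, reusing the task-completion feature from the spec section (the file list is a guess the agent can correct):

```
Title: Toggle task completion from the board view
Description: Users can mark a task done/undone by clicking its
  checkbox on the board. Optimistic update; revert on API error.
Acceptance:
  - Checkbox toggles the task's status
  - Completed cards get a strikethrough
  - On API error, the toggle reverts
Files likely to touch: app/board/page.tsx, TaskCard.tsx,
  the updateTask hook in /lib
```

A ticket at this grain converts to a prompt almost verbatim — which is the point of writing it.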
See the Prompt Library for paired bad-vs-good examples at each stage of this pipeline.
Wrap-up
Jargon recap
- Agent
- An LLM with tools — can read files, run shells, edit code.
- Context
- Everything the agent sees when it answers. Project + conversation + what you paste.
- Spec
- Mini-document for a task: goal, constraints, acceptance, out-of-scope.
- Diff
- The change the agent produced. Always read.
- Hallucination
- Agent invents API/method/library that doesn't exist.
- Memory file
- AGENTS.md / CLAUDE.md / .cursorrules — agent reads at session start.
- PRD → tickets → prompts
- Pipeline from idea to shipped code.
- LGTM
- "Looks good to me." Approving a PR — say it only after you've actually read it.
You should now be able to
- Pick the right surface — CLI agent, IDE agent, or generator — for a given job.
- Scope a task small enough to describe in three sentences and revert cleanly.
- Write a mini-spec: goal, constraints, acceptance, out-of-scope.
- Read a diff for unexpected files, duplication, naming drift, and subtle wrongness.
- Run the debug loop: reproduce, capture evidence, state expected vs. actual, verify the fix.
- Keep long sessions coherent with memory files, frequent commits, and fresh restarts.
- Catch hallucinated imports and APIs before they reach `npm install`.
Mini-exercise
Take a feature you want and write the full pipeline for it on paper: a 200-word PRD, three tickets that decompose it, and a prompt for the first ticket. Don't run any of it through an agent — just notice how much sharper the eventual prompts will be when the upstream artifacts exist.