From Cursor to Claude Code

 

May 18, 2026

Last week I switched from Cursor to Claude Code and spent most of my time watching the extended edition of Lord of the Rings with my mom, knocking out Anthropic courses at 2x speed, and looking for a dress for my cousin's wedding.

This is the post about that week.

New Tools, New Problems

Switching to Claude Code meant the workspace itself needed shaking out before any feature work would be responsible. So I held off. I let Claude Code focus on doc updates and code generation specifically because I wanted to feel the friction of the new setup against work I already understood, rather than discover the friction mid-feature and have to untangle which problem was which.

It was the right call. It was also boring. Three nights of extended edition, then the theatrical cuts. Skilljar in another tab. Pinterest in another.

A few good things happened that were not architectural. I finally set up staging environments, which I’d neglected due to being solo and pre-launch. I set up GitHub pull requests with the test suite as a merge gate on both repos. Responsible-adult work. The kind of thing a new tool makes you do because you are already touching everything anyway.

My Old Workflow

My Cursor workflow was rigid and I liked it that way.

Each session started with an onboarding flow. General context first — the conventions of the codebase, the inversions, the build philosophies — so a fresh agent could develop something like common sense. Then detailed reading of the area I wanted to work on, plus any relevant code elsewhere worth imitating. An "Agent Start Here" doc pointed each new session at where to begin and where to stop in the first step. The second step's reading list came from me in conversation.

Then I would describe the work by reference to something already built. Make a bleed-out condition using the same pattern as zombie infection, but with physical status as the trigger instead of narrative status.

Then the agent would give me a plan, and I would have it prove to me why the plan would not work, using my own codebase and docs as evidence. New plan. Repeat until it could not find any more issues. Then I would review it myself.

Once I was satisfied: run the test suite to establish a baseline. Execute the plan. Run the suite again, treating unpredicted failures as bugs. Test the builds. Then — and only then — add test coverage for the new feature. Test the builds again. Run the suites again. Update all the docs including test counts. Then push and deploy.

Zero memory across sessions. Fresh context from the codebase and docs each time. It put the maintenance burden on documentation, which I liked, because documentation drift from code is visible.

What Claude Code Offered

Claude Code's pitch is partly that you do not have to work that way. Skills, hooks, persistent memory. The agent learns the codebase across sessions instead of being onboarded into it.

I assumed the new environment would require a new workflow. So I tried building one. Different memory configurations. Hooks for the failure modes I kept hitting. Skills for repeated patterns. The first few days went into scaffolding to keep the agent on rails.

Things got weird quickly.

Agents started telling me I was working too slowly. Mid-refactor — moving some client logic to the server — an agent told me we could not polish the engine for another month. It had been two hours. The agent was estimating elapsed time of our work session from cognitive load, not commit history. This happened more than once.

Subagents got dispatched with partial context, came back with confidently wrong recommendations, and the primary agent recorded those recommendations as facts. Next session inherited the false memory. The codebase inverts a lot of training-data conventions — items live in locations, users are locations that happen to hold items, doors are composed of two door-sides, adjacency is derived from where those sides spatially exist — so a subagent reading a slice without orientation will propose normalizing the door table or extracting an inventory system. That recommendation, written into memory as established fact, becomes the next session's prior. The primary agent does not know it inherited a hallucination. Neither do I, unless I catch it.

I also noticed I was reaching for skills in places where what I actually wanted was code generation. A skill encodes a pattern the agent follows. Codegen encodes the same pattern as deterministic output. The first depends on the agent; the second does not.

That became ADR-013. The codebase has zero runtime dependency on AI tooling. Skills and hooks are transitional aid for the agent, not infrastructure the codebase relies on. The preference order is: codebase self-enforces (non-null records, type system, default values) → dependency-zero tooling (bash scripts, CI gates, codegen) → AI-side tooling (skills, hooks) only when the first two are genuinely impractical. The test is whether the codebase would still function and still be improvable by a human if Claude disappeared overnight.

Memory was the harder one. Tonight I was working through a round of memory and doc edits with an agent and realized I was updating the same information in two places. The memory and the docs were both serving as a corrective gate in front of the agent's defaults, and they had already drifted within hours. Two gates that can silently disagree are strictly worse than one, because the agent trusts whichever it reads and nothing tells it which is stale. Code-to-doc drift is detectable. Gate-to-gate drift is not.

That became ADR-016. Agent memory is reduced to a single personal-calibration file. All durable rules, rationale, and corrective gates live in the docs (and the test suite). Understanding derived from the current repo, every session, no stored projection.

In other words

Both ADRs are extensions of the engine's build philosophy one layer up. The engine prefers shared primitives, mirrored patterns, no special cases, derived state over stored state. Skills are special cases at the workflow layer. Hooks are guardrails for specific failure modes. Memory is stored state where derived state would be more honest. Re-reading the codebase fresh each session is the workflow equivalent of computing state from the event log instead of trusting a cached projection.

Said another way: my engine's whole thesis is that LLMs generate, they do not resolve, and coordinated multiplayer state is the actual bottleneck. The same shape holds at the workflow layer. AI is the brain; automation is the body. The brain is hot-swappable. The body is the thing that has to keep working when the brain changes.

The result is that my Claude Code workflow looks a lot like my Cursor workflow. Zero memory. Fresh context from the codebase and docs each session. The discipline is the same. The tool wrapped around it is different.

What now?

I have not run a real feature session under this new setup yet. The ADRs are written. The doc reframing is done. The migration targets are tagged. I’ll know within a week whether the discipline holds in practice or whether I have just produced a very thorough rationalization for something that is going to frustrate me in different ways than the experiments did.

If it holds, this is how I work now. If it does not, I will be back in Cursor next week with Claude inside. Either way the ADRs stay. The architecture is hot-swappable around the tool, which is the whole point.

It has been a boring week. Boring is sometimes what infrastructure work looks like. Lord of the Rings helped.

Previous
Previous

Method Or Fluke

Next
Next

How I Got Started