Mingqi Hou

How I Use Cursor in Production (Without Losing Ownership of the Code)

Model choice, context discipline, rules, MCP, and why AI helps more on legacy code than greenfield demos—lessons from seven years of shipping web apps.

I have used Cursor as a daily driver for over a year while shipping production React and Node.js work. This is not a tool review—it is what actually moved the needle on speed, quality, and debuggability when clients care about zero regressions.

Pick the model on purpose

The model is the floor under everything else. For serious code generation and multi-file edits I default to Claude (Sonnet-class and up). Benchmarks matter less than consistency: structure, refactors, and following constraints across a long thread.

Two practical traps:

  1. Auto mode when quota is low — Cursor may route you to a weaker model. Fine for chat; bad for edits. I upgraded my plan partly to avoid silent downgrades during delivery weeks.
  2. Context pressure — Watch the context meter. Near the limit, switch to a longer-context model or Max Mode so you do not lose file state mid-refactor.

Show the code, not the vibe

Cursor works best when you treat it like a senior pair who cannot see your screen: it knows only the files, errors, and output you attach.

Debug with runtime evidence

Static analysis misses race conditions, env drift, and “works on my machine.” My loop:

  1. Ask Cursor to add narrow console.log (or structured logs) on the failing path.
  2. Reproduce and copy full output.
  3. Paste logs + relevant code back into chat.

That often fixes issues that three blind patch attempts did not. You are giving the model observations, not just syntax.
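As a rough illustration of what "narrow" logs can look like, here is a minimal sketch of a tagged debug logger. The helper name `makeDebugLogger`, the step names, and the field values are all hypothetical; the point is only that each line carries a fixed tag, a sequence number, and fully expanded JSON, so the pasted output is easy to grep and unambiguous to read.

```typescript
// Hypothetical helper (not part of Cursor or any library): a tagged,
// structured debug logger for a single failing code path.
type DebugFields = Record<string, unknown>;

function makeDebugLogger(
  tag: string,
  sink: (line: string) => void = console.log,
) {
  let seq = 0; // Monotonic counter so interleaved async logs can be ordered.
  return (step: string, fields: DebugFields = {}): string => {
    seq += 1;
    // JSON.stringify keeps nested objects expanded instead of "[object Object]".
    // Assumes the fields are JSON-serializable (no cycles, no BigInt).
    const line = `[${tag}] #${seq} ${step} ${JSON.stringify(fields)}`;
    sink(line);
    return line;
  };
}

// Usage on a hypothetical checkout path:
const debug = makeDebugLogger("checkout-debug");
debug("cart.loaded", { items: 3, currency: "USD" });
debug("payment.submitted", { orderId: "A-1001" });
```

Filtering the console by the tag gives a clean block to copy back into chat, and the sequence numbers make ordering problems visible.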

Rules beat repeating yourself

I got tired of saying “prefix debug logs with [checkout-debug]” every session. Cursor Rules encode team norms once, so I no longer restate them in each chat.

Per-project rules scale when you freelance across stacks.

MCP when copy-paste breaks flow

Manual log copying is friction. Browser MCP (or Playwright MCP for non-Chrome browsers) lets the agent read console output and DOM state directly, with less garbling than screenshots of collapsed objects.

I also wire in official docs where models hallucinate APIs. For example, I add Ant Design’s llms.txt to Cursor’s docs so component answers match current props.

Stack choice still matters

Models are trained on what is public. Output for React + TypeScript + Tailwind tends to be high quality. Niche cross-platform frameworks (Taro, uni-app) are “AI-poor”: the agent fixes H5 and breaks the mini-program, then fixes React Native and breaks H5 again. For client work I bias toward stacks with strong OSS signal unless the business mandates otherwise.

The counter-intuitive part: 0→1 is easy, 1→100 is hard

Demos love greenfield apps. Paid work is mostly legacy: unclear names, hidden business rules, fear of touching the 500-line useEffect. AI shines when you onboard it like a new hire: explain the names, the business rules, and the risky files before asking it to change anything.

What I still own as the engineer

AI widens the gap between strong and weak contributors on the same team.

Ownership — I read the agent diff summary and every line before push. If I cannot explain a change in a standup, it does not ship.

Taste — “Runs and passes QA” is not enough. A list rendered as four separate rows (image row, title row, price row, button row) “works” but fails component design. I still read good open source and refactor toward reuse.
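To make the contrast concrete, here is a minimal, framework-free sketch of the per-item alternative: one reusable card that keeps image, title, price, and button together, mapped over the list. It uses plain string rendering as a stand-in for a React component; the `Product` shape, class names, and function names are assumptions made for illustration.

```typescript
// Hypothetical product shape; the field names are assumptions.
interface Product {
  imageUrl: string;
  title: string;
  priceCents: number;
}

// Escape text before interpolating it into markup.
const escapeHtml = (s: string): string =>
  s
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");

// One reusable unit per product: everything that belongs to an item
// lives in the same component instead of four parallel rows.
function renderProductCard(p: Product): string {
  const price = (p.priceCents / 100).toFixed(2);
  const title = escapeHtml(p.title);
  return [
    `<li class="product-card">`,
    `<img src="${escapeHtml(p.imageUrl)}" alt="${title}">`,
    `<h3>${title}</h3>`,
    `<span class="price">$${price}</span>`,
    `<button>Add to cart</button>`,
    `</li>`,
  ].join("");
}

// The list is just the card mapped over the data.
function renderProductList(products: Product[]): string {
  return `<ul>${products.map(renderProductCard).join("")}</ul>`;
}
```

With this shape, adding a badge or changing the price format touches one function, and the card can be reused on other pages.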

Breadth — “Parse Excel on the frontend” goes faster when I say “use xlsx, return JSON of sheet 1”—twenty lines vs two hundred of fragile hand-rolled parsing.
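For a sense of scale, here is a minimal sketch of the "sheet 1 to JSON" request. It relies on two calls from the SheetJS `xlsx` package, `read` and `utils.sheet_to_json`. To keep the snippet self-contained, the library is passed in as a parameter and described by a small hand-written interface; `XlsxLike`, `WorkbookLike`, and `firstSheetToJson` are illustrative names, not part of the library.

```typescript
// Minimal structural types for the subset of the SheetJS API used here.
// These interfaces are assumptions for the sketch, not the library's own types.
interface WorkbookLike {
  SheetNames: string[];
  Sheets: Record<string, unknown>;
}

interface XlsxLike {
  read(data: ArrayBuffer | Uint8Array, opts: { type: "array" }): WorkbookLike;
  utils: {
    sheet_to_json(sheet: unknown): Record<string, unknown>[];
  };
}

// Parse a workbook from raw bytes and return the rows of its first sheet.
function firstSheetToJson(
  xlsx: XlsxLike,
  data: ArrayBuffer | Uint8Array,
): Record<string, unknown>[] {
  const workbook = xlsx.read(data, { type: "array" });
  const firstName = workbook.SheetNames[0];
  if (firstName === undefined) {
    return []; // Workbook without sheets: nothing to convert.
  }
  return xlsx.utils.sheet_to_json(workbook.Sheets[firstName]);
}
```

In a browser, the call would presumably look like `firstSheetToJson(XLSX, await file.arrayBuffer())` after `import * as XLSX from "xlsx"`. By default, `sheet_to_json` uses the first row as the object keys.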

Precision — Vague prompts get vague diffs. Specific inputs, outputs, and files are the difference between a useful agent run and a lottery.

Where this is heading

Slapping “+AI” on every step of an old SDLC is like early smartphones that kept physical call buttons. The interesting shift is re-wiring the loop—agents that retrieve, verify, and ship with observability—not faster copy-paste of the same process.

If you hire me on Upwork, you get someone who already lives in that loop on real codebases, not only in tutorial repos.