
Lesson 3 — Test, Refine, Ship

How to take your Custom GPT from "kinda works" to "actually useful" — the testing workflow, the common failure patterns, and how to keep it good as you use it.

Course: Custom GPT · Lesson 3 of 4

Your Custom GPT is built. Now the real work: making it produce great output consistently.

Most people skip this step and end up with a GPT they used twice and forgot. The difference between a GPT that lives in your bookmarks and one you use every day is iteration.

Step 1 — The 5-test bench

Before relying on your GPT, run 5 test inputs. Pick deliberately:

  1. The easiest case — a clear, well-formed input where you’d expect great output
  2. The hardest case — a vague, messy input where most assistants would struggle
  3. An edge case — something at the boundary of what your GPT is supposed to do
  4. A “wrong scope” case — something outside what your GPT is supposed to do (does it gracefully redirect?)
  5. A real input from your life — something you’d actually use it for tomorrow

Save the outputs. You’re going to compare them to a “great output” target.

Step 2 — Score each output

For each output, assign a score on a 1–5 scale against your "great output" target: how close are the voice, the format, and the substance to what you would have written yourself?

If most scores are 4–5, ship it. If most are 2–3, you have work to do.
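Custom GPTs can't be driven from the API, so the bench runs by hand, but a few lines of Python keep it honest. A minimal sketch (the case prompts, field names, and the 3.5 ship threshold are my own illustrative choices, not from the lesson): it holds the five deliberate picks, the outputs you paste back in, and your 1–5 scores, then averages them into a ship/iterate verdict.

```python
from dataclasses import dataclass
from statistics import mean

@dataclass
class BenchCase:
    label: str        # which of the five deliberate picks this is
    prompt: str       # the test input you ran through the GPT
    output: str = ""  # paste the GPT's response here
    score: int = 0    # your 1-5 rating against the "great output" target

def verdict(cases, ship_threshold=3.5):
    """Average the scores: mostly 4-5 means ship, mostly 2-3 means iterate."""
    scores = [c.score for c in cases if c.score]
    if not scores:
        return "no scores yet"
    return "ship" if mean(scores) >= ship_threshold else "iterate"

bench = [
    BenchCase("easiest case", "Draft a thank-you note to a colleague."),
    BenchCase("hardest case", "ugh, need to reply to that thread somehow"),
    BenchCase("edge case", "Summarize this 40-page PDF into my usual format."),
    BenchCase("wrong scope", "What's the capital of Mongolia?"),
    BenchCase("real input", "Reply to my landlord about the lease renewal."),
]

# After running each prompt in your GPT, record the output and your score:
bench[0].score = 5
bench[1].score = 3
bench[2].score = 4
bench[3].score = 4
bench[4].score = 5
print(verdict(bench))  # mean 4.2 -> "ship"
```

The point of writing it down is the forcing function: a case with no score is a case you haven't actually tested.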

Step 3 — The 3 most common fixes

90% of Custom GPT problems fall into 3 buckets. Fix them in this order.

Problem A — Output is too generic / sounds like ChatGPT

Cause: Not enough voice samples; samples are too similar to ChatGPT’s default voice.

Fix: Add 3–5 more samples of your writing — and pick samples with personality. Avoid “professional emails” as samples; use casual writing instead. Then add a stricter style instruction: “Match the rhythm and word choice of the samples below precisely. If a phrase sounds polished but generic, replace it with something more specific to my voice.”

Problem B — It breaks your rules

Cause: Rules are buried, vague, or contradicted by the model’s defaults.

Fix: Move rules higher in the instructions (top 30%). Make them explicit and enforceable — “never use the word X” beats “be informal.” Repeat critical rules at the end of the instructions too — the model weights both ends.
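"Never use the word X" has another advantage over "be informal": it's checkable by machine. A sketch of a regression check you could run over saved outputs (the banned list here is illustrative, not from the lesson):

```python
# Phrases your instructions forbid -- fill in with your own "never use" list.
BANNED = ["hope this finds you well", "delve", "in today's fast-paced world"]

def rule_violations(output: str) -> list[str]:
    """Return which banned phrases appear in a GPT output (case-insensitive)."""
    low = output.lower()
    return [phrase for phrase in BANNED if phrase in low]

sample = "Hope this finds you well! Let me delve into the details."
print(rule_violations(sample))  # ['hope this finds you well', 'delve']
```

Run it over your bench outputs after each instruction change: a rule that keeps showing up in the violations list is the one to move higher and repeat at the end.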

Problem C — Output shape varies wildly

Cause: Format section is too vague; no example output.

Fix: Spell out the format with placeholder text. Include a worked example output that follows the format exactly. Add: “If you can’t fit my response in this format, ask me a clarifying question instead of breaking the format.”
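Once the format is spelled out with placeholder text, it can double as a check: a few lines of code can flag outputs that drift. A sketch, assuming a three-section format that is purely hypothetical:

```python
# Hypothetical required sections, in the order the format demands them.
REQUIRED_SECTIONS = ["Subject:", "Body:", "Next step:"]

def follows_format(output: str) -> bool:
    """True if every required section header appears, in order."""
    pos = 0
    for header in REQUIRED_SECTIONS:
        pos = output.find(header, pos)
        if pos == -1:
            return False
        pos += len(header)
    return True

good = "Subject: Lease renewal\nBody: Short and direct.\nNext step: Call Tuesday."
bad = "Here's a friendly email for you!"
print(follows_format(good), follows_format(bad))  # True False
```

If the check fails on outputs that look fine to you, the format in your instructions is stricter than you actually want, which is itself useful to learn.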

Step 4 — Iterate using meta-prompts

The fastest way to improve a Custom GPT: ask the GPT itself how it could be better.

Open the GPT and run:

Imagine you got a request that was [the kind of thing that broke you]. Looking back at your instructions, what would you change to handle that better? Be specific — quote the lines you’d change.

You’ll get pointed feedback. Apply the changes that make sense.

You can also ask:

Read your current instructions. Where are they vague? Where would two reasonable readers disagree about what you should do? Identify the 3 weakest sentences.

This catches ambiguity you wouldn’t spot yourself.

Step 5 — The “future you” test

The real test: come back to your GPT a week later and use it on a real task. Did it just work, or did you have to keep correcting it?

If you keep correcting it the same way, that correction is missing from the instructions. Add it.

A great Custom GPT has a settled feel — you stop thinking about how to prompt it; you just type the input and the output is what you needed.

Step 6 — Ship and observe

Use it for a week. Don’t iterate every session — you’ll never settle. Just use it.

Keep a tiny note: every time the output disappoints you, jot one line about what went wrong. After a week, read your notes. Apply the most common 2–3 fixes. Don’t fix more than 3 things at once — too many simultaneous changes make it impossible to know what helped.
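A week of one-line notes is easy to tally if you tag each note with a failure bucket. A sketch (the tags and notes are examples, not the lesson's):

```python
from collections import Counter

# One line per disappointing output, tagged with a failure bucket.
notes = [
    ("too-generic", "sounded like default ChatGPT in the intro"),
    ("format-drift", "skipped the subject line"),
    ("too-generic", "opened with 'I hope this helps'"),
    ("rule-break", "used an exclamation point again"),
    ("too-generic", "closing was boilerplate"),
]

# Apply only the 2-3 most common fixes, so you can tell what helped.
top_fixes = Counter(tag for tag, _ in notes).most_common(3)
print(top_fixes)  # [('too-generic', 3), ('format-drift', 1), ('rule-break', 1)]
```

The tally is what keeps you from fixing five things at once: it tells you which two or three buckets actually dominate.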

Step 7 — Versioning

When you make significant changes, name them. Save a copy of the previous instructions before overwriting. ChatGPT doesn’t have built-in versioning for Custom GPT instructions, so a Google Doc or note works:

v1.0 (initial) - 2026-04-01
v1.1 (added voice samples 4-7) - 2026-04-08
v1.2 (tightened "never use" list, added 'never start with "Hope this finds you well"') - 2026-04-15

If a new version performs worse than the old one, you can roll back.
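A Google Doc works fine; if you'd rather script it, even a tiny append-only log gives you rollback. A sketch using plain JSON (the filename and fields are my own, not a ChatGPT feature):

```python
import json
from pathlib import Path

LOG = Path("gpt_versions.json")  # hypothetical log file

def save_version(tag: str, note: str, instructions: str) -> None:
    """Append a full snapshot of the instructions before overwriting them."""
    history = json.loads(LOG.read_text()) if LOG.exists() else []
    history.append({"tag": tag, "note": note, "instructions": instructions})
    LOG.write_text(json.dumps(history, indent=2))

def rollback(tag: str) -> str:
    """Return the instructions saved under a tag, for pasting back into ChatGPT."""
    history = json.loads(LOG.read_text())
    return next(v["instructions"] for v in history if v["tag"] == tag)

save_version("v1.0", "initial", "You are my email assistant...")
save_version("v1.1", "added voice samples 4-7", "You are my email assistant... [samples]")
print(rollback("v1.0"))  # the v1.0 instructions, verbatim
```

The snapshot must be the full instruction text, not just the change note; a diary of "what I changed" can't be pasted back in when v1.2 turns out worse than v1.1.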

How often to update?

You don’t need to update a working GPT just because you can.

Sharing or keeping private?

Custom GPTs can be kept private (only you), shared with anyone who has the link, or published publicly to the GPT Store.

Most personal-productivity GPTs should be private — they’re built around your voice, your context, your weird preferences. Sharing makes them generic.

If you do build a general-purpose GPT (like a “Cold Email Coach” anyone could use), the GPT Store is worth experimenting with — there’s a small revenue share program.

Common mistakes to avoid

Adding features instead of refining the core. A good Custom GPT does one job extremely well. Don’t add “and also do X.” Build a second GPT for X.

Optimizing for impressive output. Optimize for useful output. Impressive but generic loses to less-impressive but tailored.

Skipping the testing. Most people build a GPT and use it once. The compounding value is in using it 50+ times — and that requires it actually being good.

Comparing to perfection. Your GPT will never be as good as you imagine. It just needs to be better than you doing the task from scratch every time. That bar is low.

What you should have now

  - A 5-input test bench and scored outputs you can rerun after changes
  - The 2–3 most common fixes from your week of notes, applied
  - A version log of your instructions you can roll back to
  - A GPT that passes the "future you" test on a real task

Where to next

You’ve finished Build Your First Custom GPT. Good next moves: build a second GPT for the next task you repeat weekly (rather than bolting new jobs onto this one), and keep refining this one with your weekly notes.
