Last Sunday a coach opened her laptop and went looking for the AI tool she bought in the spring. Took her a minute to find it. New tab, the old login, the “AI team” she’d been so sure about in March — the one she’d told two friends about over dinner.

She fed it a real task. A client email she actually had to send.

She read what came back. Then she did the thing I keep watching capable people do: she closed the tab, opened a blank box, and typed the email herself. Same as she’d have done in March. Same as she’d done in February.

Six months. Same prompts. Same outputs. And somewhere in there, without deciding to, she’d quietly gone back to doing it by hand.

That’s the test. I call it the Month-Six Test, and it’s the only AI diagnostic I trust, because it can’t be gamed by a feature list.

Here it is. Open the AI thing you bought six months ago — not the shiny one you started last week, the one you’ve actually lived with. Use it on real work. Then answer one question, honestly, with nobody watching: is it sharper than the day you bought it, or exactly the same?

If it’s exactly the same, you didn’t buy a system. You bought a pile.

Direction is the whole game

Spent 35 years building systems for banks — the risk engines, the regulatory reporting, the treasury infrastructure where being wrong on a Tuesday costs millions by Friday. You learn one thing fast in that work: a system that cannot measure itself cannot improve, and a system that cannot improve only moves one direction over time. Down. Quietly. While everyone assumes it’s fine because it ran clean the day it shipped.

The Month-Six Test isn’t measuring features. It’s measuring direction. The test reads the direction. That’s all it does.

Most people fail it and feel a small, specific discomfort. Hold onto that. It’s the most useful thing you’ll feel all week.

“Exactly the same” is the diagnosis, not the verdict on you

Here’s the part I need you to actually hear, because it’s where most people quietly blame themselves and get it backwards.

“Exactly the same after six months” is not a sign you used the tool badly. It’s a sign the tool was built with no way to get better. A static pile cannot pass this test — not because you were lazy with it, but by construction. It has no measurement. It has no memory of your corrections. It has nothing to improve from. You ran a fair test on it and it failed for reasons that have nothing to do with your discipline.

That wasn’t your fault. It was the way the thing was built.

I watched a generation of capable people get handed “AI tools” with none of the rigor my career demanded — and then blame themselves when the output came out flat and stayed flat. The tool wasn’t built like a tool. It was built like a demo. A demo is dazzling on day one and frozen forever after. That’s a different design spec entirely.

The two questions hiding underneath

Direction is hard to feel directly, so it hides in two smaller habits you can check right now.

The first is correction. The last time you fixed something your AI got wrong, did the fix stick — or are you going to make the exact same correction again next week, and the week after that? A system remembers. A pile forgets by morning, and quietly hands you the same mistake to catch forever.

The second is trust. Would you put its output in front of your best client without reading every line first? If the honest answer is no — if you still proofread it like a nervous intern’s first draft — then six months of use bought you a thing you supervise, not a thing you rely on.

Sit with those for a second before you read on. Sharper or same. Sticks or forgets. Trusted or supervised. If they landed uncomfortably, that’s not a discipline problem and it’s not a you problem. It’s the gap between a pile and a system.

Stop — this counts.

What it feels like to pass

A system that passes does three things a pile cannot. It measures its own output against a standard you set. It keeps every correction you make instead of forgetting it by morning. And it shows you what changed and why. That’s the whole shape of a workflow that improves itself — your corrections accumulate into capability instead of evaporating overnight, so month six leaves you with something genuinely sharper than month one.

The coach with the blank box didn’t have a discipline problem. She had a pile. Once she saw that — once she stopped grading herself and started grading the tool — the next decision got a lot simpler. She wasn’t fighting her own willpower. She was watching a static thing drift while she stood still beside it.

So go run the test. Open the oldest one. Use it on something real. Score it: sharper, same, or worse. Write the score down.

That number is your baseline. The day a tool beats it is the day you own a system instead of feeding a pile.

Frequently asked questions

What is the Month-Six Test? Open the AI tool you bought six months ago, use it on real work, and answer one question honestly: is it sharper than the day you bought it, or exactly the same? If it is exactly the same, you bought a pile, not a system.

My AI is exactly the same after six months — did I use it wrong? No. Exactly-the-same is a sign the tool was built with no way to get better, not a sign you were undisciplined. A static pile cannot pass the test by construction: it has no measurement and no memory of your corrections to improve from.

What does passing the Month-Six Test look like? A system that passes measures its own output against a standard you set, keeps every correction instead of forgetting it by morning, and shows you what changed and why — so month six leaves you with something genuinely sharper than month one.

Excelsior,

Pierre Founder, CurioChat

P.S. If you ran the test and it came back “exactly the same,” don’t throw the tool out in frustration — frustration isn’t the lesson. The lesson is the design spec. You now know precisely what to demand of the next thing you buy: not more features, but a direction. Something that’s measured, that you own, and that’s sharper in month six than it was on day one — because you used it. Then run the test one more time, pointed the other way — at yourself. Are you a little more able to do the work without the tool than you were six months ago, or a little less? The right system makes both answers point the same way: sharper tool, sharper owner. The wrong one quietly trades the second for the first — the tool gets smarter so you don’t have to, and you end up outsourced to yourself without noticing.