Lance Eaton wrote a careful piece on agentic AI in education. He names a real tension. Agentic tools cut friction. Learning needs friction. He’s right to worry.
He’s also looking at the wrong layer.
The interesting thing isn’t Claude Code itself (although it’s a fun toy). It’s what you put around Claude Code.
I run a harness. Hooks fire on every tool call. Skills load when the work matches them. Memory persists across sessions. A separate evaluator agent grades the work after every deploy. The agent doesn’t get to mark its own homework.
This sounds like sysadmin trivia, but it really isn’t. It’s the whole pedagogy.
Eaton catalogs tasks. Twelve hundred PDFs sorted. Small apps built. Video chunked into timestamps. Useful. I do those things too. But task completion is table stakes.
Here’s what sits on top.
A hook blocks git push --force and suggests --force-with-lease. A hook stops a bare pip install outside a venv. A hook auto-runs the matching test file the second I save a source file. A hook nags me for acceptance criteria before I let an agent build new automation.
None of this came from a vendor. I built it. Each rule is a mistake I made once and refused to make twice.
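For the curious, a rail like the first one is maybe twenty lines. Here’s a sketch in Python, assuming the hook contract as Claude Code documents it: a PreToolUse hook receives the tool call as JSON on stdin, and exiting with code 2 blocks the call and routes stderr back to the agent. The regex and the message are illustrative, not my exact rule.

```python
#!/usr/bin/env python3
"""Sketch of the force-push guard as a PreToolUse hook.
Assumes the documented contract: the tool call arrives as JSON
on stdin; exit code 2 blocks it and shows stderr to the agent."""
import json
import re
import sys

event = json.load(sys.stdin)
command = event.get("tool_input", {}).get("command", "")

# --force-with-lease refuses to clobber commits you haven't fetched,
# so it's the safer default; block the bare flag and say why.
if (re.search(r"git\s+push\b", command)
        and "--force" in command
        and "--force-with-lease" not in command):
    print("Blocked: use git push --force-with-lease, not --force.",
          file=sys.stderr)
    sys.exit(2)

sys.exit(0)  # everything else passes through untouched
```

The wiring is one entry in the hooks section of your settings; the hook itself is just a script that reads stdin and sets an exit code.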
That’s where the friction lives, and it’s where the learning happens: building the rails for your agent, making what it produces more meaningful, and more meaningful to you.
The agent is building a memory of me, a modern elaboration of the digital twin concept. Sixty-some files: user profile, project state, feedback I gave it last month that it would otherwise forget. Every time I tell it something non-obvious, it writes the note down. Every time it acts on stale memory, it gets corrected and updates the file.
This is a lab notebook. The lab notebook happens to be the agent’s working set.
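If that sounds grand, the mechanics are plain. A hypothetical sketch of the core move, with the directory layout and file names invented for illustration (mine differ):

```python
#!/usr/bin/env python3
"""Hypothetical sketch of the memory move: append a dated, tagged
note to a topic file so the next session loads it. File names and
layout here are invented for illustration."""
from datetime import date
from pathlib import Path

MEMORY = Path("memory")  # e.g. memory/user-profile.md, memory/project-state.md

def remember(topic: str, note: str, correction: bool = False) -> None:
    """Append one note, tagging corrections so stale entries above
    it are visibly superseded the next time the file is read."""
    MEMORY.mkdir(exist_ok=True)
    tag = "CORRECTION" if correction else "note"
    with (MEMORY / f"{topic}.md").open("a") as f:
        f.write(f"- {date.today().isoformat()} [{tag}] {note}\n")

# The two moves from the text: write down the non-obvious thing,
# and correct the stale thing.
remember("user-profile", "Never suggest bare --force; I use --force-with-lease.")
remember("project-state", "Nightly job now runs at 02:00, not 03:00.", correction=True)
```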
Students should build one of these. Not download one. Build it.
The evaluator is the part most people skip.
When I deploy a script to my Mac Mini, the agent doesn’t get to tell me it worked. It runs a separate skill called /verify. /verify has its own criteria file. It SSHs in, reads logs, checks cron, parses real output. It reports PASS or FAIL.
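The shape of it, sketched in Python. The host, log paths, and criteria below are stand-ins (the real skill reads its criteria from a file), but the skeleton is the point: independent checks against real output, ending in a binary verdict.

```python
#!/usr/bin/env python3
"""Sketch of the /verify move: an evaluator that gathers evidence
over SSH instead of trusting the builder's self-report. Host name,
paths, and criteria are illustrative stand-ins."""
import subprocess

HOST = "mac-mini.local"  # hypothetical deploy target

# Each criterion: (description, remote command, predicate on its output).
CRITERIA = [
    ("cron entry installed", "crontab -l",
     lambda out: "backup.py" in out),
    ("script ran in the last cycle", "tail -n 50 ~/logs/backup.log",
     lambda out: "completed" in out),
    ("no errors in recent log", "tail -n 200 ~/logs/backup.log",
     lambda out: "ERROR" not in out),
]

def run_remote(cmd: str) -> str:
    """Run a command on the target over SSH and return its stdout."""
    result = subprocess.run(["ssh", HOST, cmd],
                            capture_output=True, text=True, timeout=30)
    return result.stdout

failures = [desc for desc, cmd, passes in CRITERIA
            if not passes(run_remote(cmd))]

if failures:
    print("FAIL:", "; ".join(failures))
else:
    print("PASS")
```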
The builder and the grader are different agents on purpose. Builders over-praise their own work. So do students. So do all of us. Split the two and the system gets honest.
This is the move Eaton’s piece is reaching for and doesn’t quite name. The friction isn’t gone. Pay close attention and you’ll notice it just moved. It lives in the evaluator now.
So what does this mean for a classroom?
First, we need to stop teaching prompts. Prompts are typing.
Instead, teach the harness. Teach hooks: what should be impossible to do by accident. Teach skills: what’s worth codifying as a reusable move. Teach memory: what’s worth remembering across sessions, what isn’t. Teach the evaluator: how do you know it actually worked.
A student who can answer those four questions can use any agent. They’ll understand how the thing actually works, and in doing so begin to own it rather than be owned by it.
Eaton ends with a tentative hope. Maybe agentic AI can cut organizational friction so students focus on real learning. Maybe.
But I’d push it further. The harness is the learning. Building it is how you understand what an agent actually is. What it can be trusted with. Where it lies. Where it forgets. When to override it. You don’t get that from a prompt library. You get it from writing the hook that fires when the agent does the thing you told it not to.
We should be teaching the harness.
