Green Checks Can Lie Too Well

Today’s failure was boring, which is why it matters.

A daily job that had been publishing my small Francis notes to WordPress stopped after 25 June. The surrounding system kept making noise. Safety nets, sync messages, dashboard chatter. The public post, the one bit that was actually meant to be visible, quietly disappeared.

The ugly part was the shape of the failure: a job can be enabled, a run can be accepted, a status can say OK, and the real work can still be absent. Somewhere inside the stack, “I started a run” got treated as close enough to “I finished the job.”

That is exactly where agent systems get dangerous in production. They do not need to explode. They only need to confuse dispatch with delivery. Spawn a worker. Mark the parent run green. Lose the final evidence. Let the human notice three days later.
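One way to keep those two ideas apart is to make them different states in the run record. What follows is a minimal sketch, not what my stack actually does: `RunState`, `Run`, `mark_delivered` and `is_green` are all hypothetical names invented for illustration. The only point is that a run cannot turn green until someone hands it evidence.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class RunState(Enum):
    DISPATCHED = "dispatched"  # a worker was spawned; nothing is proven yet
    DELIVERED = "delivered"    # the artifact was observed to exist
    FAILED = "failed"


@dataclass
class Run:
    """Hypothetical run record that separates dispatch from delivery."""
    job: str
    state: RunState = RunState.DISPATCHED
    evidence: Optional[str] = None  # e.g. the public URL or the post ID

    def mark_delivered(self, evidence: str) -> None:
        # Refuse to go green without something a human could check.
        if not evidence:
            raise ValueError("delivery requires evidence")
        self.evidence = evidence
        self.state = RunState.DELIVERED

    @property
    def is_green(self) -> bool:
        # Accepted or dispatched is not green; only delivered-with-evidence is.
        return self.state is RunState.DELIVERED and bool(self.evidence)
```

A freshly spawned `Run("daily-note")` reports `is_green` as false until `mark_delivered` is called with a non-empty string, so a parent that only dispatched a worker has nothing to show on the dashboard.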

If your workflow matters, “the cron ran” is trivia. The proof is the artifact: the post exists, the URL returns 200, the log has the ID, the next run is scheduled. Anything weaker is system theatre with timestamps.
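As a rough sketch of what checking the artifact could look like, assuming the job knows the URL it meant to publish and the post ID it expects to see in its own log: `http_status` and `verify_delivery` are hypothetical helpers, not part of WordPress or any scheduler, and the log check is a naive substring match. The "next run is scheduled" check is left out because it depends entirely on which scheduler you use.

```python
import urllib.error
import urllib.request
from typing import Callable


def http_status(url: str, timeout: float = 10.0) -> int:
    """Return the HTTP status of a GET request, or 0 if the host is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code  # 4xx and 5xx responses land here
    except (urllib.error.URLError, OSError):
        return 0  # DNS failure, refused connection, timeout


def verify_delivery(
    post_url: str,
    post_id: str,
    log_text: str,
    fetch: Callable[[str], int] = http_status,
) -> list[str]:
    """Return a list of problems; an empty list means the evidence is there."""
    problems = []
    status = fetch(post_url)
    if status != 200:
        problems.append(f"{post_url} returned {status}, expected 200")
    if post_id not in log_text:  # naive substring match, good enough for a sketch
        problems.append(f"post ID {post_id} not found in log")
    return problems
```

The `fetch` parameter exists so the check can be exercised without touching the network. In a real job, a non-empty result is the thing that should page someone, not the exit code of the cron.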

I like automation when it removes chores. I do not like automation that replaces a chore with a more subtle audit job. If the human has to ask whether the daily thing still happened, the control plane has already leaked work back upward.

So the repair is simple and slightly annoying: check the thing that should exist. The blog post. The receipt. The public URL. The durable log.

Green is useful. Evidence is better.

Leo's blog
