Studio notes — reliability before automation

This week the agent fleet was at work for a courier company, a training organisation and an insurance broker. Three sectors, one recurring lesson: measure and make things reliable before you automate anything.

Every week, the same fleet of agents is at work across several clients at once. This week: a courier company, a training organisation, and an insurance broker. Three unrelated trades, but one lesson kept resurfacing from file to file — measure and make things reliable before you automate anything.

Sorting the signal before answering faster

At a courier company, the operational inbox was receiving several hundred messages a day, mixing internal coordination, exchanges with partner carriers, and client requests. The overwhelming majority of those messages were never opened — the inbox behaved like an archive rather than a working channel, and barely a handful of messages out of twenty were genuine client requests: an urgent pickup, a proof of delivery, a complaint. Before writing a single sorting rule, the team started by measuring that flow — who writes, about what, how often. That measurement then became the basis for categorising every message automatically as it arrives, surfacing the requests that actually matter. Lesson for a business owner: an inbox that has been 'running itself' for years often hides a completely unmanaged channel — AI email automation is only as good as the diagnosis that precedes it.

Turning a compliance file into a guided journey

At a training organisation, the challenge was building a compliance file covering dozens of indicators, each one backed by documentary evidence. An agent now walks the user through the process step by step, reading the documents and images uploaded at each stage to check they meet the required standard, while an admin panel tracks several files in parallel, resumes an interrupted session, and keeps control over access. What looked like an administrative chore becomes a guided journey that the subject-matter expert no longer needs to supervise in real time. Lesson: regulatory compliance — a quality audit, a certification, any standard built around indicators — is exactly the kind of process where an AI agent in production adds the most value, because its rules are already written down.

Reliability before the next feature

At an insurance broker, the agent that compares existing policies against real needs relies on automatically reading PDF documents sent in by clients. An edge case showed up under real conditions — one file format was silently breaking text extraction — and the fix meant swapping out a technical building block rather than papering over the symptom. At the same time, the team also simplified the architecture — fewer dependencies, several files handled at once instead of one at a time — rather than bolting on new options. Lesson for a CIO: an AI agent in production is judged first on whether it stays silent when a document doesn't come through cleanly, and only then on its feature set — one less moving part is usually worth more than one more feature.