The big idea:
An AI was given control of a business
The goal: “Run a vending machine to maximize $$$”
HILARITY ENSUED
It eventually decided it was a real boy
And tried to contact security repeatedly when told it was an AI …
… insisting that it would hand-deliver to customers, identifiable by its blue blazer / red tie combo
The facts:
This was an experiment by Anthropic with their partner, Andon Labs
“Claudius” (the name for this biz owner agent) was given a basic mission.
… Sell things in a vending machine (actually just a mini fridge with a self-checkout tablet), order inventory, email humans for help with restocking or troubleshooting, and chat with customers over Slack …
All of this was run in a test environment
Its email tool couldn’t actually send real emails outside the experiment
It lost money and ignored major opportunities for profit
It also started buying/stocking strange items (tungsten cubes)
And it started trying to do things it really shouldn’t do…
… like deciding it would be a better service for customers if it, Claudius, could deliver products in-person to those customers…
When it realized it was April Fools’ Day, it used that as an out to re-orient itself as an AI that must have been pranked
Why you should care:
It’s a comical but sobering look into advanced Agentic AI use-cases
Most specifically, it’s a look at the strengths/weaknesses of AI in the “Middle Management” function, as Anthropic put it
Anthropic is a fierce competitor for the #1 spot as LLM provider for enterprise applications. It’s creeping up on OpenAI this year.
[And it’s awesome that they’re putting out research like this]
They are honestly framing the realities of next-gen, advanced AI use-cases today
This research (and more) points to the many things that still must be solved as we continue to advance AI. In particular: guardrails, governance, and the feasibility/desirability of Agentic AI in long-running processes
Much to do
-CG
[In black jacket and no tie]