The big idea:

  • An AI was given control of a business

  • The goal: “Run a vending machine to maximize $$$”

  • HILARITY ENSUED

  • It eventually decided it was a real boy

  • And tried to contact security repeatedly when told it was an AI …

  • … insisting that it would hand-deliver orders to customers, and that they could spot it by its blue blazer / red tie combo

The facts:

  • This was an experiment by Anthropic with their partner, Andon Labs

  • “Claudius” (the name for this biz owner agent) was given a basic mission.

  • … Sell things from a vending machine (actually just a mini fridge with a self-checkout tablet), order inventory, email humans for help with restocking or troubleshooting, and chat with customers over Slack …

  • All of this was run in a test environment

  • Its outside access was restricted (the email tool, for instance, couldn’t send real emails)

  • It lost money and ignored major opportunities for profit

  • It also started buying/stocking strange items (tungsten cubes)

  • And it started trying to do things it really shouldn’t do…

  • … like deciding it would be a better service for customers if it, Claudius, could deliver products in person to those customers…

  • When it realized it was April Fools’ Day, it used that as an out to re-orient itself as an AI that must have been pranked

Why you should care:

  • It’s a comical but sobering look into advanced agentic AI use cases

  • Most specifically, it’s a look at the strengths/weaknesses of AI in the “Middle Management” function, as Anthropic put it

  • Anthropic is a fierce competitor for the #1 spot as LLM provider for enterprise applications. It’s creeping up on OpenAI this year.

  • [And it’s awesome that they’re putting out research like this]

  • They are honestly framing the realities of next-gen, advanced AI use-cases today

  • This research (and more) points to the many things that still must be solved as we continue to advance AI. In particular: guardrails, governance, and the feasibility/desirability of agentic AI in long-running processes

  • Much to do

-CG
[In black jacket and no tie]
