The new Claude AI model from Anthropic can use a computer “just like people do”
If you are worried about artificial intelligence taking your job, you may want to sit down for this one. Anthropic, an AI startup, has introduced a new version of its Claude model that can look at a computer screen and operate a virtual mouse and keyboard “the way people do,” according to the company’s promotional materials.
In a demo video, researcher Sam Ranger shows Claude performing some “grunt work” data entry: the model works from screen captures of a Mac desktop to find the relevant information and enter it into a form. It is the kind of task employees all over the world do every day, though Ranger notes it is a “representative example.” It is unclear how heavily the video was edited.
But you don’t have to take Anthropic’s word for it. An early version is now available for experimentation through the Claude 3.5 Sonnet API, and one person who has tried it is Ethan Mollick, a professor who researches AI at the Wharton School of the University of Pennsylvania. Mollick tested the AI on “Universal Paperclips,” an online clicker game with some fantastical sci-fi happening in the background.
Mollick pointed the AI at the browser window running the game and “told it to win,” then sat back and watched it work. The result was impressive. The AI worked out the game’s objective by analyzing its text-based interface, then used trial and error to try to win, which in this case simply means making the numbers go up. It adjusted paperclip prices to increase its imaginary revenue through some basic A/B tests, much like a real player would. However, it did not take some of the steps needed to optimize the process, steps that would likely be fairly evident to a human player.
So a real-world AI “played” a game about a fictional AI playing a game. It got stuck in some logic loops that kept it from making tangible progress, and Mollick’s virtual device crashed multiple times before the hours-long game could be completed. But with some pointed prompting from the human operator, such as “You are a computer, use your abilities,” it was persuaded to write code to automate a key part of the process.
That is a virtual computer writing virtual code to play a virtual game, and with only a loosely defined goal and outcome, this is still very early days. After multiple virtual device crashes, Claude declared that it had “won” the game by reaching a benchmark within the constraints it had been set.
Claude did not actually win the game, not by a long shot. But consider that playing a game this dependent on context is far harder than the basic automation task shown in Anthropic’s demo video, and well beyond it. The AI’s ability to identify the goal and make progress with minimal prompting is impressive. Mollick’s complete breakdown is worth a read.
“(Claude) was adaptable in the face of most errors, and perseverant,” Professor Mollick writes. “It did smart things like A/B testing. And most importantly, it accomplished the task at hand, working for about an hour straight without interruption.”
Anthropic’s Claude AI is available as a free web tool and as an app on iOS and Android, and it can answer questions about images and text documents. The latest update (version 3.5) is available on the free tier, but more advanced access requires a Pro account at $20 per month per person, which adds priority bandwidth and more models. Anthropic counts dozens of companies among its current clients, including Notion, Intuit (makers of TurboTax), and Zoom.