News Warner

Google’s latest AI model uses a web browser like you do

  • Google has released Gemini 2.5 Computer Use, an AI model that uses a web browser to navigate and interact with interfaces designed for humans.
  • The model uses “visual understanding and reasoning capabilities” to analyze user requests and carry out tasks, such as filling out forms or playing games.
  • Gemini 2.5 Computer Use is available to developers through Google AI Studio and Vertex AI, and can be demoed on Browserbase.
  • Google says the model outperforms leading alternatives on multiple web and mobile benchmarks, though it currently supports only 13 actions, such as opening a browser and typing text.
  • Google’s announcement comes a day after OpenAI revealed new apps for ChatGPT; Anthropic released a version of its Claude AI model with computer use capabilities last year.

Google is previewing a new Gemini AI model designed to navigate and interact with the web via a browser, letting AI agents do things inside interfaces designed for use by people and not robots. The model, called Gemini 2.5 Computer Use, uses “visual understanding and reasoning capabilities” to analyze a user’s request and carry out a task, such as filling out and submitting a form.

It can be used for UI testing or for navigating interfaces that were made for people and don’t have an API or other direct connection available. Other versions of this model have been used for agentic features in AI Mode and Project Mariner, a research prototype that uses AI agents to carry out tasks on their own in a browser, like adding items to your cart based on a list of ingredients.

Google’s announcement comes just one day after OpenAI revealed new apps for ChatGPT as part of its annual Dev Day. OpenAI also continues to focus its attention on its ChatGPT Agent feature, which can complete complex tasks on your behalf. Meanwhile, Anthropic released a version of its Claude AI model with “computer use” last year.

Google posted some demo videos showing its computer use tool in action, noting that they are sped up 3x.

Google says its computer use model “outperforms leading alternatives on multiple web and mobile benchmarks.” Unlike ChatGPT Agent and Anthropic’s computer use tool, Google’s new AI model only has access to a browser, not an entire computer environment. Google notes that the model “is not yet optimized for desktop OS-level control” and currently supports 13 actions, including opening a web browser, typing text, and dragging and dropping elements.

Gemini 2.5 Computer Use is available to developers through Google AI Studio and Vertex AI, and there’s also a demo on Browserbase, where you can watch as it completes tasks like “Play a game of 2048” or “Browse Hacker News for trending debates.”

Q. What is Google’s latest AI model designed for?
A. Google’s latest AI model, called Gemini 2.5 Computer Use, is designed to navigate and interact with the web via a browser.

Q. How does Gemini 2.5 Computer Use work?
A. Gemini 2.5 Computer Use uses “visual understanding and reasoning capabilities” to analyze a user’s request and carry out a task, such as filling out and submitting a form.

Q. What is the purpose of Gemini 2.5 Computer Use?
A. The model can be used for UI testing or for navigating interfaces that were made for people and don’t have an API or other direct connection available.

Q. How does Gemini 2.5 Computer Use compare to other AI models like ChatGPT Agent and Anthropic’s computer use tool?
A. Google says its new AI model outperforms leading alternatives on multiple web and mobile benchmarks, but unlike those tools it only has access to a browser, not an entire computer environment.

Q. What actions can Gemini 2.5 Computer Use currently support?
A. The model currently supports 13 actions, including opening a web browser, typing text, and dragging and dropping elements.

Q. Where can developers access Gemini 2.5 Computer Use?
A. Gemini 2.5 Computer Use is available to developers through Google AI Studio and Vertex AI.

Q. Is Gemini 2.5 Computer Use optimized for desktop OS-level control?
A. No. Google notes that the model “is not yet optimized for desktop OS-level control.”

Q. How does the demo of Gemini 2.5 Computer Use work?
A. The demo on Browserbase allows users to watch as the model completes tasks, such as playing a game of 2048 or browsing Hacker News.

Q. What is the significance of Google’s announcement about Gemini 2.5 Computer Use?
A. The announcement comes just one day after OpenAI revealed new apps for ChatGPT, as OpenAI continues to focus on its ChatGPT Agent feature, which can complete complex tasks on your behalf. Anthropic released a version of Claude with “computer use” last year.

Q. How does Gemini 2.5 Computer Use differ from other AI models in terms of its capabilities?
A. Unlike ChatGPT Agent and Anthropic’s computer use tool, Gemini 2.5 Computer Use only has access to a browser, not an entire computer environment.