Code, click, research: ChatGPT's triple agents transform AI assistance

OpenAI's trio of specialised agents, Codex, Operator, and Deep Research, brings expert-level capabilities to different domains
ChatGPT users now have access to three powerful AI agents that dramatically expand what the platform can do. (Image: EdexLive Desk)

ChatGPT users now have access to three powerful Artificial Intelligence (AI) agents that dramatically expand what the platform can do: a coding assistant that thinks like a developer, a digital helper that can navigate websites for you, and a research tool that delivers analyst-quality reports.

According to Live Mint, OpenAI launched its third AI agent of the year on Friday, May 16, with the release of Codex, a specialised tool designed to function like a software engineer. While the earlier launches, Deep Research and Operator, target broader audiences, Codex specifically empowers both experienced developers and coding beginners with sophisticated programming capabilities.

Codex

Codex operates as a software engineer within ChatGPT, capable of handling multiple programming tasks simultaneously. The specialised agent can write new features, debug existing code, and answer questions about a user's codebase, with each task running in its own sandboxed environment for privacy and security.

What sets Codex apart is its foundation on OpenAI's advanced o3 reasoning model, specifically optimised for software engineering. The model underwent reinforcement learning on real-world coding scenarios across various environments, enabling it to generate code that closely mirrors human programming styles and preferences, follow instructions with precision, and iteratively test solutions until achieving successful results.
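To make the idea of iteratively testing solutions concrete, here is a minimal Python sketch of a generate-and-test loop. It is an illustration only, not OpenAI's implementation: the function names, the list of candidate programs standing in for model output, and the toy test harness are all assumptions made for this example.

```python
from typing import Callable, Iterable, Optional


def iterate_until_passing(
    candidates: Iterable[str],
    passes_tests: Callable[[str], bool],
    max_attempts: int = 5,
) -> Optional[str]:
    """Return the first candidate that passes the tests, or None if none does."""
    for attempt, source in enumerate(candidates, start=1):
        if attempt > max_attempts:
            break  # give up after a fixed number of tries
        if passes_tests(source):
            return source
    return None


def passes_tests(source: str) -> bool:
    """Hypothetical test harness: run the candidate and check add(2, 3) == 5."""
    namespace: dict = {}
    try:
        # Acceptable for a toy example; a real agent would run this in a sandbox.
        exec(source, namespace)
        return namespace["add"](2, 3) == 5
    except Exception:
        return False


# Stand-in for model output (assumed): two wrong attempts, then a correct one.
CANDIDATES = [
    "def add(a, b): return a - b",
    "def add(a, b): return a * b",
    "def add(a, b): return a + b",
]

result = iterate_until_passing(CANDIDATES, passes_tests)
```

In this sketch the first two attempts fail the check and the third is returned. Capping the number of attempts keeps the loop from running indefinitely when no candidate passes.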

Operator

Operator leverages OpenAI's Computer-Using Agent (CUA) model, combining GPT-4o's visual capabilities with enhanced reasoning abilities from more advanced models. This agent excels at breaking complex tasks into manageable steps and self-correcting when it encounters obstacles.

A standout feature of Operator is its ability to interact naturally with graphical user interfaces, clicking buttons, navigating menus, and entering text in fields. Working within a dedicated browser, Operator can execute tasks independently while users focus elsewhere. The agent accepts both text and image inputs for versatile task management.

Unlike conventional AI assistants, Operator analyses screen pixels directly and interacts using virtual keyboard and mouse inputs within a controlled sandbox environment.
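The look-then-act cycle described above can be sketched as a simple loop in Python. This is a toy under stated assumptions, not the CUA model or any OpenAI API: `FakeBrowser`, `choose_action`, and `run_agent` are hypothetical names, and the "screenshot" is a small dictionary rather than real pixels.

```python
from dataclasses import dataclass, field


@dataclass
class FakeBrowser:
    """Toy stand-in for a sandboxed browser: one text field, one submit button."""
    text: str = ""
    submitted: bool = False
    log: list = field(default_factory=list)

    def screenshot(self) -> dict:
        # A real agent would receive raw pixels; this toy returns a summary.
        return {"text": self.text, "submitted": self.submitted}

    def type_text(self, value: str) -> None:
        self.text += value
        self.log.append(("type", value))

    def click(self, target: str) -> None:
        # The form only submits once something has been typed.
        if target == "submit" and self.text:
            self.submitted = True
        self.log.append(("click", target))


def choose_action(screen: dict, goal: str):
    """Hypothetical policy: pick the next step from what is on screen."""
    if screen["submitted"]:
        return None  # goal reached, nothing left to do
    if screen["text"] != goal:
        return ("type", goal)
    return ("click", "submit")


def run_agent(browser: FakeBrowser, goal: str, max_steps: int = 10) -> int:
    """Observe, decide, act, and repeat until done or out of steps."""
    steps = 0
    while steps < max_steps:
        action = choose_action(browser.screenshot(), goal)
        if action is None:
            break
        kind, arg = action
        if kind == "type":
            browser.type_text(arg)
        else:
            browser.click(arg)
        steps += 1
    return steps


browser = FakeBrowser()
steps_taken = run_agent(browser, "flights to Goa")
```

Each pass through the loop re-reads the screen before choosing the next input, which is what lets an agent of this kind notice when a step did not work and try again. The step limit is a simple safeguard against looping forever.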

Deep Research

Deep Research harnesses OpenAI's latest o3 reasoning model, optimised specifically for web browsing and data analysis. This agent searches, interprets, and analyses extensive collections of text, images, and PDFs from across the internet to produce comprehensive reports comparable to those created by professional research analysts.

Unlike standard ChatGPT searches, Deep Research queries require between 5 and 30 minutes to complete, with users receiving notifications when their research is ready. This additional processing time enables the agent to synthesise what would normally take hundreds of hours of human research into concise, insightful reports.

EdexLive
www.edexlive.com