AI browsers. Supposedly, they’re the future of the web. The pitch goes like this: websites will stop being designed for humans and start being designed for AI agents instead, with clean structures, predictable buttons, and machine-readable intent. You won’t browse anymore. You’ll instruct. The agent will scan, click, compare, and decide on your behalf. That future may arrive eventually. Right now, we are nowhere near it.
Two of the most prominent attempts at this agentic browser idea are Perplexity’s Comet and OpenAI’s Atlas. Both promise to act on your behalf inside the browser, so I tested them side by side, using the same prompts, on the same sites, with the same constraints. Which one works best? The short answer: one of them consistently performs better, faster, and with fewer self-inflicted wounds.
Rare drop, questionable fit
The task was straightforward but realistic. I uploaded a photo of a pair of sunglasses and asked the agent to find similar-looking products on a shopping site. The constraints were explicit: reputable brand, at least a 4‑star rating from 500 or more reviews, under $100, and three concrete options with links and prices.
Atlas went first and finished the task in 1:58. It analyzed the image, generated a textual description of the sunglasses, and used that description to search Amazon. It returned three product links, as requested. On the surface, this looked like success. On closer inspection, it fell apart: two of the three links were broken, and searching the product names manually revealed that none of the products were particularly close to the original glasses in the photo. Atlas completed the workflow, but the judgment layer was weak.
Comet followed, and it did not finish; I cut it off at 5:00. Where Atlas analyzed the image and acted on that understanding, Comet lost the plot entirely. It got stuck in Amazon’s infinite scroll, repeatedly loading more items without converging on anything useful, until I asked it to stop at the five-minute mark. Instead of individual listings, it produced links to Amazon search result pages, and manually inspecting those results showed that none of the products were even remotely similar to the reference image.
| Agent | Time | Task finished | Quality |
|---|---|---|---|
| Atlas | 1:58 | Yes | Poor |
| Comet | 5:00 (cut off) | No | Awful |
Spreadsheet analysis and charting
Numbers demand respect
Next, a productivity task. Using an already-open spreadsheet, the agent had to summarize the data, choose the correct chart type to show the relationship between weight, horsepower, and 0–100 acceleration, and then actually build that chart.
Atlas completed the task in 2:13. It chose a bubble chart, correctly mapped the X and Y axes, and set the bubble size appropriately. Interestingly, Atlas didn’t stop once the chart appeared. It noticed that the chart was obstructing existing data and attempted to move it. That attempt went poorly. It spent nearly a minute trying to reposition the chart, got confused, and eventually gave up — but not before deciding to move the chart to a new sheet instead. That decision was oddly human. The final result was clean, readable, and accurate.
Comet completed the task in 4:05. It also selected a bubble chart, which was the correct choice. However, it left all data labels enabled, rendering the chart nearly unreadable. Although it claimed to set bubble size based on weight, the size field was effectively blank — all bubbles appeared identical. Much of the delay came from Comet struggling to select the correct columns in the spreadsheet. It technically finished the task, but the output required manual cleanup to be usable.
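For anyone who wants to recreate the chart the agents were aiming for, here is a minimal sketch in Python with matplotlib. The column names and numbers are placeholders rather than my actual spreadsheet, and the mapping (horsepower on X, 0–100 time on Y, weight as bubble size) is one reasonable way to encode the three variables:

```python
# Minimal bubble-chart sketch. Column names and values are illustrative
# placeholders, not the data from my spreadsheet.
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({
    "model":         ["Car A", "Car B", "Car C", "Car D"],
    "weight_kg":     [1450, 1620, 1980, 1210],
    "horsepower":    [190, 250, 430, 140],
    "accel_0_100_s": [7.8, 6.1, 4.2, 9.5],
})

fig, ax = plt.subplots()
# X = horsepower, Y = 0-100 time, bubble area scaled from weight.
# No per-point data labels, which is the clutter Comet left switched on.
ax.scatter(df["horsepower"], df["accel_0_100_s"], s=df["weight_kg"] / 5, alpha=0.5)
ax.set_xlabel("Horsepower")
ax.set_ylabel("0-100 km/h time (s)")
ax.set_title("Weight vs. horsepower vs. acceleration")
plt.show()
```

The size channel is the whole point of a bubble chart; leaving it blank, as Comet did, collapses it back into an ordinary scatter plot.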
| Agent | Time | Task finished | Quality |
|---|---|---|---|
| Atlas | 2:13 | Yes | High |
| Comet | 4:05 | Yes | Poor |
Following up with a meeting
Calendar magic, mostly
This task tested multi-step coordination across services. The agent needed to create a Google Calendar event for the following day at 9:00 AM, name it appropriately, attach the spreadsheet being discussed, write a brief description explaining the sheet, and then confirm the event was created.
Atlas finished in 2:48. It opened Google Calendar in a new tab, created an event at the correct time, and named it “Cars & Specs Discussion.” It opened the attachment panel, located the spreadsheet in my Google Drive, attached it, wrote a concise description, and saved the event. Everything worked as expected, end to end.
Comet finished in 2:53. It followed a similar flow, naming the event “Car Engines and Specs Review.” It wrote a description using bullet points, which wasn’t requested but wasn’t harmful either. However, instead of attaching the file directly, it pasted a link to the spreadsheet into the description. The task was technically complete, but slightly less polished.
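For the curious, the gap between attaching the file and pasting a link is easy to see if you express the same event through the Google Calendar API. This is only an illustration, since both agents drive the web UI rather than the API, and the `service` client, date, time zone, and file ID below are placeholders:

```python
# Sketch only. Assumes `service` is an authenticated Calendar API client
# (googleapiclient.discovery.build("calendar", "v3", credentials=creds)),
# and the date, time zone, and Drive file ID are placeholders.

event = {
    "summary": "Cars & Specs Discussion",
    "description": "Quick review of the car spec sheet: weight, horsepower, 0-100 times.",
    "start": {"dateTime": "2025-06-02T09:00:00", "timeZone": "Europe/London"},
    "end":   {"dateTime": "2025-06-02T09:30:00", "timeZone": "Europe/London"},
    # A real Drive attachment, roughly what Atlas produced through the UI.
    "attachments": [{
        "fileUrl": "https://drive.google.com/open?id=PLACEHOLDER_FILE_ID",
        "title": "Cars & Specs",
    }],
}

# supportsAttachments=True is required, or the attachments field is ignored.
service.events().insert(
    calendarId="primary",
    body=event,
    supportsAttachments=True,
).execute()

# Comet's version is the equivalent of dropping `attachments` and pasting
# the Drive URL into `description` instead.
```

The difference is mostly polish, which is exactly how it came across in the result: an attachment shows up as a file on the event, while a pasted link is just text in the description.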
| Agent | Time | Task finished | Quality |
|---|---|---|---|
| Atlas | 2:48 | Yes | Great |
| Comet | 2:53 | Yes | Good |
Email summary
Triage under pressure
There was no way I could censor my inbox without ruining the screenshot — please take my word for the observation below.
The goal here was triage. The agent needed to look at my inbox, summarize the last 15 unread emails that were not automated notifications, group them by topic or sender, flag anything urgent, and list suggested actions.
Atlas completed the task in 2:18. It took a brute-force approach, clicking through unread emails one by one. The frustrating part was that it would identify an email as automated, then proceed to open another email from the same sender with the same subject — seemingly unaware that it was guaranteed to be automated as well. That inefficiency aside, the final summary was concise and readable, and it actually felt like a summary rather than a dump of information.
Comet completed the task in 4:43. It started by selecting all emails, then spent a full minute figuring out how to unselect them. It searched for `is:unread`, became overwhelmed by the number of unread Asana emails, tried to filter those out, then realized Asana wasn’t the only automated sender. Eventually, it abandoned filtering altogether and began manually opening emails — marking them as read in the process. It stopped early, then produced a summary dominated by the very automated emails I had explicitly asked it to exclude.
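The filter Comet was fumbling toward can be written as a single Gmail search query. Here is a rough sketch against the Gmail API, assuming an already-authenticated `service` client; the excluded senders and categories are examples, not my actual inbox rules:

```python
# Sketch only. Assumes `service` is an authenticated Gmail API client
# (googleapiclient.discovery.build("gmail", "v1", credentials=creds)).
# The excluded senders/categories are illustrative examples.

query = "is:unread -from:asana.com -category:promotions -category:updates"

resp = service.users().messages().list(
    userId="me",
    q=query,
    maxResults=15,  # the prompt asked for the last 15 qualifying emails
).execute()

for msg in resp.get("messages", []):
    meta = service.users().messages().get(
        userId="me",
        id=msg["id"],
        format="metadata",
        metadataHeaders=["From", "Subject"],
    ).execute()
    headers = {h["name"]: h["value"] for h in meta["payload"]["headers"]}
    print(f"{headers.get('From')}: {headers.get('Subject')}")
```

As a bonus, listing messages this way is read-only, so nothing gets marked as read along the way, unlike Comet clicking through my inbox.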
| Agent | Time | Task finished | Quality |
|---|---|---|---|
| Atlas | 2:18 | Yes | Good |
| Comet | 4:43 | Yes | Poor |
Adding a book to Goodreads
A clean win
This was the simplest task: add a specific book to my “Want to Read” list on Goodreads.
Atlas finished in 0:54. It initially searched using the full title and author name, which surfaced study guides instead of the book itself. It then tried the exact title with no success, before finally searching just “Neuromancer,” finding the correct listing, and adding it to the list. Slightly circuitous, but effective.
Comet finished in 0:59. It opened Goodreads, searched for “Neuromancer,” selected the first result, and added it. Fewer missteps than Atlas, yet it still took marginally longer. The pattern was familiar by now: Comet’s actions themselves are slow, regardless of whether its reasoning is sound.
| Agent | Time | Task finished | Quality |
|---|---|---|---|
| Atlas | 0:54 | Yes | Excellent |
| Comet | 0:59 | Yes | Excellent |
Perplexity did it first, but OpenAI does it better
Early access, late execution
Comet was one of the first agentic AI browsers. I dislike that term — it feels like corporate filler masquerading as innovation — but it’s the category we’re stuck with. In their current state, these agents are heavily constrained. They can perform tasks, but in most cases, you would be faster and less frustrated doing the work yourself.
Even when an agent technically saves time, the overhead of writing a precise prompt often cancels that out. I could have built the bubble chart manually in under ten seconds. There wasn’t a single task here that either Atlas or Comet completed in under 50 seconds. Adding a book to Goodreads came close, but I could have done that manually in five seconds.
They’re interesting as experiments, and compelling as ideas, but they’re not ready. Between the two, OpenAI’s Atlas is the clear winner — not because it’s perfect, but because it fails less catastrophically, recovers more gracefully, and produces results that more often resemble what you actually asked for. That’s not the future of browsing yet. But it’s closer.