I tested Claude, ChatGPT, and Gemini on the most human writing task I could think of — the gap was embarrassing

Dealing with customer service agents when you want to cancel a phone plan is a universally exhausting experience. You just want to end the contract and move on, but you usually have to sit through a scripted retention routine designed to change your mind. I wanted to see if modern generative AI could handle this awful task for me, so I tested Claude, ChatGPT, and Gemini to see if they could write a firm, polite cancellation email. The goal was to produce something that sounded like a tired human, instead of a message full of robotic corporate clichés.

Setting up a cancellation test

I wanted to see if AI could handle a painful customer service conversation

To see how these AI giants handled the nuances of human interaction, I needed a scenario that everyone understands and everyone hates, and canceling a phone plan fit. I tested Claude, ChatGPT, and Gemini to see which could write a message that reads like it came from a frustrated, tired person.

We all know the exhausting dance with customer retention agents, and getting out of a contract requires you to be firm while maintaining baseline courtesy. If an AI writes this email with symmetrical bullet points, cheerful exclamation marks, or corporate jargon, the illusion breaks instantly. I focused on tone and formatting, since that is where the difference between a natural message and a robotic imitation becomes obvious.

I had to consider the known communication styles and tendencies of each platform before starting. ChatGPT runs on the GPT-4o model and is known for a conversational style that tries to sound like a human. However, this tendency can backfire when the message needs to be firm.

Claude is great at handling nuance and humor, so it should be good at writing natural content. It can also be pretty direct, deliberately cutting back on validation-heavy phrasing. This direct style might make Claude the perfect candidate for drafting a firm request that leaves zero room for corporate pushback.

Gemini rounded out the group, acting as a benchmark to see how its output compares to the others. I used a prompt instructing each model to write an email to a persistent customer service agent, asking them to terminate a mobile contract without leaving room for counteroffers.

I told them not to use standard AI crutches like numbered lists, bolded headers, or sterile corporate greetings. The true test of generative AI isn’t just stringing together correct sentences, but emulating the tired sigh of a human who wants to cancel a plan and move on.

AI will always be robotic

Models are so different, yet always the same

ChatGPT message about phone — Jorge Aguilar / MakeUseOf

A human writing a cancellation email shows frustration in one cohesive block of direct text, and it flows well enough that you can follow the thought process. AI, by contrast, defaults to rigid structures that favor readability over real emotion. Claude models are known for producing structured, step-by-step explanations.

This tendency is useful for technical problem-solving. However, applying this methodical formatting to a simple customer service message makes the output feel stiff and less personal.

Claude operates under a Constitutional AI framework, which makes it generate cautious, harmless responses.

When writing a firm cancellation notice, ChatGPT’s tendency to add polite corporate clichés or conciliatory fluff undermines the resolute tone you need for the scenario. Even with recent shifts meant to make the model less agreeable, the underlying conversational patterns feel artificial in a mundane context.

Comparing the results shows why some models mimic a real tone better than others. Gemini models frequently default to the same AI crutches I asked them to avoid, focusing on sterile, polite, and heavily formatted text instead of authentic human emotion. All of this makes the final text look mechanical.

An actual human canceling a phone plan doesn’t present arguments in a balanced, three-point list with a polite concluding remark. Real people write in a messy, direct way, and it turns out that sounding human means dropping the formal clichés.

If you have to use one, don’t use this one

Gemini failed by becoming a robot again

Gemini being asked to form an email — Jorge Aguilar / MakeUseOf

Gemini was the worst-performing model in the group. I saw this coming, but still felt bad for it. Claude and ChatGPT showed their own quirks and biases, like Claude’s over-reliance on structured logic or ChatGPT’s enthusiastic tone, but they at least got close to the baseline conversational subtlety needed for a customer service interaction.

Gemini completely missed the mark; it instantly retreated into a very sterile, sanitized corporate template that didn’t have real emotion. Instead of creating the tired sigh of a frustrated consumer, Gemini produced a robotic script full of the exact AI crutches I wanted to avoid.

It focused on symmetrical bullet points and polite corporate clichés instead of direct human intent. Any retention rep would know it was written by a bot immediately because it felt so fake.

I defend Gemini a lot because you can shape and mold it well, but if you’re getting it as is, be prepared for an AI that sounds like a robot. A good AI should work well out of the box, not force you to write a massive prompt for such a simple task.

If an ordinary person has to spend twenty minutes planning a detailed prompt strategy just to convince the AI to sound like a normal person writing a three-sentence email, the tool didn’t meet its main goal of saving time.

Since it demands a lot of instructional support to sound human, Gemini isn’t the one you should use for daily tasks.

The robots just aren’t human enough

Delegating your annoying emails to an AI sounds great, but you have to accept that these models still have an inherent robotic bias. They love to fall back on symmetrical bullet points and polite formatting because they are programmed for readability, not genuine human imperfection. If you’re willing to spend time adjusting your prompts, you can still get a usable draft that saves you from writing the email yourself.
