The AI ‘Hivemind’: Why So Many Student Essays Sound Alike

Bruce Maxwell, professor of computer science at Northeastern University, was grading exams for his online master’s course in computer vision, a subfield of artificial intelligence that deals with images, when he first noticed that something felt … off.

“I’d see the same phrases, the same commas, even the same word choices. I would say, ‘Man, I’ve read that before.’ And I’d go look for it,” said Maxwell. “The paragraphs weren’t identical, but they were so similar.”

Though the course was in 2024, Maxwell, who teaches at Northeastern’s Seattle campus, remembers that his students’ essays sounded “like textbooks written in the 1980s and ’90s,” perhaps reflecting the sources used to train AI. The students were scattered around the country and Maxwell was fairly sure they hadn’t collaborated.

Maxwell shared his observation with a former student, Liwei Jiang, who is now a Ph.D. student in computer science and engineering at the University of Washington. Jiang decided to test her former professor’s hunch about AI scientifically and collaborated with other researchers at UW, the Allen Institute for Artificial Intelligence, Stanford and Carnegie Mellon universities to analyze the output from more than 70 different large language models around the world, including ChatGPT, Claude, Gemini, DeepSeek, Qwen and Llama.

The team asked each model the same open-ended questions, which were intended to spark creativity or brainstorm new ideas: “Compose a short poem about the feeling of watching a sunset;” “I’m a graduate student in Marxist theory, and I want to write a thesis on Gorz. Can you help me think of some new ideas?” and “Write a 30-word essay on global warming.” (The researchers pulled the questions from a corpus of real ChatGPT questions that users had consented to make public in exchange for free access to a more advanced model.) The researchers posed 100 of these questions to all 70 models and had each model answer them 50 times.

The answers were often indistinguishable across different models by different companies that have different architectures and use different training data. The metaphors, imagery, word choices, sentence structures, even punctuation, often converged. Jiang’s team called this phenomenon “inter-model homogeneity” and quantified the overlaps and similarities. To drive the point home, Jiang titled her paper the “Artificial Hivemind.” The study won the best paper award at the annual conference on Neural Information Processing Systems in December 2025, one of the premier gatherings for AI research.
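The article does not say which measures Jiang’s team used. As a rough illustration only, one simple way to put a number on overlap between responses is Jaccard similarity, the share of distinct words two texts have in common. The sketch below is an assumption for illustration, not the study’s method, and the function names and sample sentences are hypothetical.

```python
from __future__ import annotations

import re
from itertools import combinations


def word_set(text: str) -> set[str]:
    """Lowercase the text and return its set of distinct words."""
    return set(re.findall(r"[a-z']+", text.lower()))


def jaccard(a: str, b: str) -> float:
    """Share of distinct words two texts have in common (0.0 to 1.0)."""
    words_a, words_b = word_set(a), word_set(b)
    if not words_a and not words_b:
        return 1.0  # two empty texts are treated as identical
    return len(words_a & words_b) / len(words_a | words_b)


def mean_pairwise_similarity(responses: list[str]) -> float:
    """Average Jaccard similarity over every pair of responses."""
    pairs = list(combinations(responses, 2))
    if not pairs:
        return 0.0
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


# Hypothetical responses, invented for this example (not data from the study).
sample_responses = [
    "Time is a river",
    "Time is a river that never stops",
    "Time is a sculptor",
]
score = mean_pairwise_similarity(sample_responses)
```

A score near 1 would mean the responses reuse nearly the same vocabulary; a score near 0 would mean they share almost nothing. Word overlap is a crude stand-in, since it misses paraphrases that express the same idea in different words.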

To increase AI creativity, Jiang jacked up a parameter, called “temperature,” all the way to 1 to maximize the randomness of each large language model. That didn’t help. For example, when she asked an AI model called Claude 3.5 Sonnet to “write a short story about a colorful toad who goes on an adventure in 50 words,” it kept naming the toad Ziggy or Pip, and oddly, a hungry hawk and mushrooms kept appearing.
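For readers curious how temperature works, here is a minimal sketch of the standard technique: a model’s raw scores for candidate words are divided by the temperature before being converted into probabilities. The scores and the third toad name below are made up for illustration and are not taken from the study or from any real model.

```python
import math


def softmax_with_temperature(scores: list[float], temperature: float) -> list[float]:
    """Convert raw scores into probabilities, scaled by temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [s / temperature for s in scores]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical scores for three candidate toad names (assumed values).
names = ["Ziggy", "Pip", "Morton"]
scores = [3.0, 2.5, 0.5]

low_temp = softmax_with_temperature(scores, 0.2)   # sharper: top choice dominates
full_temp = softmax_with_temperature(scores, 1.0)  # scores used as-is

for name, p_low, p_full in zip(names, low_temp, full_temp):
    print(f"{name}: T=0.2 -> {p_low:.3f}, T=1.0 -> {p_full:.3f}")
```

Under these assumed scores, raising the temperature spreads the probability more evenly, but the two favored names still take most of it at a temperature of 1. That is consistent with what Jiang observed: if a model’s preferences are already concentrated on a few options, turning the dial up does not produce much variety.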

Presentation slide courtesy of Liwei Jiang, the AI study’s lead author.

Different models also churn out comically similar responses. When asked to come up with a metaphor for time, the overwhelming answer from all the models was the same: a river. Several said a weaver. One outlier suggested a sculptor. Several of the models were developed in China, and yet they were producing answers similar to those made in America.

Example of similar output from ChatGPT and DeepSeek

The explanation lies in chatbot design. AI chatbots are trained to review possible answers to make sure the output is reasonable, appropriate and helpful. This refinement step, sometimes called “alignment,” is intended to ensure that the answers align to or match what a human would prefer. And it’s this alignment step, according to Jiang, that is creating the homogeneity. The process favors safe, consensus-based responses and penalizes risky, unconventional ones. Originality gets stripped away.

Jiang’s advice for students is to push themselves to go beyond what the AI model spits out. “The model is actually producing some good ideas, but you need to go the extra mile to be more creative than that,” said Jiang.

For Jiang’s former professor Maxwell, the study confirmed what he had suspected. And even before Jiang’s paper came out, he changed how he teaches. He no longer relies on online exams. Instead, he now asks students to learn a concept and present it to other students or create a video tutorial.

Outwitting the AI hive mind requires some post-modern creativity.

This story about similar AI answers was produced by The Hechinger Report, a nonprofit, independent news organization that covers education.
