Study Reveals AI-Generated Texts Frequently Repeat Certain Words

A recent study has uncovered that AI-generated texts frequently repeat certain words, raising questions about their authenticity and effectiveness in professional contexts.

Short Summary:

  • AI-generated texts often contain repetitive word usage.
  • Perplexity measures help evaluate text prediction accuracy in AI models.
  • Research indicates significant AI involvement in scientific publications.

In an era where artificial intelligence is stepping into various aspects of human life, content creation hasn’t been spared. Recent findings reveal a pattern in AI-generated texts that may clue humans into their digital origin. Repetitive word usage is a common marker, making text produced by AI models, like ChatGPT, appear less original and sometimes even monotonous.

Perplexity, as a technical term, gauges how well a language model predicts text. Lower perplexity means the model is better at predicting the next word in a sentence; high perplexity indicates output the model finds unexpected or nonsensical. For instance, the phrase “The Eiffel Tower stands tall in Berlin” would register high perplexity for a trained model, because it is unexpected given the data the model was trained on.
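The idea above can be sketched in a few lines of Python. This is a minimal illustration, not any particular model's implementation: it assumes we already have the model's predicted probability for each observed token, and computes perplexity as the exponential of the average negative log-probability.

```python
import math

def perplexity(token_probs):
    """Perplexity = exp of the average negative log-probability
    the model assigned to each observed token."""
    avg_neg_logprob = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(avg_neg_logprob)

# A sequence the model finds predictable (high per-token probabilities)
# yields low perplexity...
likely = [0.9, 0.8, 0.85, 0.9]

# ...while a surprising continuation ("...stands tall in Berlin")
# gets a tiny probability for that token, driving perplexity up.
surprising = [0.9, 0.8, 0.85, 0.001]

print(perplexity(likely))      # low
print(perplexity(surprising))  # much higher
```

Note the edge cases: a model that assigns probability 1.0 to every token has a perplexity of exactly 1, the theoretical minimum.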

“AI tools seem to over-rely on certain words,” comments James Zou from Stanford University. This trend was highlighted during peer reviews of studies submitted to AI conferences, where the term “meticulous” appeared 35 times more often than before.

Andrew Gray, a librarian at University College London, has noted an uptick in AI-generated academic texts. Analyzing five million scientific studies from last year, Gray found an unusual rise in words like “meticulously”, “intricate”, and “commendable”. His hypothesis? Researchers are using AI tools to either write or polish their studies.

Several blatant examples stand out. A Chinese study on lithium batteries, published in an Elsevier journal, began with: “Certainly, here is a possible introduction for your topic: Lithium-metal batteries are promising candidates for…”. This accidental verbatim copy from ChatGPT shows how AI assistance can sometimes slip through without proper editing.

Gray estimates that over 60,000 scientific papers in 2023 alone were likely authored with the help of AI. While he acknowledges that most cases involve using AI to refine English or correct typos, there’s a substantial gray area of concern.

“Right now it is impossible to know how big this gray area is because scientific journals do not require authors to declare the use of ChatGPT; there is very little transparency,” Gray laments.

Moreover, a recent survey in the journal Nature revealed that one in three scientists admitted to using AI tools like ChatGPT to aid their writing. The verb “delve”, for instance, now appears in over 0.5% of medical studies, up from less than 0.04% previously, illustrating this AI influence.

As AI becomes more prevalent, it’s crucial to explore how it phrases content. Texts often employ cohesive devices—linking phrases and transition words—to create coherence. These include terms like “therefore”, “consequently”, and “moreover”. While useful in moderation, overuse can make the writing appear formulaic and clunky.
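A crude version of this check is easy to sketch. The word list below is hypothetical, assembled from the markers the article mentions (“meticulous”, “intricate”, “commendable”, “delve”, plus common transition words); a real study would derive its list statistically from a corpus rather than hard-code it.

```python
import re

# Hypothetical marker list based on words flagged in the article.
MARKERS = {"meticulous", "meticulously", "intricate", "commendable",
           "delve", "delves", "therefore", "consequently", "moreover"}

def marker_rate(text):
    """Fraction of word tokens that appear on the marker list."""
    words = re.findall(r"[a-z]+", text.lower())
    if not words:
        return 0.0
    hits = sum(1 for w in words if w in MARKERS)
    return hits / len(words)

sample = ("Moreover, this meticulous analysis delves into the "
          "intricate and commendable results.")
print(marker_rate(sample))  # a suspiciously high fraction
```

Comparing this rate against a baseline computed from pre-2022 texts is essentially what Gray's corpus analysis does at scale, though with far more statistical care.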

For instance, ChatGPT might generate an essay on rainforest destruction with monotonous phrasing. It may lean on repetitive structures such as: “This destructive practice not only destroys the forest canopy but also disrupts the fragile ecosystem, jeopardizing the survival of countless plant and animal species.” Such redundancy both weakens the argument and alerts readers to its AI origin.

“Generative AI tools lack the human touch needed for contextual understanding,” adds researcher Ángel María Delgado Vázquez. Especially for non-native English speakers, ChatGPT is a double-edged sword—helpful but not foolproof.

Regarding transparency and proper usage, more needs to be done. One approach is deploying AI-detection tools that could flag repetitive or out-of-context text, alerting readers or editors to possible AI usage. However, these detection tools are themselves still evolving and far from infallible.

There are broader implications for society as well. Andrew Gray warns about “a vicious circle”, where successive versions of AI are trained on increasingly AI-generated texts, creating a loop of artificial language proliferation. This could insidiously alter the way we communicate, making it harder to distinguish between authentic and AI-generated content.

Jeremy Nguyen from Swinburne University highlights, “I actually find myself using ‘delve’ lately in my own language—probably because I spend so much time talking to GPT.”

So, what’s the solution? Editors, researchers, and writers should take meticulous care in distinguishing AI assistance from human creativity. Conducting thorough edits and employing AI-detection tools will become vital steps in maintaining the integrity of written content.

Tools like WordRake can lend a hand here. Designed to identify and simplify convoluted AI-generated text, WordRake can streamline and humanize digital drafts. Key edits may include reducing redundancy and replacing awkward phrases with simpler alternatives.

“If you struggle with certain parts of writing, AI tools can help you get unstuck. But it can’t write chunks of text for you to adopt as your own without your editing or investigation,” advises Ivy B. Grey, Chief Strategy & Growth Officer for WordRake.

In conclusion, the age of AI in content creation brings both immense opportunities and complex challenges. Mindful editing, combined with an understanding of AI’s limitations, can harness its advantages while preserving the authenticity and coherence of human expression. Future efforts should focus on greater transparency and more sophisticated detection methods to maintain the delicate balance between AI assistance and human creativity.


Author
SJ Tsai
Chief Editor. Writer wrangler. Research guru. Three years at scijournal. Hails from a family with five PhDs. When not shaping content, creates art.