A machine-learning tool distinguishes human writers from artificial-intelligence text generators on the basis of stylistic characteristics.
According to a study published on November 6 in Cell Reports Physical Science1, a machine-learning method can quickly tell when chemistry papers have been written using the chatbot ChatGPT. The specialized classifier outperformed two artificial intelligence (AI) detectors now in use and could help academic publishers to identify papers produced by AI text generators.
Co-author Heather Desaire, a chemist at the University of Kansas in Lawrence, states that the majority of the text analysis community “wants a really general detector that will work on anything.” However, “we were really going after accuracy” by creating a tool that concentrates on a specific kind of paper.
The results imply that software tailored to particular writing styles could enhance efforts to create AI detectors, according to Desaire. “It’s not that hard to build something for different domains if you can build something quickly and easily.”
The elements of style
In June, Desaire and her colleagues published the first description of their ChatGPT detector, using Perspective articles from the journal Science2. The detector uses machine learning to scrutinize 20 features of writing style, including variation in sentence length and how often certain words and punctuation marks appear, to determine whether a text was written by ChatGPT or by an academic scientist. The results demonstrate, Desaire says, that “you could use a small set of features to get a high level of accuracy.”
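The paper's exact 20 features aren't reproduced here, but the stylometric idea can be sketched in plain Python: reduce a passage to a handful of simple style statistics. The feature set below is an illustrative, hypothetical subset, not the study's actual one.

```python
import re
from statistics import pstdev

def style_features(text):
    """Compute a few simple stylometric features.

    This is an illustrative subset only; the study's real 20 features
    are not listed in the article and are not reproduced here.
    """
    # Split into sentences on terminal punctuation, dropping empties
    sentences = [s for s in re.split(r"[.!?]+\s*", text) if s]
    lengths = [len(s.split()) for s in sentences]
    words = text.split()
    return {
        # Sentence-length statistics: variation is one classic stylometric cue
        "sentence_len_mean": sum(lengths) / len(lengths),
        "sentence_len_sd": pstdev(lengths),
        # Punctuation frequency per word
        "comma_rate": text.count(",") / len(words),
        "paren_rate": text.count("(") / len(words),
        # Frequency of one specific word (hypothetical choice)
        "however_rate": sum(w.lower().strip(",.;") == "however" for w in words) / len(words),
    }

sample = ("However, the results vary. Chemistry papers often cite prior work "
          "(sometimes extensively). Short sentences appear too.")
feats = style_features(sample)
```

Each passage thus becomes a small numeric vector, which is what makes a lightweight classifier with "a small set of features" feasible.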
The most recent study trained the detector on the first sections of papers from eleven chemistry journals published by the American Chemical Society (ACS). The team chose the introduction because, Desaire says, this part of a paper is fairly easy for ChatGPT to produce if it has access to background material. After training their tool on 100 published introductions as human-written text, the researchers asked ChatGPT-3.5 to generate 200 introductions in the style of an ACS journal. To produce 100 of these, the chatbot was given the papers’ titles; for the other 100, it was given their abstracts.
When tested on human-written introductions and AI-generated ones from the same journals, the tool identified ChatGPT-3.5-written sections based on titles with 100% accuracy. For introductions that ChatGPT generated from abstracts, the accuracy was slightly lower, at 98%. The tool performed just as well with text produced by ChatGPT-4, the most recent iteration of the chatbot. By contrast, the AI detector ZeroGPT identified AI-written introductions with an accuracy of only about 35–65%, depending on the version of ChatGPT used and whether the introduction had been generated from the paper’s title or its abstract. A text-classifier tool built by OpenAI, the maker of ChatGPT, fared even worse, spotting AI-written introductions with an accuracy of around 10–55%.
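The train-then-test workflow can be illustrated with a minimal nearest-centroid classifier over style-feature vectors, the same structure as training on human introductions versus AI-generated ones and then measuring accuracy on held-out examples. All numbers below are made-up toy values, not data from the study, and nearest-centroid is only one simple stand-in for the paper's unspecified machine-learning method.

```python
import math

def centroid(vectors):
    # Per-feature mean across the training vectors of one class
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

def nearest_centroid_predict(x, centroids):
    # Assign x to the class whose centroid is closest (Euclidean distance)
    return min(centroids, key=lambda c: math.dist(x, centroids[c]))

# Toy feature vectors (sentence-length SD, comma rate): hypothetical values
human = [[6.1, 0.05], [5.8, 0.07], [7.0, 0.06]]
ai    = [[2.1, 0.12], [1.8, 0.10], [2.4, 0.11]]
centroids = {"human": centroid(human), "ai": centroid(ai)}

# Held-out test set with known labels; accuracy is the fraction correct
test_set = [([6.5, 0.06], "human"), ([2.0, 0.11], "ai"), ([5.5, 0.05], "human")]
correct = sum(nearest_centroid_predict(x, centroids) == label
              for x, label in test_set)
accuracy = correct / len(test_set)
```

The reported 100%, 98% and 35–65% figures are all accuracies computed this way: predictions on labelled held-out introductions, divided by the number of examples.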
The new ChatGPT catcher also did well with introductions from journals it wasn’t trained on, and it caught AI text generated from a variety of prompts, including one meant to trick AI detectors. However, the system is highly specialized for scientific journal articles: when given genuine articles from university newspapers, it failed to recognize them as having been written by humans.
Wider issues
Debora Weber-Wulff, a computer scientist who studies academic plagiarism at the HTW Berlin University of Applied Sciences, calls the work “something fascinating.” She notes that many existing tools try to determine authorship by searching for the predictive text patterns of AI-generated writing, rather than by examining features of writing style. “I had not considered utilizing stylometrics on ChatGPT.”
However, Weber-Wulff notes that broader issues drive the use of ChatGPT in academia. She points out that many researchers feel pressure to publish papers quickly, or may not see the writing process as a crucial component of science. AI-detection tools will not solve these problems, she says, and should not be viewed as “a magic software solution to a social problem.”