University of Kansas chemist Heather Desaire has introduced a tool that detects scientific text generated by ChatGPT, an artificial intelligence (AI) text generator, with a reported accuracy of 99%. The work was published in the peer-reviewed journal “Cell Reports Physical Science,” where she demonstrated the effectiveness of her AI-detection method and provided the source code others need to replicate the tool.
Desaire, the Keith D. Wilner Chair in Chemistry at KU, emphasized the urgent need for accurate AI-detection tools to uphold scientific integrity. She expressed concern that AI text generators like ChatGPT fabricate facts. Academic science publishing is where new discoveries at the edge of human knowledge are shared, so it is crucial to keep believable falsehoods from polluting the literature. Desaire acknowledged that there is no foolproof automated method for identifying these fabrications, known as “hallucinations.” When genuine scientific facts are mixed with convincing yet fabricated AI-generated content, the trustworthiness and value of publications are inevitably diminished.
She explained that the effectiveness of her detection method relies on focusing specifically on scientific writing commonly found in peer-reviewed journals. By narrowing the scope in this way, her approach achieves higher accuracy compared to existing AI-detection tools such as the RoBERTa detector, which aim to identify AI in more general types of writing.
Desaire said that it is feasible to develop a highly accurate method for differentiating between human and ChatGPT writing. However, achieving such accuracy requires limiting the analysis to a specific group of humans who write in a distinct manner. In contrast, existing AI detectors are designed as general tools applicable to various types of writing. While they serve their intended purpose well, they are not as precise as a tool specifically tailored for a particular and narrow purpose.
In her research, Desaire stressed that accuracy is critical when accusing individuals of surreptitiously using AI, because frequent misidentifications must be avoided. However, she acknowledged that achieving accuracy often involves sacrificing generalizability. Desaire collaborated with her research group at KU, which included Romana Jarosova, a research assistant professor of chemistry; David Hua, an information systems analyst; and graduate students Aleesa E. Chua and Madeline Isom. The team’s success in detecting AI text may be attributed to the human insight that went into devising the code, rather than reliance on machine-learning pattern detection alone.
Desaire revealed that their approach involved a significantly smaller dataset and a higher degree of human intervention to identify the crucial distinctions for their detector. Specifically, they constructed their strategy using only 64 human-written documents and 128 AI-generated documents as training data. This dataset size is approximately 100,000 times smaller than what is typically used to train other detectors.
Desaire emphasized the significance of this difference, likening it to the gap between the cost of a cup of coffee and the cost of a house. The small dataset had two advantages: it could be processed quickly, and every document could be reviewed thoroughly by humans. By applying their own judgment, the researchers were able to identify useful differences between the two document sets, rather than relying solely on previously developed strategies for distinguishing human from AI-generated content.
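To make the idea of a small, human-chosen set of distinguishing features more concrete, the following is a minimal Python sketch. The specific features (sentence-length variation and punctuation counts) and the function names `split_sentences` and `style_features` are illustrative assumptions, not the features or code published by Desaire’s team.

```python
import re
import statistics


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' when followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def style_features(text):
    """Compute a few hand-picked style features for one document.

    These features are hypothetical examples of the kind a human reviewer
    might choose after reading a small set of documents.
    """
    sentences = split_sentences(text)
    lengths = [len(s.split()) for s in sentences]  # words per sentence
    return {
        "n_sentences": len(sentences),
        "mean_sentence_len": statistics.mean(lengths) if lengths else 0.0,
        "sentence_len_stdev": statistics.pstdev(lengths) if lengths else 0.0,
        "n_parentheses": text.count("("),
        "n_semicolons": text.count(";"),
        "n_question_marks": text.count("?"),
    }


if __name__ == "__main__":
    sample = (
        "We measured the sample (n = 3); results varied. "
        "Why? The instrument drifted over time."
    )
    for name, value in style_features(sample).items():
        print(name, value)
```

Feature dictionaries like these, computed for each labeled document, could then be passed to any standard classifier. With only a few hundred documents, both the feature extraction and a manual review of every example remain quick.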
According to KU, Desaire’s approach was developed independently, without relying on strategies used in previous AI-detection methods. As a result, the technique includes elements not found elsewhere in the field of AI text detection. Desaire admitted that the team did not even consult the existing literature on AI text detection until they had a functional tool of their own. Rather than following the conventional thinking of computer scientists working on text detection, they relied on their intuition about what would be effective, and she expressed slight embarrassment about this unconventional approach.
Desaire and her team also framed the problem differently from previous research. While most researchers concentrate on working out what AI-generated text looks like, Desaire and her team asked how human writing in their specific context differs from AI text. Although AI writing is ultimately derived from human writing, AI-generated text, particularly from ChatGPT, tends to be a generalized composition amalgamated from diverse sources. Prioritizing the characteristics of human writing gave the team a fresh perspective on AI detection.
Desaire highlighted the distinctive nature of scientists’ writing, describing it as a specialized form that differs from general human writing. She has made her team’s AI-detecting code openly accessible, hoping to encourage people without a background in computer programming to engage with AI and AI detection. While recognizing the impact of technologies like ChatGPT and their widespread adoption, Desaire emphasized that with the right guidance and effort, even high school students could replicate her team’s results.