Article
Version 1
This version is not peer-reviewed
AI vs. Human: Decoding Text Authenticity with Transformers
Version 1: Received: 24 July 2024 / Approved: 25 July 2024 / Online: 25 July 2024 (07:29:49 CEST)
How to cite: Gifu, D.; Silviu-Vasile, C. AI vs. Human: Decoding Text Authenticity with Transformers. Preprints 2024, 2024072014. https://doi.org/10.20944/preprints202407.2014.v1
Abstract
As large language models proliferate and blur the line between human-written and machine-generated content, determining text authenticity becomes essential. This study investigates three transformer-based language models (BERT, RoBERTa, and DistilBERT) for distinguishing human-written from machine-generated text. Using a corpus that combines human-written text from Wikipedia, WikiHow, and news articles in several languages with text generated by OpenAI's GPT-2, we conduct comparative experiments. Our findings indicate that ensemble learning models are more effective than single classifiers on this task. The transformer-based method achieves competitive performance, with an F-score of 0.83 using RoBERTa-large (monolingual) and 0.70 using DistilBERT-base-uncased (multilingual). These results underscore the versatility of transformer-based methods across natural language processing applications and advance text authenticity detection systems.
Keywords
large language models; natural language processing; content creation; text authenticity
Subject
Computer Science and Mathematics, Artificial Intelligence and Machine Learning
Copyright: This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.