Preprint
Article

A Survey on Bias in Deep NLP


A peer-reviewed article of this preprint also exists.

Submitted: 01 March 2021

Posted: 02 March 2021

Abstract
Deep neural networks have become the dominant approach in many areas of machine learning, including natural language processing (NLP). Thanks to the availability of large text corpora and the capability of deep architectures to shape internal language mechanisms through self-supervised learning (also known as "pre-training"), versatile and high-performing models are released continuously for every new network design. However, these networks learn a probability distribution over the words and relations in the training collection, and so they inherit whatever flaws, inconsistencies and biases that collection contains. Since pre-trained models have been found to be very useful for transfer learning, dealing with bias has become a relevant issue in this new scenario. We introduce bias in a formal way and explore how it has been treated in several networks, in terms of both detection and correction. We also identify available resources and propose a strategy for dealing with bias in deep NLP.
Keywords: 
Subject: Computer Science and Mathematics - Artificial Intelligence and Machine Learning
Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.

Preprints.org is a free preprint server supported by MDPI in Basel, Switzerland.

