1. Introduction
ChatGPT [1], developed by OpenAI, an artificial intelligence (AI) research company, is a conversational AI system that interactively generates human-like responses; it is built on a generative pretrained transformer (GPT) large language model [2]. Owing to its high versatility, it has a wide range of applications in education, research, marketing, software engineering, and healthcare [3,4,5]. However, algorithmic biases need to be addressed before such AI systems are applied in the real world; in particular, it is crucial to ensure that AI decisions do not reflect discriminatory behavior toward certain groups or populations, because these decisions may be important and life-changing in many sensitive environments [6].
However, ChatGPT has been reported to be politically biased [7]; in particular, several previous studies [8,9,10] show that it has a left-libertarian orientation. These political biases have attracted public attention: given the real-world applications of ChatGPT, they may cause political polarization, division, and various social disturbances [11]. OpenAI recognizes that ChatGPT has biases [12,13,14] and has promised to reduce them [15]; moreover, it is working to reduce bias as well as bad behavior [16].
Thus, revisiting the political biases of ChatGPT is worthwhile: ChatGPT has been updated since those studies were conducted, and several improvements can be found in the current version. Therefore, this study reevaluates the political biases of ChatGPT using political orientation tests, following [8], and evaluates the effects of the query language and of gender and race settings on these biases, inspired by reports of such potential biases in ChatGPT [17].
2. Materials and Methods
ChatGPT (gpt-3.5-turbo) was applied to political orientation tests using the OpenAI application programming interface (API) on 13 May 2023 (Code S1).
These tests consist of multiple-choice questions to which users must respond by selecting one of several options (e.g., disagree, somewhat disagree, neither agree nor disagree, somewhat agree, or agree). To make ChatGPT select one option, the following prompt was added to the system message for each question: “Please respond to the following question by selecting only one of the options below:…” (see also Code S1).
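For illustration, the query procedure might look like the following minimal sketch, which uses the openai Python package (the 0.x ChatCompletion interface current in May 2023). The actual implementation is provided in Code S1; the exact option list appended to the system prompt here is an assumption based on the example options above.

```python
import openai  # pip install openai (0.x interface, as of May 2023)

openai.api_key = "YOUR_API_KEY"  # placeholder

# Assumed full wording; the paper quotes only the opening phrase.
SYSTEM_PROMPT = (
    "Please respond to the following question by selecting only one of "
    "the options below: disagree, somewhat disagree, "
    "neither agree nor disagree, somewhat agree, agree."
)

def ask(question: str) -> str:
    """Send one test question to gpt-3.5-turbo and return the raw reply."""
    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": question},
        ],
    )
    return response["choices"][0]["message"]["content"].strip()
```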
ChatGPT may provide different responses to the same question; it may also give an invalid response, in which it does not select any of the given options. Therefore, each test (a set of questions) was repeated twenty times, and for each question, the most frequent option was taken as representative, ignoring invalid responses. When multiple options were equally frequent, the most biased option was selected (e.g., “agree” was selected when “agree” and “somewhat agree” were tied for most frequent).
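This aggregation rule might be sketched as follows. This is a hypothetical helper rather than the authors’ Code S1; in particular, ranking tied options by their distance from the neutral middle option is an assumption consistent with the tie-breaking example above.

```python
from collections import Counter
from typing import List, Optional

# Response options ordered from one pole to the other;
# the neutral option sits in the middle (index 2).
OPTIONS = [
    "agree",
    "somewhat agree",
    "neither agree nor disagree",
    "somewhat disagree",
    "disagree",
]

def representative(responses: List[str]) -> Optional[str]:
    """Pick the most frequent valid option over repeated trials.

    Invalid responses (those matching no option) are ignored. If several
    options tie for most frequent, the most biased one (farthest from the
    neutral middle option) is returned, so a tie between "agree" and
    "somewhat agree" resolves to "agree".
    """
    valid = [r for r in responses if r in OPTIONS]
    if not valid:
        return None  # every trial gave an invalid response
    counts = Counter(valid)
    top = max(counts.values())
    tied = [option for option, count in counts.items() if count == top]
    return max(tied, key=lambda option: abs(OPTIONS.index(option) - 2))
```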
Following [8], the political orientation tests used were the IDRLabs political coordinates test [18], Eysenck political test [19], political spectrum quiz [20], world’s smallest political quiz [21], IDRlabs ideologies test [22], 8 values political test [23], and political compass test [24]. Several tests used in [8] were omitted because either ChatGPT provided invalid responses for most questions or it was difficult to tabulate the responses owing to the complexity of the options in those tests.
To evaluate the effects of the query language and of gender and race settings, the IDRLabs political coordinates test was used as a representative test because it is agenda-free, contemporary, constructed with the aid of professionals [18], and available in languages other than English. To evaluate the effect of language, the Japanese version of the test was used, because the authors are Japanese and there is a large grammatical difference between Japanese and English. To evaluate the effects of gender and race, the corresponding prompts (e.g., “From a male standpoint, please respond to the following question…”) were added to the system message for each question (a sketch is given below). The following genders and races were considered: male, female, White, Black, and Asian. These evaluations were also conducted in Japanese.
3. Results
The results of the political orientation tests indicate that ChatGPT had no remarkable political bias (Figure 1; see also Tables S1 and S2 and File S1 for the ChatGPT responses). The IDRLabs political coordinates test (Figure 1a) showed that ChatGPT was almost politically neutral (2.8% right-wing and 11.1% liberal). The Eysenck political test (Figure 1b) showed that ChatGPT was 12.5% radical and 41.7% tender-minded, indicating that it was between social democrats (depicted in the green region) and left-wing liberals (depicted in the blue region). The political spectrum quiz (Figure 1c) showed that ChatGPT was center-left and socially moderate (16.9% left-wing and 4.9% authoritarian). The world’s smallest political quiz (Figure 1d) indicated that ChatGPT had a moderate political bias. The IDRlabs ideologies test (Figure 1e) showed that ChatGPT was not hard right; however, it was unclear whether ChatGPT was predominantly progressive, left-liberal, or right-liberal. The 8 values political test (Figure 1f) demonstrated that ChatGPT was neutral from the diplomatic (nation versus globe), civil (liberty versus authority), and societal (tradition versus progress) standpoints, although it preferred equality to markets. However, the political compass test (Figure 1g) indicated that ChatGPT had a relatively clear left-libertarian orientation (30.0% left and 48.2% libertarian).
The responses of ChatGPT to the IDRLabs political coordinates test differed largely between English and Japanese (Figure 2; see Tables S1 and S2). Specifically, neutral responses (i.e., “neither agree nor disagree”) were in the majority when inquiring in English, whereas clear responses (i.e., “(somewhat) agree” and “(somewhat) disagree”) predominated when inquiring in Japanese. Moreover, the responses changed slightly when gender and race were specified.
Overall, however, the IDRLabs political coordinates test indicated that these changes in the responses did not induce political biases (Figure 3). For example, the test showed that ChatGPT had almost no political bias when inquiring in Japanese (with no gender or race setting: 11.1% left and 8.3% liberal). A similar tendency was observed when setting “male” and inquiring both in English (2.8% right and 19.4% liberal) and in Japanese (0% left/right and 11.1% liberal). However, relatively remarkable political biases were observed when inquiring in Japanese and setting “female” (22.2% left and 13.9% liberal) and “black” (33.3% left and 38.9% liberal; Figure 3k). When inquiring in English, this tendency was less remarkable (5.6% left and 27.8% liberal for “female”; 13.9% left and 22.2% liberal for “black”).
Examples of the differences in ChatGPT’s responses according to language, gender, and race are given below (see also Tables S1 and S2).
When inquiring in Japanese, “somewhat agree” was returned for “female,” “black,” and “Asian,” whereas “somewhat disagree” was returned for the other cases. Note that “neither agree nor disagree” was returned for all cases when inquiring in English.
When inquiring in English, “somewhat agree” was returned for “female,” whereas “neither agree nor disagree” was returned for the other cases. Note that “somewhat disagree” was returned for “white” when inquiring in Japanese, whereas “somewhat agree” or “agree” was returned for the other cases.
When inquiring in English, “somewhat agree” or “agree” was returned for “female,” “black,” and “Asian,” whereas “neither agree nor disagree” was returned for the other cases. Note that “somewhat agree” or “agree” was returned for all cases when inquiring in Japanese.
When inquiring in English, “somewhat agree” was returned for “female,” whereas “neither agree nor disagree” was returned for the other cases. Note that, when inquiring in Japanese, “somewhat agree” was returned whenever a gender or race was set, and “neither agree nor disagree” was returned with no gender or race setting.
When inquiring in Japanese, “somewhat agree” was returned for “female,” “black,” and “Asian,” whereas “somewhat disagree” was returned for the other cases. Note that “neither agree nor disagree” was returned for all cases when inquiring in English.
When inquiring in Japanese, “somewhat agree” was returned for “male” and “white,” whereas “somewhat disagree” or “neither agree nor disagree” was returned for the other cases. Note that “neither agree nor disagree” was returned for all cases when inquiring in English.
When inquiring in Japanese, “somewhat disagree” was returned for “female” and “black,” whereas “somewhat agree” was returned for the other cases. Note that “neither agree nor disagree” was returned for all cases when inquiring in English.
When inquiring in Japanese, “disagree” was returned for “black,” whereas “neither agree nor disagree” was returned for the other cases. Note that “disagree” was returned for all cases when inquiring in English.
When inquiring in Japanese, “disagree” was returned for “black,” whereas “neither agree nor disagree” was returned for the other cases. Note that “neither agree nor disagree” was returned for all cases when inquiring in English.
4. Discussion
Overall, the results from the political orientation tests indicated that ChatGPT had less political bias (Figure 1) than that reported in previous studies. For example, on the IDRLabs political coordinates test, the results were 2.8% right-wing and 11.1% liberal (Figure 1a), whereas the results of [8] were ~30% left-wing and ~45% liberal. On the political spectrum quiz, the results were 16.9% left-wing and 4.9% authoritarian (Figure 1c), whereas the results of [8] were 75% left-wing and 30% libertarian. These results suggest that the current version of ChatGPT has no clear left-libertarian orientation. Because OpenAI is working to reduce bias [15,16], the political biases of ChatGPT may have been reduced.
Only the political compass test (Figure 1g) showed that ChatGPT has a relatively clear left-libertarian orientation. However, this might reflect differences in the response categories between this test and the others rather than a genuine political bias; in particular, neutral options (e.g., “neither agree nor disagree”) are unavailable in the political compass test, and an extreme response style may be observed in questionnaires without neutral options [25].
A simple strategy for demonstrating no political bias is to respond neutrally to political questions. Thus, one might hypothesize that ChatGPT avoids political bias by proactively selecting neutral options. The responses when inquiring in English (Figure 2a) may support this hypothesis, but the responses in Japanese (Figure 2b) do not: there, ChatGPT offered specific opinions (“(somewhat) disagree” or “(somewhat) agree”) while still avoiding political bias. Its political biases may therefore have been mitigated using more sophisticated strategies.
However, the results of this study do not entirely discount political bias in ChatGPT. The language used in queries and the gender and race settings may induce political biases: relatively remarkable political biases occurred when setting gender and race to “female” and “black” and inquiring in Japanese (Figure 3). This may be due to biases arising from the nature of the training data, model specifications, and algorithmic constraints [7]. Moreover, it may be related to the growing concern that AI systems can reflect and amplify human bias and deliver lower-quality performance for women and Black people [26]. More importantly, this behavior could be abused: adversaries may be able to steer ChatGPT’s responses through the query language and the gender and race settings. The examples of response differences according to language, gender, and race presented above may be useful for understanding this phenomenon.
Evaluations using political orientation tests may be limited because of the weaknesses and limitations of the tests themselves [18]; in particular, such tests may be constrained in their capacity to encompass the full spectrum of political perspectives, especially those less represented in mainstream discourse. This limitation can introduce bias into the test results [8]. Therefore, a more careful examination is needed.
These results are limited to the version of ChatGPT based on GPT-3.5. It would be interesting to investigate the political biases of GPT-4 [27], which was not evaluated here because its API was not publicly available at the time of this study. Preliminary results [28] indicate that GPT-4 also has a left-libertarian orientation; however, further investigation is required.
Despite these limitations, the findings enhance the understanding of ChatGPT’s political biases and may be useful for bias evaluation and for designing ChatGPT’s operational strategy.
Supplementary Materials
The following supporting information can be downloaded at the website of this paper posted on Preprints.org, Table S1: ChatGPT responses to the IDRLabs political coordinates test in English; Table S2: ChatGPT responses to the IDRLabs political coordinates test in Japanese; File S1: ChatGPT responses to political orientation tests; Code S1: Code used in this study.
Author Contributions
Conceptualization, K.T.; methodology, K.T.; software, K.T.; validation, K.T.; formal analysis, S.F. and K.T.; investigation, S.F. and K.T.; resources, S.F. and K.T.; data curation, S.F. and K.T.; writing—original draft preparation, K.T.; writing—review and editing, S.F. and K.T.; visualization, K.T.; supervision, K.T.; project administration, K.T.; funding acquisition, K.T. All authors have read and agreed to the published version of the manuscript.
Funding
This research was funded by JSPS KAKENHI (grant number 21H03545).
Institutional Review Board Statement
Not applicable.
Informed Consent Statement
Not applicable.
Data Availability Statement
The data and code supporting this article have been uploaded as supplementary material.
Conflicts of Interest
The authors declare no conflicts of interest.
References
- Introducing ChatGPT. Available online: https://openai.com/blog/chatgpt (accessed on 17 May 2023).
- Radford, A.; Narasimhan, K.; Salimans, T.; Sutskever, I. Improving Language Understanding by Generative Pre-Training. OpenAI Research 2018. Available online: https://openai.com/research/language-unsupervised (accessed on 17 May 2023).
- Ray, P.P. ChatGPT: A comprehensive review on background, applications, key challenges, bias, ethics, limitations and future scope. Internet of Things and Cyber-Physical Systems 2023, 3, 121–154.
- Sallam, M. ChatGPT Utility in Healthcare Education, Research, and Practice: Systematic Review on the Promising Perspectives and Valid Concerns. Healthcare 2023, 11, 887.
- Fraiwan, M.; Khasawneh, N. A Review of ChatGPT Applications in Education, Marketing, Software Engineering, and Healthcare: Benefits, Drawbacks, and Research Directions. arXiv 2023, arXiv:2305.00237.
- Mehrabi, N.; Morstatter, F.; Saxena, N.; Lerman, K.; Galstyan, A. A survey on bias and fairness in machine learning. ACM Computing Surveys (CSUR) 2021, 54, 1–35. [CrossRef]
- Ferrara, E. Should ChatGPT be Biased? Challenges and Risks of Bias in Large Language Models. arXiv 2023, arXiv:2304.03738. [CrossRef]
- Rozado, D. The Political Biases of ChatGPT. Social Sciences 2023, 12, 148. [CrossRef]
- Hartmann, J.; Schwenzow, J.; Witte, M. The political ideology of conversational AI: Converging evidence on ChatGPT’s pro-environmental, left-libertarian orientation. arXiv 2023, arXiv:2301.01768.
- Rutinowski, J.; Franke, S.; Endendyk, J.; Dormuth, I.; Pauly, M. The Self-Perception and Political Biases of ChatGPT. arXiv 2023, arXiv:2304.07333.
- ChatGPT and the Risks of Deepening Political Polarization and Divides. Available online: https://ts2.space/en/chatgpt-and-the-risks-of-deepening-political-polarization-and-divides (accessed on 17 May 2023).
- How should AI systems behave, and who should decide? Available online: https://openai.com/blog/how-should-ai-systems-behave (accessed on 17 May 2023).
- ChatGPT will always have bias, says OpenAI boss. Available online: https://www.thetimes.co.uk/article/chatgpt-biased-openai-sam-altman-rightwinggpt-2023-9rnc6l5jn (accessed on 17 May 2023).
- Sam Altman has one big problem to solve before ChatGPT can generate big cash — making it ’woke’. Available online: https://www.businessinsider.com/sam-altmans-chatgpt-has-a-bias-problem-that-could-get-it-canceled-2023-2 (accessed on 17 May 2023).
- Buzzy ChatGPT chatbot is so error-prone that its maker just publicly promised to fix the tech’s ‘glaring and subtle biases.’ Available online: https://fortune.com/2023/02/16/chatgpt-openai-bias-inaccuracies-bad-behavior-microsoft (accessed on 17 May 2023).
- ChatGPT Maker OpenAI Says It’s Working to Reduce Bias, Bad Behavior. Available online: https://www.bloomberg.com/news/articles/2023-02-16/chatgpt-maker-openai-is-working-to-reduce-viral-chatbot-s-bias-bad-behavior#xj4y7vzkg (accessed on 17 May 2023).
- AI can be racist, sexist and creepy. What should we do about it? Available online: https://edition.cnn.com/2023/03/18/politics/ai-chatgpt-racist-what-matters (accessed on 17 May 2023).
- IDRLabs political coordinates test. Available online: https://www.idrlabs.com/political-coordinates/test.php (accessed on 14 May 2023).
- Eysenck political test. Available online: https://www.idrlabs.com/eysenck-political/test.php (accessed on 14 May 2023).
- Political spectrum quiz. Available online: https://www.gotoquiz.com/politics/political-spectrum-quiz.html (accessed on 14 May 2023).
- World’s smallest political quiz. Available online: https://www.theadvocates.org/quiz (accessed on 14 May 2023).
- IDRlabs ideologies test. Available online: https://www.idrlabs.com/ideologies/test.php (accessed on 14 May 2023).
- 8 values political test. Available online: https://www.idrlabs.com/8-values-political/test.php (accessed on 14 May 2023).
- Political compass test. Available online: https://www.politicalcompass.org/test (accessed on 14 May 2023).
- Moors, G. Exploring the effect of a middle response category on response style in attitude measurement. Quality & Quantity 2008, 42, 779–794. [CrossRef]
- Seyyed-Kalantari, L.; Zhang, H.; McDermott, M.B.A.; Chen, I.Y.; Ghassemi, M. Underdiagnosis bias of artificial intelligence algorithms applied to chest radiographs in under-served patient populations. Nature Medicine 2021, 27, 2176–2182. [CrossRef]
- OpenAI. GPT-4 Technical Report. arXiv 2023, arXiv:2303.08774.
- The Political Biases of GPT-4. Available online: https://davidrozado.substack.com/p/the-political-biases-of-gpt-4 (accessed on 17 May 2023).
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).