1. Introduction
In December 2019, the world was challenged by the outbreak of COVID-19. COVID-19 is a disease caused by SARS-CoV-2, a coronavirus (CoV) that leads to severe respiratory illness [
1]. For some individuals, this respiratory illness may even cause acute respiratory distress syndrome or extra-pulmonary organ failure due to the extreme inflammatory response to the virus [
2]. The dangers of COVID-19 prompted a global effort to understand how this virus works and how it is contracted. SARS-CoV-2, so named because of its similarities with SARS-CoV, is believed to have initially been contracted by humans through animal-human contact and thereafter spread through human-human contact [
3]. This is not unusual, as other CoVs, like MERS-CoV, were transmitted through animal-human contact [
4]. While COVID-19 is believed to have originated in a seafood market in Wuhan, China, the exact animal that may have infected the first identified patients remains unclear [
3]. However, genome sequence comparison suggests that bats may have been the original hosts of SARS-CoV-2, as the virus shares a 93.1% identity with the RaTG13 virus found in a bat from Yunnan Province, China [
1].
After the initial outbreak, COVID-19 soon spread to different parts of the world, and on March 11, 2020, the World Health Organization (WHO) declared COVID-19 a pandemic [
5]. As no treatments or vaccines for COVID-19 were available at that time, the virus spread unchecked across different countries, causing infections and deaths on a scale that the world had not witnessed in centuries. As of September 21, 2023, there have been a total of 770,778,396 cases and 6,958,499 deaths due to COVID-19 [
6]. In an attempt to mitigate the spread of the virus, several countries across the world went into partial to complete lockdowns [
7]. Such lockdowns affected the educational sector immensely. Universities, colleges, and schools across the world were left searching for solutions to best deliver course content online, engage learners, and conduct assessments during the lockdowns. During this time, online learning was considered a feasible solution. Online learning platforms are applications (web-based or software) that are used for designing, delivering, managing, monitoring, and accessing courses online [
8]. This switch to online learning took place in more than 100 countries [
9] and led to a sharp increase in the need for educators, students, administrators, and staff at universities, colleges, and schools across the world to become familiar with, utilize, and adopt online learning platforms [
10].
In today’s Internet of Everything era [
11], the usage of social media platforms has skyrocketed as such platforms serve as virtual communities [
12] for people to seamlessly connect with each other. Currently, around 4.9 billion individuals worldwide actively participate in social media, and it is projected that this number will reach 5.85 billion by 2027. On average, a social media user maintains approximately 8.4 social media profiles and allocates roughly 145 minutes each day to engage with various social media platforms. Among the various social media platforms available, Twitter has gained substantial popularity across diverse age groups [
13,
14]. This rapid transition to online learning resulted in a tremendous increase in the usage of social media platforms, such as Twitter, where individuals communicated their views, perspectives, and concerns towards online learning, leading to the generation of Big Data of social media conversations. This Big Data of conversations holds the potential to provide insights into the paradigms of information-seeking and sharing behavior related to online learning during COVID-19.
1.1. COVID-19: A Brief Overview
COVID-19 is caused by SARS-CoV-2, a type of coronavirus (CoV). CoVs are RNA viruses with four structural proteins: spike (S) protein, membrane (M) protein, envelope (E) protein, and nucleocapsid (N) protein. The S protein is involved with the attachment and recognition of the host cell (infection); the M protein is involved with shaping virions; the E protein is responsible for packaging and reproduction; and the N protein packages RNA into a nucleocapsid. The virions also have polyproteins that are translated after entry into the host or target cell. These polyproteins include 1a and b (pp1a, pp1b) [
15,
16]. The SARS-CoV-2 virus particle measures between 60 and 140 nanometers in diameter and has a positive-sense, single-stranded RNA genome spanning a length of 29,891 base pairs [
15].
Infection by SARS-CoV-2 occurs when the S protein binds to the surface receptor, angiotensin-converting enzyme 2 (ACE2), and enters type II pneumocytes, which are found in human lungs. The S protein is critical to transmission and infection by SARS-CoV-2, as it has two domains, S1 and S2, where S1 is involved in binding to ACE2 and S2 is involved in fusion with the host cell membrane. Similarly important is the cleavage of the S protein. Having two cleavage sites, the S protein must be cleaved by nuclear proteases so that viral entry into, and subsequent infection of, the host cell can happen. Previous research suggests that the S protein of SARS-CoV-2 has a higher binding efficiency, which may explain its high rate of transmissibility. The high transmissibility is also explained by an insertion of four amino acids, P681, R682, R683, and A684, that has not been found in other CoVs before, nor in the RaTG13 virus observed in the bat thought to have infected the first human patients of COVID-19 [
17,
18].
While infections involving various organs have been documented in different cases, the typical effect of the SARS-CoV-2 virus on patients is centered around their respiratory systems. Investigation of the infections caused in Wuhan in December 2019 has shown that patients suffer from a range of symptoms during the initial days of contracting this virus. These symptoms encompass fever, a dry cough, breathing difficulties, headaches, dizziness, fatigue, nausea, and diarrhea. It’s important to note that the symptoms of COVID-19 can vary from person to person both in terms of the nature of the symptoms as well as the intensity of one or more symptoms [
19,
20].
1.2. Twitter: A Globally Popular Social Media Platform
Twitter ranks as the sixth most popular social platform in the United States and the seventh globally [
21,
22]. Notably, 42% of Americans between the ages of 12 and 34 are active Twitter users, marking a substantial 36.6% surge over the span of two years. The frequency of posting on the platform appears to correlate with the number of accounts followed, as users who post more than five Tweets per month tend to follow an average of 405 accounts, in contrast to those who post less frequently and follow an average of 105 accounts [
22]. Furthermore, users spend an average of 1.1 hours per week on Twitter, which equates to 4.4 hours per month on the platform [
23].
In 2023, Twitter boasts 353.9 million monthly active users, constituting 9.4% of the global social media user base [
24]. The majority of Twitter users, accounting for 52.9%, fall within the age range of 25 to 49 years. Notably, 17.1% of users belong to the 14-18 age group, 6.6% to the 13-17 age group, and the remaining 17% are aged 50 and above [
25]. On average, U.S. adults spend approximately 34.1 minutes per day on Twitter [
26]. Approximately 500 million tweets are published each day, equivalent to about 5,787 tweets per second. In addition, 42.3% of U.S. users utilize Twitter at least once a month, and it is currently the ninth most visited website globally. The countries with the highest number of Twitter users include the United States with 95.4 million users, Japan with 67.45 million, India with 27.25 million, Brazil with 24.3 million, Indonesia with 24 million, the UK with 23.15 million, Turkey with 18.55 million, and Mexico with 17.2 million [
28,
29]. On average, a Twitter user spends 5.1 hours per month on the platform, translating to approximately 10 minutes daily. A fifth of users under 30 visit frequently, and 25% use the platform every week, with 71% visiting at least weekly. Twitter is a significant source of news, with 55% of users accessing it regularly for this purpose. Ninety-six percent of U.S. Twitter users report monthly usage. Additionally, 82% engage with Twitter for entertainment. In terms of activity, 6,000 tweets are sent per second. Mobile usage is dominant, with 80% of active users accessing Twitter via smartphones [
30,
31].
Owing to the ubiquity of Twitter, studying the multimodal components of information-seeking and sharing behavior on the platform has been of keen interest to scientists from different disciplines, as can be seen from recent works in this field that focused on the analysis of tweets about various emerging technologies [
32,
33,
34,
35], global affairs [
36,
37,
38], humanitarian issues [
39,
40,
41], and societal problems [
42,
43,
44]. Since the outbreak of COVID-19, there have been several research works conducted in this field (
Section 2) where researchers analyzed different components and characteristics of the tweets to interpret the varying degrees of public perceptions, attitudes, views, and responses towards this pandemic. However, the tweeting patterns about online learning during COVID-19, with respect to the gender of Twitter users, have not been investigated in any prior work in this field.
1.3. Gender Diversity on Social Media Platforms
Gender differences in content creation online have been comprehensively studied by researchers from different disciplines [
45] as such differences have been considered important in the investigation of digital divides that produce inequalities of experience and opportunity [
46,
47]. Analysis of gender diversity and the underlying patterns of content creation on social media platforms has also been widely investigated [
48]. However, the findings are mixed. Some studies have concluded that males are more likely to express themselves on social media as compared to females [
49,
50,
51], while others found no such difference between genders [
52,
53,
54]. The gender diversity related to the usage of social media platforms has varied over the years in different geographic regions [
55]. For instance,
Figure 1 shows the variation in social media use by gender from the findings of a survey conducted by the Pew Research Center from 2005 to 2021 [
56].
In general, most social media platforms tend to exhibit a notable preponderance of male users over their female counterparts, for example – WhatsApp [
57], Sina Weibo [
58], QQ [
59], Telegram [
60], Quora [
61], Tumblr [
62], Facebook, LinkedIn, Instagram [
63], and WeChat [
64]. Nevertheless, there do exist exceptions to this prevailing trend. Snapchat has male and female users accounting for 48.2% and 51%, respectively [
65]. These statistics about the percentage of male and female users in different social media platforms are summarized in
Table 1. As can be seen from
Table 1, Twitter has the highest gender gap as compared to several social media platforms such as Instagram, Tumblr, WhatsApp, WeChat, Quora, Facebook, LinkedIn, Telegram, Sina Weibo, QQ, and Snapchat. Therefore, the work presented in this paper focuses on the analysis of user diversity-based (with a specific focus on gender) patterns of public discourse on Twitter in the context of online learning during COVID-19.
The rest of this paper is organized as follows. In
Section 2, a comprehensive review of recent works in this field is presented.
Section 3 discusses the methodology that was followed for this work. The results and scientific contributions of this study are presented and discussed in
Section 4. It is followed by
Section 5, which summarizes the contributions of this study and outlines the scope of future research in this area.
4. Results and Discussion
This section presents and discusses the results of this study. As stated in
Section 3, Algorithm 2 was run on the dataset to detect the gender of each Twitter user. After obtaining the output from this algorithm, the classifications were manually verified as well and the ‘maybe’ labels were manually classified as either male, female, or none. Thereafter, the dataset contained only three labels for the “Gender” attribute – male, female, and none.
Figure 3 shows a pie chart-based representation of the same. As can be seen from
Figure 3, out of the tweets posted by males and females, males posted a higher percentage of the tweets.
The results obtained from Algorithm 3 are presented next.
Figure 4 presents a pie chart to show the percentage of tweets in each of the sentiment classes (positive, negative, and neutral) by taking all the genders together. As can be seen from this Figure, the percentages of positive, negative, and neutral tweets as per VADER were 41.704%, 29.932%, and 28.364%, respectively.
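As an illustration of how a VADER compound score is typically mapped to the three sentiment classes reported above, a minimal sketch is given below. It assumes the cut-offs of +0.05 and -0.05 that are commonly suggested for VADER; the cut-offs used in Algorithm 3 may differ. The function names and the example scores are hypothetical and are not taken from the dataset.

```python
from collections import Counter

# Assumed cut-offs: the thresholds commonly suggested for VADER's compound score.
POSITIVE_THRESHOLD = 0.05
NEGATIVE_THRESHOLD = -0.05


def label_from_compound(compound: float) -> str:
    """Map a compound score in [-1, 1] to a sentiment class."""
    if compound >= POSITIVE_THRESHOLD:
        return "positive"
    if compound <= NEGATIVE_THRESHOLD:
        return "negative"
    return "neutral"


def class_percentages(compound_scores):
    """Return the percentage of scores that fall in each sentiment class."""
    counts = Counter(label_from_compound(score) for score in compound_scores)
    total = sum(counts.values())
    if total == 0:
        # No tweets: report zero for every class instead of dividing by zero.
        return {label: 0.0 for label in ("positive", "negative", "neutral")}
    return {label: 100.0 * counts[label] / total
            for label in ("positive", "negative", "neutral")}


# Hypothetical compound scores, not values from the dataset.
example_scores = [0.62, -0.31, 0.0, 0.05, -0.05, 0.01, 0.44, -0.7]
percentages = class_percentages(example_scores)
```

The same counting step applies unchanged to labels produced by any other sentiment analysis approach, since only the mapping from score to class is specific to the tool.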
Next, for each sentiment class (positive, negative, and neutral), the distribution of tweets posted by males, females, and Twitter accounts assigned a ‘none’ gender label was calculated. The results are shown in
Figure 5,
Figure 6 and
Figure 7.
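The per-class gender distribution described above amounts to counting tweets by gender label within each sentiment class and normalizing each class to 100%. A minimal sketch is shown below; the function name `gender_share_by_class` and the example rows are hypothetical, and the sketch is not the authors' implementation.

```python
from collections import Counter, defaultdict


def gender_share_by_class(records):
    """For each sentiment class, return the percentage of tweets per gender label.

    `records` is an iterable of (sentiment, gender) pairs.
    """
    counts = defaultdict(Counter)
    for sentiment, gender in records:
        counts[sentiment][gender] += 1

    shares = {}
    for sentiment, gender_counts in counts.items():
        total = sum(gender_counts.values())
        shares[sentiment] = {gender: 100.0 * n / total
                             for gender, n in gender_counts.items()}
    return shares


# Hypothetical rows, not values from the dataset.
rows = [
    ("positive", "male"), ("positive", "male"), ("positive", "female"),
    ("positive", "none"), ("negative", "male"), ("negative", "female"),
    ("neutral", "none"), ("neutral", "male"),
]
shares = gender_share_by_class(rows)
```

Each inner dictionary corresponds to one pie chart: its values sum to 100 for the sentiment class in question.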
As can be seen from
Figure 5,
Figure 6 and
Figure 7, for each sentiment label (positive, negative, and neutral), between males and females, males posted a higher percentage of tweets. A similar analysis was performed by applying Algorithms 4 and 5 to the dataset. The results of applying Algorithm 4 are presented in
Figure 8,
Figure 9,
Figure 10 and
Figure 11.
As can be seen from
Figure 8, the percentage of positive tweets (as per the Afinn approach for sentiment analysis) was higher than the percentage of negative and neutral tweets. This is consistent with the findings from VADER (presented in Figure 4). Furthermore,
Figure 9,
Figure 10 and
Figure 11 show that for each sentiment label (positive, negative, and neutral) between males and females, males posted a higher percentage of the tweets. This finding is also consistent with the results obtained from VADER as shown in
Figure 5,
Figure 6 and
Figure 7. The results of applying TextBlob for performing sentiment analysis are shown in
Figure 12,
Figure 13,
Figure 14 and
Figure 15.
From
Figure 12, it can be inferred that, as per TextBlob, the percentage of positive tweets was higher as compared to the percentage of negative and neutral tweets. This is consistent with the results of VADER (
Figure 4) and Afinn (
Figure 8). Furthermore,
Figure 13,
Figure 14 and
Figure 15 show that for each sentiment class (positive, negative, and neutral), between males and females, males posted a higher percentage of the tweets. Once again, the results are consistent with the observations from VADER (
Figure 5,
Figure 6 and
Figure 7) and Afinn (
Figure 9,
Figure 10 and
Figure 11). In addition to sentiment analysis, TextBlob also computed the subjectivity of each tweet and categorized each tweet as highly opinionated, least opinionated, or neutral. The results of the same are shown in
Figure 16,
Figure 17,
Figure 18 and
Figure 19.
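TextBlob reports subjectivity as a value between 0 (objective) and 1 (subjective). The three subjectivity classes can be obtained by binning this value, as in the minimal sketch below. The cut-offs `low` and `high` are hypothetical placeholders chosen for illustration only; the cut-offs used in this study are not restated here and may differ.

```python
def subjectivity_class(score: float, low: float = 0.4, high: float = 0.6) -> str:
    """Bin a subjectivity score in [0, 1] into one of three classes.

    `low` and `high` are illustrative cut-offs, not the values used in the study.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError("subjectivity score must lie in [0, 1]")
    if score < low:
        return "least opinionated"
    if score > high:
        return "highly opinionated"
    return "neutral"


# Hypothetical subjectivity scores, not values from the dataset.
example_scores = [0.0, 0.1, 0.5, 0.9, 0.35, 0.75]
labels = [subjectivity_class(score) for score in example_scores]
```

Keeping the cut-offs as parameters makes it straightforward to check how sensitive the class percentages are to the chosen boundaries.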
As can be seen from
Figure 16, a majority of the tweets were least opinionated. To add to this,
Figure 17,
Figure 18 and
Figure 19 show that for each subjectivity class (i.e., highly opinionated, least opinionated, and neutral), between males and females, males posted a higher percentage of the tweets. The results obtained from Algorithm 6 are discussed next. This algorithm analyzed all the tweets and categorized each of them into one of six toxicity classes: toxicity, obscene, identity attack, insult, threat, and sexually explicit. The number of tweets that were classified into each of these classes was 36,081, 8,729, 3,411, 1,165, 18, and 4, respectively. This is shown in
Figure 20. Thereafter, the percentage of tweets posted by each gender for each of these categories of toxic content was analyzed and the results are presented in
Figure 21,
Figure 22,
Figure 23,
Figure 24,
Figure 25 and
Figure 26.
From
Figure 21,
Figure 22,
Figure 23 and
Figure 24, it can be seen that for the classes – toxicity, obscene, identity attack, and insult, between males and females, males posted a higher percentage of the tweets. Furthermore,
Figure 25 shows that no tweet from females was assigned a threat label.
Figure 26 shows that for those tweets that were categorized as sexually explicit, between males and females, females posted a higher percentage of those tweets. It is worth mentioning here that the results of
Figure 25 and
Figure 26 are based on data that constitutes less than 1% of the tweets present in the dataset. So, in a real-world scenario, these percentages could vary when a greater number of tweets are posted for each of the two categories – threat and sexually explicit.
In addition to analyzing the varying trends in sentiments and toxicity, the content of the underlying tweets was also analyzed using word clouds. For the generation of these word clouds, the top 100 words (in terms of frequency) were considered. To perform the same, a consensus of sentiment labels from the three different sentiment analysis approaches was considered. For instance, to prepare a word cloud of positive tweets, all those tweets that were labeled as positive by VADER, Afinn, and TextBlob were considered. A word cloud was developed to represent the same. Thereafter, for all the positive tweets, gender-specific tweeting patterns were also analyzed to compute the top 100 words used by males for positive tweets, the top 100 words used by females for positive tweets, and the top 100 words used by Twitter accounts associated with a none gender label. A high degree of overlap in terms of the 100 words for all these scenarios was observed. More specifically, a total of 79 words were common amongst the lists of the top 100 words for positive tweets, the top 100 words used by males for positive tweets, the top 100 words used by females for positive tweets, and the top 100 words used by Twitter accounts associated with a none gender label. So, to avoid redundancy,
Figure 27 shows a word cloud-based representation of the top 100 words used in positive tweets. Similarly, a high degree of overlap in terms of the 100 words was also observed for the analysis of different lists for negative tweets and neutral tweets. So, to avoid redundancy,
Figure 28 and
Figure 29 show word cloud-based representations of the top 100 words used in negative tweets and neutral tweets, respectively. In a similar manner, the top 100 frequently used words for the different subjectivity classes were also computed and word cloud-based representations of the same are shown in
Figure 30,
Figure 31 and
Figure 32.
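The consensus filtering and word-overlap steps described above can be sketched as follows. This is a minimal illustration under stated assumptions: the tokenizer (lowercase alphabetic tokens, no stop-word removal), the function names, and the example tweets are all hypothetical, and a small `n` is used in place of 100 so that the example stays readable.

```python
import re
from collections import Counter


def consensus_label(vader: str, afinn: str, textblob: str):
    """Return the shared label when all three approaches agree, else None."""
    return vader if vader == afinn == textblob else None


def top_words(texts, n=100):
    """Return the n most frequent lowercase word tokens across the texts."""
    counts = Counter()
    for text in texts:
        counts.update(re.findall(r"[a-z']+", text.lower()))
    return [word for word, _ in counts.most_common(n)]


# Hypothetical tweets as (text, gender, VADER, Afinn, TextBlob) tuples.
tweets = [
    ("online learning is great", "male", "positive", "positive", "positive"),
    ("great online classes today", "female", "positive", "positive", "positive"),
    ("online exams were fine", "none", "positive", "neutral", "positive"),
]

# Keep only tweets that all three approaches label as positive.
positive = [t for t in tweets if consensus_label(*t[2:]) == "positive"]

all_top = top_words([t[0] for t in positive], n=3)
male_top = top_words([t[0] for t in positive if t[1] == "male"], n=3)
female_top = top_words([t[0] for t in positive if t[1] == "female"], n=3)

# Words shared by the overall list and both gender-specific lists.
common = set(all_top) & set(male_top) & set(female_top)
```

The size of `common` relative to `n` is the overlap measure: with lists of 100 words each, a value such as 79 indicates that the gender-specific lists differ from the overall list in only a small number of words.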
After performing this analysis, a similar word frequency-based analysis was performed for the different categories of toxic content that were detected in the tweets using Algorithm 6. These classes were toxicity, obscene, identity attack, insult, threat, and sexually explicit. As explained in Algorithm 6, each tweet was assigned a score for each of these classes and whichever class received the highest score, the label of the tweet was decided accordingly. For instance, if the toxicity score for a tweet was higher than the scores that the tweet received for the classes - obscene, identity attack, insult, threat, and sexually explicit, then the label of that tweet was assigned as toxicity. Similarly, if the obscene score for a tweet was higher than the scores that the tweet received for the classes - toxicity, identity attack, insult, threat, and sexually explicit, then the label of that tweet was assigned as obscene. The results of this word cloud-based analysis for the top 100 words (in terms of frequency) for each of these classes are shown in
Figure 33,
Figure 34,
Figure 35,
Figure 36,
Figure 37 and
Figure 38.
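The labeling rule explained above, where a tweet receives the label of whichever class has the highest score, can be sketched as follows. The dictionary layout, the function name `dominant_toxicity_label`, the tie-breaking rule, and the example scores are assumptions made for illustration and are not taken from Algorithm 6.

```python
# Class names as used in the text; the dictionary layout is hypothetical.
TOXICITY_CLASSES = (
    "toxicity", "obscene", "identity attack",
    "insult", "threat", "sexually explicit",
)


def dominant_toxicity_label(scores):
    """Return the class with the highest score for one tweet.

    Ties are broken in favor of the class listed first in TOXICITY_CLASSES,
    which is an assumption of this sketch.
    """
    missing = [name for name in TOXICITY_CLASSES if name not in scores]
    if missing:
        raise KeyError(f"missing scores for: {missing}")
    return max(TOXICITY_CLASSES, key=lambda name: scores[name])


# Hypothetical per-class scores for two tweets, not values from the dataset.
tweet_a = {"toxicity": 0.71, "obscene": 0.12, "identity attack": 0.03,
           "insult": 0.40, "threat": 0.01, "sexually explicit": 0.02}
tweet_b = {"toxicity": 0.20, "obscene": 0.55, "identity attack": 0.02,
           "insult": 0.10, "threat": 0.01, "sexually explicit": 0.04}

labels = [dominant_toxicity_label(tweet_a), dominant_toxicity_label(tweet_b)]
```

Because every tweet is assigned exactly one label under this rule, the class counts add up to the number of tweets scored, even when all six scores for a tweet are low.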
As can be seen from
Figure 33,
Figure 34,
Figure 35,
Figure 36,
Figure 37 and
Figure 38 the patterns of communication were diverse for each of the categories of toxic content designated by the classes - toxicity, identity attack, insult, threat, and sexually explicit. At the same time,
Figure 37 and
Figure 38 appear significantly different in terms of the top 100 words used. This also shows that for tweets that were categorized as threat (
Figure 37) and as containing sexually explicit content (
Figure 38) the paradigms of communication and information exchange in those tweets were very different as compared to tweets categorized into any of the remaining classes representing toxic content. In addition to performing this word cloud-based analysis, the scores each of these classes received were analyzed to infer the trends of their intensities over time. To perform this analysis, the mean value of each of these classes was computed per month and the results were plotted in a graphical manner as shown in
Figure 39.
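The monthly aggregation described above can be sketched as grouping the per-tweet scores by calendar month and taking the mean of each class. The input layout, the function name `monthly_mean_scores`, and the example rows below are hypothetical and serve only to illustrate the computation.

```python
from collections import defaultdict
from datetime import date
from statistics import mean


def monthly_mean_scores(rows):
    """Average each class score per calendar month.

    `rows` is an iterable of (date, {class_name: score}) pairs.
    Returns {(year, month): {class_name: mean_score}} in chronological order.
    """
    buckets = defaultdict(lambda: defaultdict(list))
    for day, scores in rows:
        for name, value in scores.items():
            buckets[(day.year, day.month)][name].append(value)
    return {
        month: {name: mean(values) for name, values in per_class.items()}
        for month, per_class in sorted(buckets.items())
    }


# Hypothetical rows, not values from the dataset.
rows = [
    (date(2021, 11, 9), {"toxicity": 0.2, "insult": 0.1}),
    (date(2021, 11, 20), {"toxicity": 0.4, "insult": 0.3}),
    (date(2021, 12, 1), {"toxicity": 0.1, "insult": 0.0}),
]
monthly = monthly_mean_scores(rows)
```

Plotting one line per class over the sorted month keys gives a trend chart of the kind described; restricting `rows` to tweets from one gender label yields the corresponding gender-specific trend.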
From
Figure 39, several insights related to the tweeting patterns of the general public can be inferred. For instance, the intensity of toxicity was higher than the intensity of obscene, identity attack, insult, threat, and sexually explicit content. Similarly, the intensity of insult was higher than the intensity of obscene, identity attack, threat, and sexually explicit content. Next, gender-specific tweeting patterns for each of these categories of toxic content were analyzed to understand the trends of the same. These results are shown in
Figure 40,
Figure 41,
Figure 42,
Figure 43,
Figure 44 and
Figure 45. This analysis also helped to unravel multiple paradigms of tweeting behavior of different genders in the context of online learning during COVID-19. For instance,
Figure 40 and
Figure 44 show that the intensity of toxicity and threat in tweets by males and females has increased since July 2022. The analysis shown in
Figure 41 shows that the intensity of obscene content in tweets by males and females has decreased since May 2022.
The result of Algorithm 7 is shown in
Figure 46. As can be seen from this Figure, between males and females, the average activity of females has been higher in all months other than March 2022. The results from Algorithm 7 are presented in
Figure 47 and
Figure 48.
Figure 47 shows the trends in tweets about online learning during COVID-19 posted by males from different countries of the world. Similarly,
Figure 48 shows the trends in tweets about online learning during COVID-19 posted by females from different countries of the world.
Figure 47 and
Figure 48 reveal the patterns of posting tweets by males and females about online learning during COVID-19. These patterns include similarities as well as differences. For instance, from these two figures, it can be inferred that in India a higher percentage of the tweets were posted by males as compared to females. However, in Australia, a higher percentage of the tweets were posted by females as compared to males. Finally, a comparative study is presented in
Table 2 where the focus area of this work is compared with the focus areas of prior works in this field to highlight its novelty and relevance. As can be seen from this Table, the work presented in this paper is the first work in this area of research where the focus area has included text analysis, sentiment analysis, analysis of toxic content, and subjectivity analysis of tweets about online learning during COVID-19. It is worth mentioning here that the work by Martinez et al. [
101] considered only two types of toxic content – insults and threats whereas the work presented in this paper performs the detection of six types of toxic content - toxicity, obscene, identity attack, insult, threat, and sexually explicit. Furthermore, no prior work in this field has performed a gender-specific analysis of tweets about online learning during COVID-19. As this paper analyzes the tweeting patterns in terms of gender, the authors would like to clarify three aspects. First, the results presented and discussed in this paper aim to address the research gaps in this field (as discussed in
Section 2). These results are not presented with the intention to comment on any gender directly or indirectly. Second, the authors respect the gender identity of every individual and do not intend to comment on the same in any manner by presenting these results. Third, the authors respect every gender identity and associated pronouns [
126]. The results presented in this paper take into account only three gender categories – male, female, and none as the GenderPerformr package (the current state-of-the-art method that predicts gender from usernames) has limitations.
5. Conclusions
To reduce the rapid spread of the SARS-CoV-2 virus, several universities, colleges, and schools across the world transitioned to online learning. This was associated with a range of emotions in students, educators, and the general public who used social media platforms such as Twitter during this time to share and exchange information, views, and perspectives related to online learning leading to the generation of Big Data. Twitter has been popular amongst researchers from different domains for the investigation of patterns of public discourse related to different topics. Furthermore, out of several social media platforms, Twitter has the highest gender gap as of 2023. There have been a few works published in the last few months where sentiment analysis of tweets about online learning during COVID-19 was performed. However, those works have multiple limitations centered around a lack of reporting from multiple sentiment analysis approaches, a lack of focus on subjectivity analysis, a lack of focus on toxicity analysis, and a lack of focus on gender-specific tweeting patterns. The work presented in this paper aims to address these research gaps as well as aims to contribute towards advancing research and development in this field. A dataset comprising about 50,000 Tweets about online learning during COVID-19, posted on Twitter between November 9, 2021, and July 13, 2022, was analyzed for this study. This work reports multiple novel findings. First, the results of sentiment analysis from VADER, Afinn, and TextBlob show that a higher percentage of the tweets were positive. The results of gender-specific sentiment analysis indicate that for positive tweets, negative tweets, and neutral tweets, between males and females, males posted a higher percentage of the tweets. Second, the results from subjectivity analysis show that the percentage of least opinionated, neutral opinionated, and highly opinionated tweets were 56.568%, 30.898%, and 12.534%, respectively. 
The gender-specific results for subjectivity analysis show that for each subjectivity class (least opinionated, neutral opinionated, and highly opinionated) males posted a higher percentage of tweets as compared to females. Third, toxicity detection was applied to the tweets to detect different categories of toxic content - toxicity, obscene, identity attack, insult, threat, and sexually explicit. The gender-specific analysis of the percentage of tweets posted by each gender in each of these categories revealed several novel insights. For instance, males posted a higher percentage of tweets that were categorized as toxicity, obscene, identity attack, insult, and threat, as compared to females. However, for the sexually explicit category, females posted a higher percentage of tweets as compared to males. Fourth, gender-specific tweeting patterns for each of these categories of toxic content were analyzed to understand the trends of the same. These results unraveled multiple paradigms of tweeting behavior of different genders in the context of online learning during COVID-19. For instance, the results show that the intensity of toxicity and threat in tweets by males and females has increased since July 2022. To add to this, the intensity of obscene content in tweets by males and females has decreased since May 2022. Fifth, the average activity of males and females per month in this time range was also investigated. The findings indicate that the average activity of females has been higher in all months as compared to males other than March 2022. Finally, country-specific tweeting patterns of males and females were also investigated which presented multiple novel insights. For instance, in India, a higher percentage of tweets about online learning during COVID-19 were posted by males as compared to females. However, in Australia, a higher percentage of such tweets were posted by females as compared to males. 
To the best of the authors’ knowledge, no similar work has been done in this field thus far. Future work in this area would involve performing gender-specific topic modeling to investigate the similarities and differences in terms of the topics that have been represented in the tweets posted by males and females.
Author Contributions
Conceptualization, N.T.; methodology, N.T., S.C., K.K., Y.N.D.; software, N.T., S.C., K.K., Y.N.D., M.S.; validation, N.T.; formal analysis, N.T., K.K., S.C., Y.N.D., V.K.; investigation, N.T., K.K., S.C., Y.N.D.; resources, N.T., K.K., S.C., Y.N.D.; data curation, N.T. and S.Q.; writing – original draft preparation, N.T., V.K., K.K., M.S., Y.N.D., S.C.; writing – review and editing, N.T.; visualization, N.T., S.C., K.K., Y.N.D.; supervision, N.T.; project administration, N.T.; funding acquisition, not applicable.
Figure 1.
The variation of social media use by gender from the findings of a survey conducted by the Pew Research Center from 2005 to 2021.
Figure 2.
A flowchart representing the working of Algorithm 1 to Algorithm 6 for the development of the master dataset.
Figure 3.
A pie chart to represent different genders from the “Gender” attribute.
Figure 4.
A pie chart to represent the distribution of positive, negative, and neutral sentiments (as per VADER) in the tweets.
Figure 5.
A pie chart to represent the percentage of positive tweets (as per VADER) posted by each gender.
Figure 6.
A pie chart to represent the percentage of negative tweets (as per VADER) posted by each gender.
Figure 7.
A pie chart to represent the percentage of neutral tweets (as per VADER) posted by each gender.
Figure 8.
A pie chart to represent the distribution of positive, negative, and neutral sentiments (as per Afinn) in the tweets.
Figure 9.
A pie chart to represent the percentage of positive tweets (as per Afinn) posted by each gender.
Figure 10.
A pie chart to represent the percentage of negative tweets (as per Afinn) posted by each gender.
Figure 11.
A pie chart to represent the percentage of neutral tweets (as per Afinn) posted by each gender.
Figure 12.
A pie chart to represent the distribution of positive, negative, and neutral sentiments (as per TextBlob) in the tweets.
Figure 13.
A pie chart to represent the percentage of positive tweets (as per TextBlob) posted by each gender.
Figure 14.
A pie chart to represent the percentage of negative tweets (as per TextBlob) posted by each gender.
Figure 15.
A pie chart to represent the percentage of neutral tweets (as per TextBlob) posted by each gender.
Figure 16.
A pie chart to represent the results of subjectivity analysis using TextBlob.
Figure 17.
A pie chart to represent the percentage of highly opinionated tweets (as per TextBlob) posted by each gender.
Figure 18.
A pie chart to represent the percentage of least opinionated tweets (as per TextBlob) posted by each gender.
Figure 19.
A pie chart to represent the percentage of tweets for the neutral subjectivity class (as per TextBlob) posted by each gender.
Figure 20.
Representation of the variation of different categories of toxic content present in the tweets.
Figure 21.
A pie chart to represent the percentage of tweets for the toxicity class (as per Detoxify) posted by each gender.
Figure 22.
A pie chart to represent the percentage of tweets for the obscene class (as per Detoxify) posted by each gender.
Figure 23.
A pie chart to represent the percentage of tweets for the identity attack class (as per Detoxify) posted by each gender.
Figure 24.
A pie chart to represent the percentage of tweets for the insult class (as per Detoxify) posted by each gender.
Figure 25.
A pie chart to represent the percentage of tweets for the threat class (as per Detoxify) posted by each gender.
Figure 26.
A pie chart to represent the percentage of tweets for the sexually explicit class (as per Detoxify) posted by each gender.
Figure 27.
A word cloud-based representation of the 100 most frequently used words in positive tweets.
Figure 28.
A word cloud-based representation of the 100 most frequently used words in negative tweets.
Figure 29.
A word cloud-based representation of the 100 most frequently used words in neutral tweets.
Figure 30.
A word cloud-based representation of the 100 most frequently used words in tweets that were highly opinionated.
Figure 31.
A word cloud-based representation of the 100 most frequently used words in tweets that were least opinionated.
Figure 32.
A word cloud-based representation of the 100 most frequently used words in tweets that were categorized as having a neutral opinion.
Figure 33.
A word cloud-based representation of the 100 most frequently used words in tweets that belonged to the toxicity category.
Figure 34.
A word cloud-based representation of the 100 most frequently used words in tweets that belonged to the obscene category.
Figure 35.
A word cloud-based representation of the 100 most frequently used words in tweets that belonged to the identity attack category.
Figure 36.
A word cloud-based representation of the 100 most frequently used words in tweets that belonged to the insult category.
Figure 37.
A word cloud-based representation of the 100 most frequently used words in tweets that belonged to the threat category.
Figure 38.
A word cloud-based representation of the 100 most frequently used words in tweets that belonged to the sexually explicit category.
Figure 39.
A graphical representation of the variation of the intensities of different categories of toxic content on a monthly basis.
Figure 40.
A graphical representation of the variation of the intensity of toxicity on a monthly basis by different genders.
Figure 41.
A graphical representation of the variation of the intensity of obscene content on a monthly basis by different genders.
Figure 42.
A graphical representation of the variation of the intensity of identity attacks on a monthly basis by different genders.
Figure 43.
A graphical representation of the variation of the intensity of insult on a monthly basis by different genders.
Figure 44.
A graphical representation of the variation of the intensity of threat on a monthly basis by different genders.
Figure 45.
A graphical representation of the variation of the intensity of sexually explicit content on a monthly basis by different genders.
Figure 46.
A graphical representation of the variation of the average activity on Twitter (in the context of tweeting about online learning during COVID-19) on a monthly basis.
Figure 47.
Representation of the trends in tweets about online learning during COVID-19 posted by males from different countries of the world.
Figure 48.
Representation of the trends in tweets about online learning during COVID-19 posted by females from different countries of the world.
Table 1.
Gender Diversity in Different Social Media Platforms.
Social Media Platform | Percentage of Male Users | Percentage of Female Users |
Twitter | 63 | 37 |
Instagram | 51.8 | 48.2 |
Tumblr | 52 | 48 |
WhatsApp | 53.2 | 46.7 |
WeChat | 53.5 | 46.5 |
Quora | 55 | 45 |
Facebook | 56.3 | 43.7 |
LinkedIn | 57.2 | 42.8 |
Telegram | 58.6 | 41.4 |
Sina Weibo | 51 | 49 |
QQ | 51.7 | 48.3 |
SnapChat | 48.2 | 51 |
Table 2.
A comparative study of this work with prior works in this field in terms of focus areas.
Work | Text Analysis of Tweets about Online Learning during COVID-19 | Sentiment Analysis of Tweets about Online Learning during COVID-19 | Analysis of Types of Toxic Content in Tweets about Online Learning during COVID-19 | Subjectivity Analysis of Tweets about Online Learning during COVID-19 |
Sahir et al. [20] | | ✓ | | |
Althagafi et al. [21] | | ✓ | | |
Ali et al. [22] | | ✓ | | ✓ |
Alcober et al. [23] | | ✓ | | |
Remali et al. [24] | | ✓ | | |
Senadhira et al. [25] | ✓ | ✓ | | |
Lubis et al. [26] | ✓ | ✓ | | |
Arambepola [27] | ✓ | ✓ | | |
Isnain et al. [28] | ✓ | ✓ | | |
Aljabri et al. [29] | ✓ | ✓ | | |
Asare et al. [30] | ✓ | ✓ | | ✓ |
Mujahid et al. [31] | ✓ | ✓ | | |
Al-Obeidat et al. [32] | | ✓ | | |
Waheeb et al. [33] | ✓ | ✓ | | |
Rijal et al. [34] | | ✓ | | |
Martinez [36] | | | ✓ | |
Thakur et al. [this work] | ✓ | ✓ | ✓ | ✓ |