This research focuses on understanding and addressing filter bubbles and the resulting polarization on social media platforms. As our dependence on these platforms as primary sources of news and communication grows, it has become critical to analyze the hazards of content recommendation and the impact of this algorithmic process on society at large. To the best of our knowledge, this work takes a new approach to this complex issue. Previous works in this domain were limited to selecting a few factors that affect polarization and adjusting them, whereas this research explores a variety of factors and quantifies the polarization level of a network by assessing every potentially impactful factor. We then show how a small number of adjustments can yield a considerably less polarized network, one that can break through the confines of echo chambers and support a more balanced and inclusive society. In this way, this work relates to serendipity-based recommendation systems, as we aim to develop a recommendation system that connects polarized users with each other.
2.2. Polarization Measures
Polarization detection is mostly done using graph- and network-based techniques that detect clusters or communities among graph nodes. Akoglu [30] models this problem as a node classification task on edge-signed bipartite opinion networks. This work applies only to networks that can be modelled as bipartite graphs, which is not the most general case.
The next work we discuss [31] frames the problem of quantifying the polarization level of a social network as a boundary problem. They consider a graph divided into two communities, G1 and G2, each of which has a boundary. A boundary node is a node that has a connection with the other community, whereas an internal node is a node that lives in its silo without any direct connection to the other community. They then argue that polarization can be measured from the proportion of edges that the boundary nodes have with the internal nodes of their own community. If the boundary nodes have more edges with members of their own community than with members of the other community, the network is more polarized. If, however, the boundary nodes have fewer edges with members of their own community than with nodes of the other community, the network is less polarized. The measure they propose ensures that polarization lies between -0.5 and 0.5. They argue that a concentration of high-degree nodes in the boundary corresponds to the absence of polarization. This is not always true, so the method can only be applied to some graphs, because it does not properly consider the structure of the network. For example, the measure cannot capture the case where one community has far more nodes connected to its internal nodes than the other community does.
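The boundary-based measure described above can be sketched as follows. This is a minimal illustration under our own assumptions, not the implementation of [31]: the graph is an undirected adjacency dictionary, the function name `boundary_polarization` is hypothetical, and boundary nodes without any internal neighbour are simply skipped.

```python
def boundary_polarization(adj, community):
    """Boundary-based polarization sketch; the result lies in [-0.5, 0.5].

    adj: dict node -> set of neighbours (undirected graph).
    community: dict node -> community label (two labels expected).
    """
    # A node is internal if none of its neighbours belong to the other community.
    internal = {
        v for v in adj
        if all(community[u] == community[v] for u in adj[v])
    }
    scores = []
    for v in adj:
        if v in internal:
            continue
        # Edges from this boundary node to internal nodes of its own community.
        d_int = sum(1 for u in adj[v] if u in internal)
        # Edges from this boundary node to the other community.
        d_cross = sum(1 for u in adj[v] if community[u] != community[v])
        if d_int == 0:
            continue  # assumption: ignore boundary nodes with no internal neighbour
        scores.append(d_int / (d_int + d_cross) - 0.5)
    # Average over the boundary nodes that were scored.
    return sum(scores) / len(scores) if scores else 0.0
```

Positive values indicate that boundary nodes connect mostly to their own community, and negative values indicate that they connect mostly across communities.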
Another work [32] introduces a quantity called the center of gravity to determine how polarized a network is. They begin by partitioning the nodes into two sets: elites and listeners. The elites hold fixed opinions of their own. The listeners, on the other hand, start with an opinion of 0 and, at each time step, update their opinion to the average of their neighbors' opinions. They then consider the fraction of people holding a positive opinion and the fraction holding a negative opinion. The polarization is computed from how far the average opinion of one community is from that of the other community.
While calculating polarization from a distance is an innovative idea, this distance metric does not properly account for the structure of the network. The work of Hohmann et al. [33] therefore uses a more holistic Generalized Euclidean (GE) distance measure, which also takes the network structure into account and calculates the effort required to move between different opinions in the network.
This means quantitatively measuring how different people's beliefs or opinions are on a particular subject. It addresses different components that contribute to network polarization:
- Opinion component: how people's ideologies diverge.
- Structural component: how the network is structured (for example, Person X is friends with Person Y) and whether individuals connect mainly with like-minded individuals. If there is no community structure, each individual is connected to every other individual and is thus exposed to multiple views. If there are clear communities, individuals are only exposed to ideas within their community.
- Interplay between the opinion component and the structural component: depending on the system's meso-level organization, the same communities and opinions might produce varying degrees of polarization. Communities that can openly connect with one another regardless of their political views show less polarization than those that form increasingly severe echo chambers.
The polarization measure is modeled as a node vector distance problem. The distance relies on the effective "resistance" between two nodes in the system, which is estimated from the Moore-Penrose pseudo-inverse of the graph Laplacian L. This pseudo-inverse gives a good notion of the distance between two node vectors, say a and b. To apply GE to estimate polarization, they divide the opinion vector o into the vectors o+ and o-. The vector o+ contains all positive opinions and 0 otherwise, while o- contains the absolute values of all negative opinions and 0 otherwise. Their measure of polarization is then the GE distance between o+ and o-.
This measure can be interpreted as the average "distance" between randomly selected nodes in o+ and o-, weighted by how fervently these nodes hold their beliefs (for example, the distance between two nodes with opinions +1 and -1 is weighted more than if the nodes had opinions -0.1 and +0.1).
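The GE-based measure can be illustrated with a short sketch. As we understand the measure in [33], it is the square root of (o+ - o-)^T L† (o+ - o-), where L† is the Moore-Penrose pseudo-inverse of the graph Laplacian. The function name `ge_polarization` and the dense-matrix input format are our assumptions, and the code is only meant for small graphs.

```python
import numpy as np


def ge_polarization(adjacency, opinions):
    """Generalized Euclidean distance between the positive and negative opinion vectors."""
    A = np.asarray(adjacency, dtype=float)
    o = np.asarray(opinions, dtype=float)
    laplacian = np.diag(A.sum(axis=1)) - A   # L = D - A
    l_pinv = np.linalg.pinv(laplacian)        # Moore-Penrose pseudo-inverse of L
    o_pos = np.where(o > 0, o, 0.0)           # positive opinions, 0 otherwise
    o_neg = np.where(o < 0, -o, 0.0)          # absolute negative opinions, 0 otherwise
    diff = o_pos - o_neg
    # Clamp tiny negative values caused by floating-point error before the square root.
    return float(np.sqrt(max(diff @ l_pinv @ diff, 0.0)))
```

For two nodes with opinions +1 and -1, a weaker edge between them gives a larger value, which matches the intuition that opposing opinions separated by a sparse cut are more polarized.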
Next we discuss another method [34] that uses a popular, classical opinion formation model due to Friedkin and Johnsen (1990). Within this model, opinions are modeled as real numbers in the interval [-1, 1]. Each user has an internal opinion that is given as an input and is fixed. The user also has an expressed opinion, which depends on their internal opinion and on the opinions in the social network. If a user U takes a random walk on the social network, the expressed opinion of U is the expected opinion that the walk will reach. A high magnitude means that the individual is surrounded by like-minded individuals with extreme opinions, and a low magnitude means that U's social network has moderate and diverse opinions. This magnitude is the degree of polarization of user U. Polarization of the whole network is measured by the length of the vector z of expressed opinions under the L2 norm.
The Friedkin and Johnsen (1990) model is used to determine polarization because it assumes that each person i has a persistent internal opinion and an expressed opinion that is influenced by both their internal opinion and the opinions of their neighbors. Each expressed opinion is a weighted average of these quantities, in which a self-weight captures the weight an individual gives to their own internal opinion. Iterating this update, the opinions converge to z, the expressed opinion vector for the whole network. The polarization index is then defined from the length of this vector.
The edge weights and the self-weight also affect the random-walk probabilities, because they determine the likelihood that a given edge will be followed. A high self-weight, for instance, indicates that the user is more likely to be absorbed in her own opinion node than to follow a path through the network to another node. If a node has an equal likelihood of reaching positive and negative opinions, that is, a balanced view of the network's opinions, the value for that node is minimized. The value will be high, however, if the user is confined to a filter bubble of friends who share their radical viewpoints.
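The opinion formation process above can be sketched in a few lines. This is a hedged illustration: the update rule is the standard Friedkin-Johnsen weighted average, the uniform `self_weight` parameter is our simplification, and the polarization index is taken as the squared L2 norm of z divided by the number of nodes, which is our reading of [34] rather than a quoted definition.

```python
def fj_equilibrium(weights, internal, self_weight=1.0, tol=1e-10, max_iter=10_000):
    """Iterate the Friedkin-Johnsen update until the expressed opinions stop changing.

    weights: dict node -> dict neighbour -> edge weight.
    internal: dict node -> fixed internal opinion in [-1, 1].
    self_weight: weight each node gives its own internal opinion (assumed uniform).
    """
    z = dict(internal)  # start from the internal opinions
    for _ in range(max_iter):
        new_z = {}
        for i, s_i in internal.items():
            nbrs = weights.get(i, {})
            total = self_weight + sum(nbrs.values())
            # Weighted average of the internal opinion and the neighbours' expressed opinions.
            new_z[i] = (self_weight * s_i + sum(w * z[j] for j, w in nbrs.items())) / total
        delta = max(abs(new_z[i] - z[i]) for i in z)
        z = new_z
        if delta < tol:
            break
    return z


def polarization_index(z):
    """Squared L2 norm of the expressed opinions, normalised by the number of nodes."""
    return sum(v * v for v in z.values()) / len(z)
```

For two connected users with internal opinions +1 and -1 and unit weights, the expressed opinions settle at +1/3 and -1/3, so social influence pulls both users toward the middle without erasing their disagreement.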
Cinus et al. [35] investigate the role of people recommender algorithms, such as the "People You May Know" or "Who to Follow" features on social media, in creating echo chambers and polarization. These algorithms typically suggest connections based on network structure and shared interests, often leading to homophilic (similar) links. The paper evaluates three such algorithms (Directed Jaccard index, Personalized PageRank, and an opinion-biased algorithm) using two opinion dynamics models, the Bounded Confidence Model (BCM) and the Epistemological model, to simulate opinion changes during user interactions.
BCM assumes that interactions only change opinions when they are within a certain confidence interval, while the Epistemological model assumes that one opinion is factual and the other is its negation.
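A single BCM interaction can be sketched as below. The text does not specify which BCM variant [35] uses, so this is a Deffuant-style rule chosen for illustration; the parameter names `epsilon` (confidence threshold) and `mu` (convergence rate) and their default values are assumptions.

```python
def bcm_interact(x_i, x_j, epsilon=0.3, mu=0.5):
    """One pairwise interaction under a Deffuant-style bounded confidence rule."""
    if abs(x_i - x_j) >= epsilon:
        return x_i, x_j  # opinions are too far apart: nobody is influenced
    shift = mu * (x_j - x_i)
    # Both users move toward each other by the same amount.
    return x_i + shift, x_j - shift
```

With these defaults, two users at 0.1 and 0.3 meet in the middle, while users at -0.8 and 0.8 leave the interaction unchanged.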
The paper uses two global metrics to measure the impact of recommender systems on echo chambers and polarization: Neighbors Correlation Index (NCI) and Random Walk Controversy Score (RWC). NCI measures the correlation between a node’s opinion and the average opinion of its neighbors. A value of -1 indicates perfect anti-correlation, while a value of 1 indicates perfect correlation.
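The NCI description translates directly into a Pearson correlation. The sketch below assumes an adjacency dictionary and skips nodes without neighbours; the function name is hypothetical, and the code does not guard against the degenerate case where all opinions (or all neighbourhood averages) are identical.

```python
from math import sqrt


def neighbors_correlation_index(adj, opinion):
    """Pearson correlation between node opinions and mean neighbour opinions."""
    xs, ys = [], []
    for v, nbrs in adj.items():
        if not nbrs:
            continue  # a node without neighbours has no neighbourhood average
        xs.append(opinion[v])
        ys.append(sum(opinion[u] for u in nbrs) / len(nbrs))
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)
```

A network of two disconnected like-minded pairs gives a value of 1, whereas a network in which every edge links opposing opinions gives -1.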
RWC measures the difference in probabilities of two events: both random walks starting and ending in the same partition, and both random walks starting and ending in different partitions. High RWC values indicate low probability of crossing partitions (high polarization), while low values indicate high probability of crossing partitions (low polarization). However, this measure does not account for the size of communities or the total degree of nodes in the partitions, which are significant factors affecting polarization.
The study [36] explores the behavior of individuals within echo chambers or filter bubbles, where users are surrounded by like-minded individuals and there is little to no difference in opinions. It examines how these communities evolve over time, considering user activity and emotions expressed. The study uses three growth models: the Gompertz model, the Logistic model, and the Log-logistic model, and finds that both science-focused and conspiracy-focused communities show similar growth patterns.
The study measures community growth using user commenting activity as a proxy for engagement. Users are classified into three groups based on their commenting frequency, and their temporal evolution is tracked. A method is developed to classify users as "Scientific" or "Conspiracy Theorist" based on their interactions with labeled data on Facebook.
The study also conducts a sentiment analysis to model users’ emotional behavior as a function of their community involvement. It uses a Machine Learning Sentiment Classifier to classify comments into positive, neutral, and negative sentiments. Three metrics are used: Mean User Sentiment, Mean negative/positive difference of comments, and User sentiment polarization.
At the community level, the study calculates the Community negative/positive difference of comments and Mean community sentiment polarization. The findings suggest that as users become more active within echo chambers, they tend to express increasingly negative sentiments. Furthermore, user sentiment polarization is generally higher for science users compared to conspiracy users. However, as activity increases, science users tend to decrease their sentiment polarization, while conspiracy users tend to increase it.
The paper [37] takes a different approach to opinion dynamics, in which changes in opinion are observed over discrete steps or time intervals. It uses the Friedkin-Johnsen (FJ) model, which is known to converge to a set of equilibrium opinions over time. The expressed opinion of a node is calculated as a function of its internal opinion and the weighted sum of the opinions of its neighbors, divided by the degree of the vertex. The paper defines polarization as the squared error of the opinions in the network with respect to the mean opinion, summed over all nodes. The polarization therefore ranges from 0 (when all opinions are equal) to N (when half of the opinions are at -1 and the other half at 1). The paper also introduces the concept of disagreement, at both a local and a global level. Local disagreement is defined for a node as the sum of the squared differences between its opinion and the opinions of its neighbors, weighted by the connection strength. Global disagreement is an aggregate measure, defined as the sum of the local disagreements of all nodes, divided by two so that each edge is counted only once.
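The polarization and disagreement quantities described above can be sketched as follows. We assume a symmetric weight dictionary, and we sum (rather than average) the squared deviations so that the result matches the stated range of 0 to N; the function names are our own.

```python
def polarization(z):
    """Sum of squared deviations of the opinions from their mean."""
    mean = sum(z.values()) / len(z)
    return sum((v - mean) ** 2 for v in z.values())


def local_disagreement(weights, z, i):
    """Weighted squared differences between node i and each of its neighbours."""
    return sum(w * (z[i] - z[j]) ** 2 for j, w in weights.get(i, {}).items())


def global_disagreement(weights, z):
    """Sum of local disagreements, halved so that each edge is counted once."""
    return sum(local_disagreement(weights, z, i) for i in z) / 2
```

On a four-node path with opinions (1, 1, -1, -1), polarization equals 4, the number of nodes, and the only edge that contributes to disagreement is the one joining the two camps.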
2.3. Controversy Measures
The paper [38] presents an approach to quantifying the polarization level of a network by calculating a controversy score. This score is based on edge betweenness centrality, a measure of the significance of each edge in a network, which is determined by how often an edge lies on the shortest paths between different pairs of nodes.
In this context, a graph is divided into two communities with opposing opinions on a topic. The controversy measure is used to determine how well-separated these two communities are. The betweenness centrality of an edge is defined as the ratio of the number of shortest paths between nodes s and t that include the edge e, to the total number of shortest paths between nodes s and t.
The Kullback-Leibler (KL) divergence is then computed between the distribution of edge betweenness centrality of the crossing edges (those connecting the two communities) and that of the internal edges (those within a single community). The Betweenness Centrality Controversy (BCC) of the graph is defined as BCC = 1 - e^(-d_KL), where d_KL denotes this KL divergence.
The BCC value ranges from 0 to 1. A BCC value of 0 indicates a small divergence, meaning the betweenness centrality of the crossing and internal edges are similar, suggesting a diverse graph. A BCC value of 1 indicates a large divergence, meaning the betweenness centrality of the crossing and internal edges are very different, suggesting a polarized graph.
However, this model does not consider the strength of opinions across either side and assumes that opinions take one of two possible values instead of a spectrum, which may overlook some important details.
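To make the BCC computation concrete, the sketch below starts from edge betweenness values that are assumed to be precomputed (for example with a graph library) and split into crossing and internal edges. It uses BCC = 1 - e^(-d_KL), as we understand the definition in [38]. The histogram density estimate, the smoothing constant, and the direction of the KL divergence (crossing relative to internal) are our assumptions; the original work may estimate the densities differently.

```python
from math import exp, log


def histogram(values, bins, lo, hi, smoothing=1e-6):
    """Smoothed, normalised histogram of values over [lo, hi]."""
    counts = [smoothing] * bins          # smoothing keeps every bin non-zero
    width = (hi - lo) / bins or 1.0      # avoid a zero bin width
    for v in values:
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    total = sum(counts)
    return [c / total for c in counts]


def kl_divergence(p, q):
    """KL divergence of p from q for two discrete distributions."""
    return sum(pi * log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def bcc(cross_betweenness, internal_betweenness, bins=10):
    """Betweenness Centrality Controversy from two lists of edge betweenness values."""
    cross = list(cross_betweenness)
    internal = list(internal_betweenness)
    lo, hi = min(cross + internal), max(cross + internal)
    p = histogram(cross, bins, lo, hi)
    q = histogram(internal, bins, lo, hi)
    return 1.0 - exp(-kl_divergence(p, q))
```

Identical betweenness distributions give a score of 0, and the score approaches 1 when crossing edges carry much higher betweenness than internal edges.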
2.5. Methods of Polarization Reduction
We have come across various methods of computing polarization. After evaluating how polarized a given network is, the next step is to find effective methods for reducing this polarization. This has real-life implications as well, as it has been shown that social polarization hampers the economic growth of a society [44], so reducing it is important for a stable and smoothly running community. For this purpose, we conducted a detailed review of methods that can be used to reduce polarization.
Garimella et al. [45] propose an approach that adds new edges to the graph in order to provide exposure to opposing points of view. Likewise, [46] adopts an information diffusion approach. It proposes an algorithm to select seed nodes such that the exposure of nodes in a graph, as a result of information propagation, is balanced. Given two opposing campaigns, it selects seed nodes such that, after information is propagated through those nodes, the resulting information exposure of the graph is balanced. Moreover, Musco et al. [47] formulate the problem as selecting a set of nodes that minimises polarity as well as the disagreement that would occur due to the linking of individuals with different opinions. They propose an approach to find the optimal structure of a graph that would minimise both objectives. The former work focuses on setting the opinions of selected nodes to zero, corresponding to a neutral opinion, and does not consider the possibility of changing an individual's leaning towards the counter opinion. The latter aims to change the network topology, which might not be realistic in most scenarios. All these approaches share the common objective of minimising some measure of polarization by making recommendations to a subset of the nodes.
In [28], the authors assume binary opinions, so that every entry of the opinion vector takes one of two values. Given the weighted degree of node i, the cost of changing node i's opinion, and a budget k, the authors rank the nodes by a score based on these quantities, select the highest-scoring nodes, and flip their opinions. If all edge weights and node costs are set to 1, the problem simplifies to selecting the k nodes with the highest degrees. The authors report the results of flipping the opinions of the top-ranked nodes for three budget sizes, each expressed as a fraction of the n nodes. This method appears to break filter bubbles and increase the diversity of information exposure among connected individuals in social networks. However, flipping the opinions of such a large number of people in real life is neither practical nor realistic.
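The node selection step can be sketched as below. The exact ranking score of [28] is not available in the text above, so the sketch assumes a score equal to the weighted degree divided by the cost. We chose this hypothetical score only because it reduces to highest-degree selection when all weights and costs are 1, as the text states. Opinions are assumed to be encoded as +1 and -1.

```python
def flip_top_k(weights, cost, opinions, k):
    """Flip the binary opinions of the k nodes with the highest degree-to-cost score."""
    def score(i):
        weighted_degree = sum(weights.get(i, {}).values())
        return weighted_degree / cost[i]  # assumed ranking score, not taken from [28]

    chosen = sorted(opinions, key=score, reverse=True)[:k]
    flipped = dict(opinions)  # leave the input untouched
    for i in chosen:
        flipped[i] = -flipped[i]
    return flipped, chosen
```

On a star graph with unit weights and costs, a budget of one selects the hub, since it has the highest degree.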
In [29], the authors study the phenomenon of echo chambers on a Twitter dataset from 2017 and maximize the diversity of content exposed to users using a quadratic program that finds the best recommendations to show to a user. The paper focuses on optimizing personalized content recommendation policies to maximize the average diversity of newsfeeds across the platform.
Another method [34] aims to reduce the overall polarization of a given network by convincing people to adopt a more moderate opinion. Given a budget K, this research focuses on identifying the best set of individuals within this budget, such that moderating their opinions reduces the polarization of the whole network the most. They further define two variants of the problem:
- Moderate Internal: attempt to moderate the internal opinion of individuals and bring it to 0 (through educational interventions).
- Moderate External: attempt to moderate the external opinion of individuals and bring it to 0 (this can be done through incentives).
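The Moderate Internal variant can be illustrated with a simple greedy heuristic. This is a sketch under stated assumptions and may differ from the algorithms evaluated in [34]: it uses a Friedkin-Johnsen update with unit self-weight, a fixed number of iterations instead of a convergence test, and a polarization index equal to the mean squared expressed opinion. All function names are hypothetical.

```python
def expressed_opinions(weights, internal, iters=500):
    """Friedkin-Johnsen fixed point by repeated averaging (unit self-weight assumed)."""
    z = dict(internal)
    for _ in range(iters):
        z = {
            i: (internal[i] + sum(w * z[j] for j, w in weights.get(i, {}).items()))
            / (1.0 + sum(weights.get(i, {}).values()))
            for i in internal
        }
    return z


def polarization_index(z):
    """Mean squared expressed opinion."""
    return sum(v * v for v in z.values()) / len(z)


def greedy_moderate_internal(weights, internal, k):
    """Greedily pick k nodes whose internal opinion, set to 0, lowers polarization most."""
    current = dict(internal)
    chosen = []
    for _ in range(k):
        best, best_value = None, None
        for i in current:
            if i in chosen:
                continue
            trial = dict(current)
            trial[i] = 0.0  # moderate this node's internal opinion
            value = polarization_index(expressed_opinions(weights, trial))
            if best_value is None or value < best_value:
                best, best_value = i, value
        if best is None:
            break  # fewer than k nodes available
        chosen.append(best)
        current[best] = 0.0
    return chosen, polarization_index(expressed_opinions(weights, current))
```

Each greedy step re-solves the opinion dynamics once per candidate, so this sketch is only practical for small networks; it is intended to show the structure of the problem rather than an efficient solution.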
Having discussed various methods to quantify and reduce polarization along with their limitations, we now focus on devising a system to measure and reduce polarization. In the subsequent discussion, we do so while taking into account the major limitations of the previous works.