001. Parler Analysis, Part I: A Glance at Online Extremism
Have you heard about… Parler?
A plethora of social media platforms comes into public sight every year. While some make their breakthrough by inventing new features, others pave their way by catering to specific communities, for example, alt-right and extremist communities. Walking on the wild side, these platforms gained popularity after their mainstream counterparts, the “Big Tech” giants, namely Facebook and Twitter, took action to moderate user content (Jasser et al., 2021). Branding themselves with selling points like “free speech” and “express yourself openly” (Parler, 2021), they attract people who feel pressured by content regulation or who were blocked or suspended by platform moderators. Unfortunately, and ironically, these platforms didn’t evolve into bona fide free spaces, but into echo chambers for extreme right-wing ideologies, tin-foil-hat conspiracies, and flaky-loony misinformation.
Parler is one of these platforms, and sadly not the only one. It was on the front line of promoting misinformation about fraud in the 2020 U.S. presidential election. We all know what that misinformation entailed in real life: the Capitol riots in January 2021. And Parler, as the main hub for these conspiracies and misinformation, was also considered responsible for the coup attempt (Munn, 2021). This analysis got off the ground around that time as an attempt to understand how social media added fuel to the fire. The data was obtained from CapitolResources (https://github.com/rljacobson/CapitolResources/), an open-source data-scraping effort dedicated to Parler, before the platform was shut down by Amazon Web Services. The dataset contains 1.74 million text entries with some metadata. The analysis was conducted primarily in an exploratory fashion. Considering the distinctive nature of the different analytical approaches and the types of data involved, I’d like to split the analysis into separate blog posts.
Before we get to know Parler better, it’s useful to learn its specific lingo and features.
A user on Parler is called a Parleyer.
A post on Parler is referred to as a Parley.
A user can “echo” a Parley, just like you can retweet on Twitter.
A user can also “vote” on a Parley, just as you can like a tweet on Twitter.
The number of times a Parley appears in others’ feeds is called an “impression”. It is a metric for passive view counts.
What will be shown in this post? What have I done to the data and how?
In this first post of the Parler series, we will explore the use of external media sources on Parler and the platform’s most striking media bias. Media bias has been extensively researched on other social media, like Twitter (An et al., 2012), and it is of great importance as an instrument for understanding other topics, for instance, political leaning, misinformation dissemination, and selective information behaviors (Dinkov et al., 2019; Flaxman et al., 2016). Although I have simply stated that Parler is an alt-right social media platform, it’s more convincing once you see which media sources were used and how they circulated on the platform.
Even without the assumption that Parler is a highly biased platform, we could still learn what sources Parleyers acknowledge and give credence to. Normally, we cite others’ work as external endorsement for our own arguments when we write papers. By the same token, we can argue that the sources Parleyers linked to in their posts are the ones that passed their trust test.
Because we are only looking at sources referenced in the posts, I use only the subset of the data that has external links embedded. Furthermore, within all posts with external links, I will focus on the original posts, because echoes, just like retweets, are merely repetitions of the original posts.
Measuring media bias is a different story. For simplicity, I used a matching strategy: I scraped the Media Bias/Fact Check (MBFC) website and compiled a list of websites and the categories MBFC assigns to them. MBFC is one of the well-known databases for media bias and has been used in other research (Baly et al., 2018). The data harvested from MBFC not only has categories for political bias, from left to right, it also provides a factuality scale indicating the reliability of each source. For each URL found on Parler, I tried to match it against my MBFC database. Some simple preprocessing of the URLs was performed: I first trimmed Parler URLs down to the root domain, for example, https://www.bbc.com/news/business-59236432 becomes www.bbc.com. This preprocessing ensured the URL format was aligned between the scraped MBFC data and the Parler URLs.
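The trimming and matching can be sketched with Python’s standard library. This is a minimal illustration, not the exact pipeline: the `mbfc` dictionary below is a toy stand-in for the scraped catalog, and a real matcher may also need to normalize `www.` prefixes or ports.

```python
from urllib.parse import urlparse

def root_domain(url: str) -> str:
    """Trim a full URL down to its root domain (the netloc part)."""
    netloc = urlparse(url).netloc
    # Drop an embedded port, if any, and lowercase for matching.
    return netloc.split(":")[0].lower()

# Toy stand-in for the scraped MBFC catalog: domain -> (bias, factuality).
mbfc = {"www.bbc.com": ("left-center", "high")}

url = "https://www.bbc.com/news/business-59236432"
domain = root_domain(url)                     # "www.bbc.com"
bias, factuality = mbfc.get(domain, (None, None))
```

Unmatched domains simply fall back to `(None, None)` and are excluded from the bias and factuality analyses below.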
The MBFC database I scraped back in February 2021 cataloged 3,061 websites. For political bias, I used the 7-point scale that ranges from extreme left to extreme right. The distribution on the MBFC website is shown in Figure 1. Generally speaking, more left-leaning sites are listed in their catalog.
As for factuality, the distribution of sites on MBFC is shown in Figure 2. Again, the categories are not balanced: more data points come from sites with high factuality.
With all the data prepared and the overarching direction of investigating external references defined, I will look into the following specific questions:
How likely are Parleyers to reference external sources in their posts?
What are the most common sources being referenced on Parler? What do those sources look like in terms of bias and factuality?
How does the use of external sources interplay with user participation and interaction? Are posts with certain kinds of external sources better received than others?
Buckle up, things start to get wild.
Among the 1.7 million posts, including original posts and echoes, 331k used external sources. Most of them referenced only one article (327k), a few used two external links (2.9k), and very few had more than two external links in their posts.
112 different top-level domains (TLDs) were found in these 331k posts. The most common one was, no doubt, .com. The fifty most frequent domains are shown in Figure 3.
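Counting TLDs is a one-liner once the root domains are extracted. A quick sketch over a toy sample of domains (the actual counts came from the full 331k-post subset):

```python
from collections import Counter

def top_level_domain(domain: str) -> str:
    """Return the last dot-separated label of a root domain."""
    return domain.rsplit(".", 1)[-1]

# Toy sample of root domains already extracted from the posts.
domains = ["www.bbc.com", "breitbart.com", "americanconservatives.today",
           "bencarsonteam.club", "www.stopthesteal.wiki"]

tld_counts = Counter(top_level_domain(d) for d in domains)
# e.g. Counter({'com': 2, 'today': 1, 'club': 1, 'wiki': 1})
```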
You might want to know the details of sites with peculiar TLDs. I’ve hand-picked some top-ranking ones and listed them here.
Today: americanconservatives.today
Club: bencarsonteam.club, conservativefighters.club, voiceofamerica.club, buzzinewsalart.club, allenwest.club, americanpatriots.club, bencarson.club
Site: silentmajorityusa.site, americanconservatives.site, proudamerican.site, worldwidenews.site, wearenotsilent.site, theinsidepal.site, inpalestine.site, pgbc.snappages.site
Online: www.thepolitics.online, magafeeds.online, stopcensura.online, freedomforall.online, www.thinkaboutit.online
Wiki: www.stopthesteal.wiki, therabbithole.wiki
The 331k posts with external links come from only 3,450 unique domains, and 710 of those domains can be found in my MBFC dataset. Even so, these 710 sites still cover 225k posts, 68% of the posts with links. Filtering out the echoes, the dataset shrinks to 227k original posts. Within these, only 36k have external links, composing just 15.8% of all original posts and 11% of the posts with external links. This suggests roughly a 1:9 ratio of original posts to echoes among all posts with external links. It indicates that a majority of the posts simply repeat each other rather than create new, original content. Put differently, the major interaction pattern for Parleyers around posts with external sources is passive, considering that echoes add nothing new to the conversation.
I successfully matched 24k original posts (66.6%), which come from 610 unique domains. As shown in Figure 5, of all posts that referred to external sources, most referred to right-biased sources (89.15%), and almost half used extreme-right sources. On the other hand, original posts with left-leaning sources are few and far between. When we count unique sites instead, it is another story: quite a few left-biased sources have been cited on Parler, with 136 (22.41%) left-leaning websites found across all original posts. These are followed by least-biased, right-biased, and extreme-right sources. The distribution of bias across unique sites, though still skewed towards right-biased sources, somewhat resembles that of the MBFC catalog; it likewise has its largest collections in the left-center-biased and least-biased categories.
The differences shown in the plot imply that a myriad of original posts used a small group of extreme-right or right-biased sources, while the use of left-leaning sources is rather erratic and random. The use of right-wing sources is generally consistent, indicating that users are well versed in certain sources and might have developed a habit of citing them regularly. Another explanation could be a huge overlap between the audiences of Parler and of extreme-right media sites, so users naturally disseminate whatever appears on those sites. However, when they cite left-leaning sources, they tend to pick the ones that fit their arguments, and because not every left-leaning site will cover their points, they have to selectively search for, or even cherry-pick, the sites that work for their narrative.
Figure 6 shows the frequency of individual websites referenced on Parler, grouped by bias. You might see some familiar names here, for example, the Gateway Pundit, Breitbart News Network, and the Epoch Times. It’s no surprise to see these sites take the lion’s share of the frequencies, but it also confirms the nature of Parler as an extreme far-right bubble.
I did the same analysis for factuality, as shown in Figure 7. We observe the same pattern as for media bias: Parler posts primarily used questionable sources (82.49%), although a diversity of high-reliability sources was also cited. Again, a drastic post-site ratio indicates the prevalence and consistency of citing a shortlist of low-quality websites. The distribution of the most frequently cited sites, grouped by factuality, is shown in Figure 8. There is also a large overlap between low-factuality sources and right-biased sources.
The MBFC dataset additionally includes several special categories for websites, namely Conspiracy, Pseudo-Science, and Pro-Science. Figure 9 shows the distribution of websites flagged as conspiracy and pseudo-science, along with their severity. Although only a few posts were matched to these two categories, the matched websites tend to be severe conspiracy sites and strong pseudo-science sites; they are more of the go-hard-or-go-home type. The matched posts for pro-science and satire sources are peripheral: only 68 and 104 posts cited these sources, respectively.
The next question outlined above is how posts with certain kinds of bias or factuality interacted with users. I followed a straightforward approach: modeling the user interaction metrics (impressions, comments, votes, etc.) on the bias and factuality of the referenced source. Because the distributions of the dependent variables are usually very dispersed, I used negative binomial regression.
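The choice of negative binomial over plain Poisson regression follows from overdispersion: a Poisson model assumes the variance equals the mean, while social-media interaction counts typically have a long right tail and a variance far exceeding the mean. A quick stdlib check on a toy sample of impression counts (illustrative numbers, not the actual Parler data):

```python
import statistics

# Toy impression counts: mostly small values with a heavy right tail,
# the typical shape of social-media interaction metrics.
impressions = [0, 0, 1, 2, 2, 3, 5, 8, 40, 2500]

mean = statistics.mean(impressions)
var = statistics.variance(impressions)

# Variance far exceeds the mean -> overdispersed -> Poisson is a poor fit;
# negative binomial adds a dispersion parameter to absorb the extra variance.
overdispersed = var > mean
```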
The first batch of models explored the relationship between bias and user interactions. The models use extreme left as the reference level for bias. All coefficients for extreme right, right, and right-center are significant and positive. This implies that the log count of each interaction-based metric is expected to increase, holding the other predictors constant, compared with posts citing extreme-left sources. This result suggests that posts with extreme-left sources are the least well-received, usually inducing the least user interaction. The exception is the comments model, whose coefficient is significant but negative. Using the incidence rate ratio for a more intuitive reading (IRR = 0.3), posts with left-biased sources were received worse than posts with extreme-left sources. On top of this, it is surprising to see least-biased sources obtain the highest coefficients across the models.
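For readers unfamiliar with IRRs: an incidence rate ratio is just the exponentiated regression coefficient, so the reported IRR of 0.3 corresponds to a coefficient of about -1.20, meaning the expected count is roughly 30% of the reference level’s, other predictors held constant. (The coefficient below is back-computed from the reported IRR for illustration, not taken from the tables.)

```python
import math

coef = -1.20            # hypothetical coefficient, back-computed from IRR = 0.3
irr = math.exp(coef)    # IRR = exp(beta) for a negative binomial / Poisson GLM
# irr ~= 0.30: the expected count is about 30% of the reference level's
```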
Table 1 Bias as Independent Variables
The same modeling was conducted for factuality. The models in Table 2 all use High factuality as the reference level. Almost all coefficients are significant and negative, indicating that posts with high-factuality sources are the most popular on Parler. However, posts with very-high-factuality and mostly-factual sources are generally less well received than posts with low, mixed, and very low factuality, since the coefficients of the latter are usually larger.
Table 2 Factuality as Independent Variables
I further dug into the dataset to investigate influential outliers. I filtered out records whose Cook’s distance exceeds 1 under the initial “impression ~ bias” model. The corrected models made a noticeable difference to the fit, and the outcomes are shown in Table 3 and Table 4.
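The filtering step itself is simple once each observation’s Cook’s distance is available (with statsmodels, the fitted model’s influence diagnostics would provide them; here I assume they are precomputed and only sketch the filter, with made-up records and field names):

```python
# Assume each record carries its Cook's distance from the initial
# "impression ~ bias" model (values here are invented for illustration).
records = [
    {"impressions": 120,       "bias": "right",        "cooks_d": 0.02},
    {"impressions": 9_000_000, "bias": "right",        "cooks_d": 3.10},  # influential
    {"impressions": 45,        "bias": "least biased", "cooks_d": 0.01},
]

# Keep only observations with Cook's distance <= 1, then refit the models
# on the reduced dataset.
kept = [r for r in records if r["cooks_d"] <= 1]
```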
Table 3 Bias as Independent Variables (Without Outliers)
Table 4 Factuality as Independent Variables (Without Outliers)
This time, for the bias model, we observe a shift in coefficient values: the bias level Right now has the highest coefficient, and extreme right and right-center are on par with least biased. The factuality model saw more changes, since not all coefficients are negative anymore. It shows that, compared with posts citing high-factuality sources, posts with mixed-factuality sources are more likely to grab user attention; the same applies to posts with low-factuality sources. Posts with very high factuality were the least well received in the new model.
Bivariate heatmaps were plotted to visualize the influence of the posts, as can be seen in Figures 10 and 11. They show the sum of each interaction metric grouped by bias and factuality. The hotspots are concentrated from the center to the top-right corner, indicating higher values on the metrics when the bias leans right and the factuality shifts to mixed and low.
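Under the hood, the heatmap cells are just sums of a metric grouped by the (bias, factuality) pair. A stdlib sketch of that aggregation (toy records and hypothetical field names; the result dictionary is what gets plotted):

```python
from collections import defaultdict

# Toy posts with their matched MBFC labels and one interaction metric.
posts = [
    {"bias": "right", "factuality": "mixed", "votes": 50},
    {"bias": "right", "factuality": "mixed", "votes": 30},
    {"bias": "left",  "factuality": "high",  "votes": 5},
]

# Sum the metric per (bias, factuality) cell.
heat = defaultdict(int)
for p in posts:
    heat[(p["bias"], p["factuality"])] += p["votes"]
# heat maps each (bias, factuality) pair to a total, ready for a heatmap
```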
The same heatmaps were generated for posts with conspiracy and pseudo-science sources (see Figures 12 and 13). Posts with strong pseudo-science and extreme conspiracy sources seemed to attract a lot of interaction, as did posts with mild pseudo-science and strong conspiracy sources.
In this post, I “did my own research” to peek into the world of Parler, a “free speech” social media platform catering to a specific online community. Using simple visual analytics and statistics, we discovered that:
Parler is a platform with highly passive and conforming interactions, implied by its 1:9 original-echo ratio.
Parler posts rarely link to external sources, indicating that the discourse is less likely to be “bolstered” by third-party sources. This was suggested by the 15% rate of original posts with external links.
Parler appears to be a hub for far-right, misinformational, conspiratorial, and pseudo-science ideas. This was verified by the disproportionate, dominating number of posts with extreme-right and mixed-factuality sources, along with the high post-site ratio for posts with these characteristics.
Parler is also an echo chamber for biased and dubious sources. The regression models supported the assumption that posts citing external sources with the above-mentioned bias and factuality attracted more user interaction.
References
An, J., Cha, M., Gummadi, K., Crowcroft, J., & Quercia, D. (2012). Visualizing media bias through Twitter. Sixth International AAAI Conference on Weblogs and Social Media.
Baly, R., Karadzhov, G., Alexandrov, D., Glass, J., & Nakov, P. (2018). Predicting factuality of reporting and bias of news media sources. arXiv preprint arXiv:1810.01765.
Dinkov, Y., Ali, A., Koychev, I., & Nakov, P. (2019). Predicting the leading political ideology of YouTube channels using acoustic, textual, and metadata information. arXiv preprint arXiv:1910.08948.
Flaxman, S., Goel, S., & Rao, J. M. (2016). Filter Bubbles, Echo Chambers, and Online News Consumption. Public Opinion Quarterly, 80(S1), 298–320. https://doi.org/10.1093/poq/nfw006
Jasser, G., McSwiney, J., Pertwee, E., & Zannettou, S. (2021). ‘Welcome to #GabFam’: Far-right virtual community on Gab. New Media & Society, 14614448211024546. https://doi.org/10.1177/14614448211024546
Munn, L. (2021). More than a mob: Parler as preparatory media for the U.S. Capitol storming. First Monday. https://doi.org/10.5210/fm.v26i3.11574
Parler. (2021). Parler Values. https://parler.com/values.php