Raazesh completed his PhD in statistics at Cornell University and a research fellowship in mathematical genetics at Oxford University before taking up a faculty position in Canterbury University’s Maths & Stats Department for a decade. He currently holds a joint appointment as a data science consultant in the AI & Analytics Centre of Excellence, Combient AB, Stockholm, and as a Researcher in the Department of Mathematics, Uppsala University, Uppsala. He works at the interface of computing, mathematics and statistics to solve real-world decision problems using custom-built mathematical and statistical models that can scale to big data using Apache Spark and its ecosystem.
October 2, 2018 05:00 PM PT
We characterize the Twitter networks of both major presidential candidates, Donald Trump and Hillary Clinton, with various American hate groups as defined by the US Southern Poverty Law Center (SPLC). We further examine the Twitter networks of Bernie Sanders, Ted Cruz, and Paul Ryan over 9 weeks around the 2016 election (4 weeks prior to the election and 4 weeks post-election).
We carefully account for the observed heterogeneity in Twitter activity levels across individuals under the null hypothesis of apathetic retweeting, which is formalized as a random network model based on the directed, multi-edged, self-looped configuration model. Under this null hypothesis, a generalized Fisher's exact test on our data revealed that a significantly large number of Twitter accounts were linked both to SPLC-defined hate groups belonging to seven ideologies (Anti-Government, Anti-Immigrant, Anti-LGBT, Anti-Muslim, Alt-Right, Neo-Nazi, and White-Nationalist) and to @realDonaldTrump, relative to the accounts of the other four politicians.
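As a rough illustration of such a null model, the following single-machine Python sketch draws one sample from a directed configuration model that allows multi-edges and self-loops, then uses repeated draws for a Monte Carlo p-value. It is an assumption-laden sketch, not the authors' implementation: the function names, the `(retweeter, retweeted)` edge convention, and the Monte Carlo approximation (used here in place of the generalized Fisher's exact test) are all hypothetical choices made for illustration.

```python
import random


def sample_configuration_model(edges, rng):
    """Draw one random multigraph that keeps every account's out- and in-degree.

    `edges` is a list of (retweeter, retweeted) pairs (hypothetical convention).
    Multi-edges and self-loops are allowed, as in the null model described above.
    """
    out_stubs = [src for src, _ in edges]
    in_stubs = [dst for _, dst in edges]
    # Attach an independent random key to every stub and sort by that key.
    out_sorted = [s for _, s in sorted((rng.random(), s) for s in out_stubs)]
    in_sorted = [d for _, d in sorted((rng.random(), d) for d in in_stubs)]
    # Pair stubs of equal rank: a local analogue of a join on the sort position.
    return list(zip(out_sorted, in_sorted))


def monte_carlo_p_value(edges, statistic, n_samples, rng):
    """Estimate P(statistic under the null >= observed statistic)."""
    observed = statistic(edges)
    hits = sum(
        statistic(sample_configuration_model(edges, rng)) >= observed
        for _ in range(n_samples)
    )
    # The +1 terms count the observed network itself as one null draw.
    return (1 + hits) / (1 + n_samples)
```

Because every stub is reused exactly once, each sampled network preserves the activity level (in- and out-degree) of every account, which is how the heterogeneity across individuals is held fixed under the null.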
The exact hypothesis test uses Apache Spark's distributed sort and join algorithms to produce exact samples from the null model in a fully scalable way. Additionally, by exploring the empirical Twitter network, we found that significantly more individuals had the fewest retweet degrees of separation simultaneously from Trump and each one of these seven hateful ideologies relative to the other four politicians. We conducted this exploration via a geometric model of the observed retweet network, distributed vertex programs in Spark's GraphX library and a visual summary through neighbor-joined population retweet ideological trees.
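The notion of retweet degrees of separation can be sketched with a plain breadth-first search. The Python example below is a single-machine stand-in for a distributed vertex program, not the GraphX code used in the talk. It assumes, purely for illustration, that retweet links are treated as undirected and that a "hateful ideology" is represented by a set of seed accounts; the names `hop_distances` and `close_to_both` are hypothetical.

```python
from collections import deque


def hop_distances(edges, sources):
    """Fewest hops from any source account to every reachable account.

    Retweet links are treated as undirected here (an assumption of this sketch).
    """
    neighbors = {}
    for u, v in edges:
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)
    # Multi-source BFS: every source present in the network starts at distance 0.
    dist = {s: 0 for s in sources if s in neighbors}
    queue = deque(dist)
    while queue:
        u = queue.popleft()
        for v in neighbors[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def close_to_both(edges, politician, group_accounts, max_hops=3):
    """Accounts within `max_hops` of the politician and of the group seeds."""
    d_pol = hop_distances(edges, [politician])
    d_grp = hop_distances(edges, group_accounts)
    return {
        u
        for u in d_pol
        if u in d_grp and d_pol[u] <= max_hops and d_grp[u] <= max_hops
    }
```

Dividing the size of the returned set by the number of sampled accounts gives the kind of fraction reported for accounts within three or fewer degrees of separation.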
Remarkably, fewer than 5% of individuals had three or fewer retweet degrees of separation simultaneously from Trump and one of several hateful ideologies relative to the other four politicians. Taken together, these findings suggest that Trump may indeed have possessed unique appeal to individuals drawn to hateful ideologies; however, such individuals constituted a small fraction of the sampled population.
Session hashtag: #SAISDS7