We Analyzed Every Twitter Account Following Donald Trump: 61% Are Bots, Spam, Inactive, or Propaganda

Last week, SparkToro launched our third free tool, a service that analyzes Twitter accounts to estimate what percent of their followers are bots, spam, propaganda, or inactive accounts. By far the account that’s been most checked by users of the tool is @realdonaldtrump. Since there’s great interest in that account, we decided to expand our usual sample analysis (which looks at a random group of 2,000 followers) and run every single one of the account’s 54,788,369 followers through our system.

The results are… interesting.

As you can see, @realdonaldtrump is vastly above the median for these types of followers. For comparison, here’s how a number of other prominent American politicians look:

  • Donald Trump (President) – 61.0%
  • Kamala Harris (Senator) – 24.4%
  • Ted Cruz (Senator) – 26.0%
  • Susan Collins (Senator) – 24.6%
  • Mike Pence (Vice President) – 41.5%
  • Al Gore (Former Vice President) – 41.0%
  • Beto O’Rourke (Congressman) – 22.7%
  • Barack Obama (former President) – 40.9%
  • Mitch McConnell (Senator) – 31.3%
  • Jerry Brown (Governor) – 50.0%
  • Lindsey Graham (Senator) – 25.3%
  • Elizabeth Warren (Senator) – 33.7%
  • Hillary Clinton (former Senator) – 43.8%

The @realdonaldtrump account has significantly higher numbers than any other American politician we could find. That said, a few accounts do eclipse this, including @tina_kandelaki (a Russian journalist and personality) with 82.0% fake followers and Demokrat_TV (an Indonesian news source) with 69.2% fake followers.
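To put the comparison in numbers, here is a small sketch that uses only the percentages listed above. The dictionary name and the choice of summary statistics (median of the other politicians, gap to the next-highest account) are ours for illustration and are not part of the tool.

```python
from statistics import median

# Fake-follower percentages copied from the list of politicians above.
fake_share = {
    "Donald Trump": 61.0,
    "Kamala Harris": 24.4,
    "Ted Cruz": 26.0,
    "Susan Collins": 24.6,
    "Mike Pence": 41.5,
    "Al Gore": 41.0,
    "Beto O'Rourke": 22.7,
    "Barack Obama": 40.9,
    "Mitch McConnell": 31.3,
    "Jerry Brown": 50.0,
    "Lindsey Graham": 25.3,
    "Elizabeth Warren": 33.7,
    "Hillary Clinton": 43.8,
}

# Everyone on the list except the account being compared.
others = {name: pct for name, pct in fake_share.items() if name != "Donald Trump"}

runner_up = max(others, key=others.get)        # next-highest account on the list
median_others = median(others.values())        # middle of the other 12 accounts
gap_to_runner_up = fake_share["Donald Trump"] - others[runner_up]

print(f"Next highest: {runner_up} at {others[runner_up]:.1f}%")
print(f"Median of the others: {median_others:.1f}%")
print(f"Gap to next highest: {gap_to_runner_up:.1f} points")
```

Note that this median covers only the twelve other politicians listed here, not the broader set of accounts the tool has analyzed.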

It’s not just the high percent of fake followers that’s surprising, but the distribution of follower quality on the account:

More than 35% of @realdonaldtrump’s followers are accounts that trigger 10+ different spam/fake follower signals (and thus have the lowest possible quality score, “1”). When compared to the distributions of other accounts, it seems likely that @realdonaldtrump acquired significantly more of these highly unusual and suspicious followers than others. We can speculate about why, but these numbers support saying with some confidence that the account likely reaches far fewer than half of the followers Twitter reports (at least, on that platform).

Methodology

SparkToro’s analysis uses a machine-learning based process to uncover numerous signals that correlate with spam/bots/inactive/propaganda accounts. We then analyze how these signals compare for any given Twitter account’s followers. In a standard analysis, we look at a random 2,000 followers, but in the case of @realdonaldtrump, we analyzed every single one of the 54+ million accounts in question.

It’s extremely important to keep in mind that NO ONE SIGNAL means we’d treat an account as likely spam/bot/inactive/propaganda. An account needs a combination of 7-10+ of these signals to be counted as “fake.” Here’s how those signals stacked up for @realdonaldtrump:

  • 72% have been inactive for 120+ days (i.e. the account did not send any tweets or RTs during that time)
  • 3% have been inactive for between 90-120 days
  • 3% created their account in the last 90 days
  • 36% use Twitter’s default profile image
  • 39% use display names that include spam words+patterns
  • 92% either don’t use a URL in their profile or employ a URL with spam patterns
  • 60% don’t use a recognized location
  • 27% have set their language to something other than English
  • 54% have gone more than a year without sending more than a handful of tweets
  • 3% send an abnormally large number of tweets per day
  • 96% have been placed on very few (or zero) lists
  • 79% have an unusually small number of followers
  • 76% follow an unusual number of accounts
  • 74% employ spam-correlated keywords in their profile description

I have this sinking feeling that someone will undoubtedly point to one of these signals and say “but that doesn’t mean the account is spam/bot/propaganda/inactive!” so let me state this again: no one, two, three, or even six signals means we’ll treat an account as low quality. But, I think we’d all agree (and the machine learning model built off spam+bot accounts would concur) that it’s very rare that an account could have a combination of 7-10+ of the above signals and still be a real, active, human being regularly logging into their Twitter account. Not impossible, just exceptionally unlikely.
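The "combination of signals" rule can be sketched in a few lines. This is a simplified illustration, not SparkToro's actual model: the `Follower` class, the signal names, and the fixed cutoff of 7 are assumptions made for the example (the post only says an account needs 7-10+ signals to be counted as fake).

```python
from __future__ import annotations

from dataclasses import dataclass, field

# Assumed cutoff for illustration: the low end of the "7-10+" range in the post.
FAKE_SIGNAL_THRESHOLD = 7


@dataclass
class Follower:
    """Hypothetical record: a handle plus the names of the signals it triggered."""
    handle: str
    signals: set[str] = field(default_factory=set)


def is_likely_fake(follower: Follower, threshold: int = FAKE_SIGNAL_THRESHOLD) -> bool:
    # No single signal decides anything; only the count of triggered signals matters.
    return len(follower.signals) >= threshold


def fake_share(followers: list[Follower], threshold: int = FAKE_SIGNAL_THRESHOLD) -> float:
    """Fraction of followers at or above the signal threshold (0.0 for an empty list)."""
    if not followers:
        return 0.0
    flagged = sum(is_likely_fake(f, threshold) for f in followers)
    return flagged / len(followers)


# Usage: one account with two signals, one with eight (signal names are made up).
quiet_human = Follower("quiet_human", {"no_profile_url", "on_few_lists"})
suspicious = Follower("suspicious", {f"signal_{i}" for i in range(8)})
print(fake_share([quiet_human, suspicious]))
```

An account that uses the default profile image and never added a URL triggers two signals and stays well under the cutoff, which is the point made above: individual signals are common among real users, and only the combination is treated as evidence.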

Takeaways

For us, as creators of the Fake Followers tool, the fact that the estimated number was 57.1% (off the 2,000 sample followers) and the total number was 61.0% (off all 54 million) makes us feel very good about the sampling methodology.

For those who use the tool, you can feel relatively confident that the numbers generally line up with expectations and that sampling barely reduces accuracy compared with a comprehensive analysis.
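As a rough check on how close a 2,000-follower sample should land, the textbook margin of error for a sampled proportion can be computed directly. This sketch assumes a simple random sample and the normal approximation; neither is confirmed as part of the tool's method, and the function name is ours.

```python
import math


def margin_of_error(p: float, n: int, z: float = 1.96) -> float:
    """Approximate 95% margin of error for a proportion p from a random sample of size n."""
    return z * math.sqrt(p * (1 - p) / n)


sample_share = 0.571   # estimate from the 2,000-follower sample
full_share = 0.610     # result from analyzing every follower
sample_size = 2000

moe = margin_of_error(sample_share, sample_size)
gap = full_share - sample_share

print(f"95% margin of error: ±{moe * 100:.1f} points")
print(f"Observed gap:        {gap * 100:.1f} points")
```

Under these assumptions the margin comes out at roughly ±2.2 percentage points, so the observed 3.9-point gap is somewhat wider than random sampling alone would predict. Follower churn between the two runs, or a sample that is not perfectly uniform, could plausibly account for the difference.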

For those interested in Trump’s Twitter account, there are several items of note:

  • The spike of extremely low quality followers that fire 10+ spam flags suggests that this account in particular has been the target of a lot of suspicious activity.
  • The ease with which these accounts can be identified suggests Twitter is unwilling (rather than unable) to identify and remove obvious spam/bots/inactive/propaganda accounts.
  • Comparison against other prominent and well-followed American politicians suggests that the issue is unique to, or at least amplified on, @realdonaldtrump vs. others.
  • There is no evidence (at least in this analysis) that Trump himself or those working for/with him have anything to do with the acquisition of these suspicious accounts. It’s certainly plausible that behavior by non-Trump-associated entities could be behind the accounts.
  • The most generous theory one might assign is that many people signed up for Twitter, filled out no information, sent very few or no tweets, but followed Trump’s account and few others, then abandoned the platform. That is certainly true for at least some of these fake followers, though the triggering of some of the spam flags (use of spam-correlated keywords in particular) and the fact that Trump’s account is so much higher in fake followers than others make this a less comprehensive explanation.
  • Over the summer, Twitter removed a large number of spam accounts, but clearly, there are tens of millions (or more) still remaining on the platform. The singer @KatyPerry supposedly “lost” 2.8 million spam accounts in Twitter’s cleanup, while @RealDonaldTrump lost an estimated 300,000. Those numbers are surprising, given that Perry’s account appears to have far fewer obviously detectable spam followers (40.6%).

Notably, this isn’t the first analysis of that account’s followers. Newsweek wrote that about half of @realdonaldtrump’s followers were fake or inactive in 2017. Several linguistic analyses of the content have also been performed with interesting results.

One final note: while @realdonaldtrump probably only reaches ~20-25 million Twitter accounts directly, press, media, and other social media users amplify his messages to reach a broader group. But, for this account, and for others, the direct use of the published follower numbers on Twitter is a pretty poor estimate of true reach.
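The ~20-25 million figure follows from the numbers reported earlier. A minimal sketch of the arithmetic, assuming every follower not flagged as fake counts as directly reachable (a simplification; the function name is ours):

```python
TOTAL_FOLLOWERS = 54_788_369   # follower count analyzed in this post
FAKE_SHARE = 0.610             # share flagged as bot/spam/inactive/propaganda


def estimated_real_followers(total: int, fake_share: float) -> int:
    """Followers remaining after removing the flagged share, rounded to a whole account."""
    return round(total * (1 - fake_share))


reachable = estimated_real_followers(TOTAL_FOLLOWERS, FAKE_SHARE)
print(f"Estimated directly reachable followers: {reachable:,}")
```

That works out to a little over 21 million accounts, inside the range quoted above.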