Twitter and its modeling war

November 26, 2013

I often talk about the modeling war, and I usually mean the one with the modelers on one side and the public on the other. The modelers work hard to convince or trick the public into clicking, buying, consuming, taking out loans, or buying insurance, while the public is barely aware that they’re engaged in anything resembling a war at all.

But there are plenty of other modeling wars that are being fought by two sides which are both sophisticated. To name a couple, Anonymous versus the NSA and Anonymous versus itself.

Here’s another, and it’s kind of bland but pretty simple: Twitter bots versus Twitter.

This war arose from the fact that people care about how many followers someone on Twitter has. It’s a measure of a person’s influence, albeit a crappy one for various reasons (and not just because it’s being gamed).

The high impact of the follower count means it’s in a wannabe celebrity’s best interest to juice their follower numbers, which introduces the idea of fake Twitter accounts to game the model. Selling those accounts is an industry in itself, and it has set off an arms race with the spam filters built to get rid of them. The question is, who’s winning this arms race and why?

Twitter has historically made some strides in finding and removing such fake accounts with the help of some modelers who actually bought the services of a spammer and looked carefully at what their money bought them. Recently though, at least according to this WSJ article, it looks like Twitter has spent less energy pursuing the spammers.

It raises the question: why? After all, Twitter theoretically has a lot at stake, namely its reputation: if everyone knows how gamed the system is, they’ll stop trusting it. On the other hand, that argument only really holds if people have something else to use instead as a better proxy of influence.

Even so, considering that Twitter has a bazillion dollars in the bank right now, you’d think they’d spend a few hundred thousand a year to prevent their reputation from being too tarnished. And maybe they’re doing that, but the spammers seem to be happily working away in spite of that.

And judging from my recent experience on Twitter, there are plenty of spammers who actively degrade the user experience. That brings up my final point, which is that the lack-of-competition argument at some point gives way to the “I don’t want to be spammed” user experience argument. If Twitter doesn’t maintain standards, people will eventually just stop spending time on Twitter, and its proxy of influence will fall out of favor for that more fundamental reason.

Categories: data science, modeling
  1. cat
    November 26, 2013 at 11:07 am

    The Network Effect will keep twitter relevant. You are unintentionally contributing to it as well since your four social media share buttons include twitter so I guess twitter can’t be that bad.

    Unless someone makes a distributed twitter, think bittorrent or freenet, and also gets millions of people to buy in, we will have to put up with getting spammed. If you have to spend millions a month on operational costs you have to have a business model.


  2. Zathras
    November 26, 2013 at 11:31 am

    Zathras’s Rule #1 of Analytics: To be measured is to be gamed.

    Quantity is easy to measure. Quality is not. Any discussion of bots following relevant actors will be made at the first level of measurement. Higher level executives will never process the discussion. There is an aspect of higher execs that prevents them from tolerating ambiguity in the data. They place absolute trust in the data they receive, since they expect to make the most arbitrary decisions off of it. Fuzzy concepts of quality are lost on them. The reputation of Twitter will not suffer, since higher execs will never understand this nuance. All they see will be the quantity.


  3. Alex
    November 26, 2013 at 12:11 pm

    @cat “unless someone makes a distributed twitter”, you mean status.net?


    • cat
      November 26, 2013 at 3:43 pm

      No. Distributed as in each node shares content without the aid of a central server. status.net/pump.io is still client/server even though the servers can inter-operate.
      See Freenet or the BitTorrent DHT.


  4. November 26, 2013 at 2:21 pm


    What if follower counts weren’t available? Implicit impact would be the only impact. Of course, then it would be a lot like email…

