The endgame for PageRank

March 18, 2014

First there was Google Search, and then pretty quickly SEOs came into existence.

SEOs are marketing people hired by businesses to bump up the organic rankings for that business in Google Search results. That means businesses pay people to make their website more attractive and central to Google Search, so they don’t have to pay for ads but will get visitors anyway. And since lots of customers come from search results, this is a big deal for those businesses.

Since Google Search was based on a pretty well-known, pretty open algorithm called PageRank which relies on ranking the interestingness of pages by their links, SEOs’ main jobs were to add links and otherwise fiddle with links to and from the websites of their clients. This worked pretty well at the beginning and the businesses got higher rank and they didn’t have to pay for it, except they did have to pay for the SEOs.
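To make the link-fiddling concrete, here is a minimal sketch of the textbook PageRank power iteration on a toy web. Everything in it is an assumption for illustration: the page names, the `pagerank` helper, the damping factor of 0.85, and the five-page "link farm". It is not Google's production algorithm, just the published idea in its simplest form.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Textbook PageRank via power iteration (illustrative sketch only).

    `links` maps each page to the list of pages it links to.
    """
    # Collect every page that appears as a source or a target.
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}

    for _ in range(iterations):
        # Every page starts each round with the "random jump" share.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its damped rank evenly among its outlinks.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outlinks spreads its rank over all pages.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical toy web: two shops with identical incoming links.
honest_web = {
    "news": ["shop_a", "shop_b"],
    "blog": ["shop_a", "shop_b"],
    "shop_a": ["news"],
    "shop_b": ["news"],
}

# The same web after shop_b's SEO adds five farm pages that link only to shop_b.
gamed_web = dict(honest_web)
for i in range(5):
    gamed_web[f"farm_{i}"] = ["shop_b"]

before = pagerank(honest_web)
after = pagerank(gamed_web)

print("before:", round(before["shop_a"], 4), round(before["shop_b"], 4))
print("after: ", round(after["shop_a"], 4), round(after["shop_b"], 4))
```

In the honest web the two shops tie. Once the farm pages exist, shop_b outranks shop_a even though nothing about either shop has changed, which is the whole game in miniature.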

But after a while Google caught on to the gaming and adjusted its search algorithm, and SEOs responded by working harder at gaming the system (see more history here). It got more expensive but still kind of worked, and nowadays SEOs are a big business. And the algorithm war is at full throttle, with some claiming that Google Search results are nowadays all a bunch of crappy, low-quality ads.

This is to be expected, of course, when you use a proxy like “link” to indicate something much deeper and more complex like “quality of website”. Since it’s so high stakes, the gaming acts to decouple the proxy entirely from its original meaning. You end up with something that is in fact the complete opposite of what you’d intended. It’s hard to address except by giving up the proxy altogether and going for something much closer to what you care about.

Recently my friend Jordan Ellenberg sent me an article entitled The Future of PageRank: 13 Experts on the Dwindling Value of the Link. It’s an insider article, interviewing 13 SEO experts on how they expect Google to respond to the ongoing gaming of the Google Search algorithm.

The experts don’t all agree on the speed at which this will happen, but there seems to be some kind of consensus that Google will stop relying on links as such and will go to user behavior, online and offline, to rank websites.

If correct, this means that we can expect Google to mine all of our email, browsing, and even GPS data to understand our behaviors in a minute fashion in order to get at a deeper understanding of how we perceive “quality” and how to monetize that. Because, let’s face it, it’s all about money. Google wants good organic search results so that people won’t abandon its search engine altogether, because that’s what lets it sell ads.

So we’re talking GPS on your Android, or sensor data, and everything else it can get its hands on through linking up various data sources (which as I read somewhere is why Google+ still exists at all, but I can’t seem to find that article on Google).

It’s kind of creepy all told, and yet I do see something good coming out of it. Namely, it’s what I’ve been saying we should be doing to evaluate teachers, instead of using crappy and gameable standardized tests. We should go deeper and try to define what we actually think makes a good teacher, which will require sensors in the classroom to see if kids are paying attention and are participating and such.

Maybe Google and other creepy tech companies can show us the way on this one, although I don’t expect them to explain their techniques in detail, since they want to stay a step ahead of SEOs.

Categories: data science, modeling
  1. March 18, 2014 at 9:32 am

    The problem here is that I don’t see how tracking all of this behavioral data will actually benefit Google Search users, in addition to being a bit creepy. Under these conditions, when I search for something, will I find something “organically”, or will I just find what I’m “supposed” to? Neutrality is a major feature of search, one that apparently might be going away.

    Also, as far as sensors in the classroom to evaluate teachers: maybe we should reconsider optimizing everything and just go with low-tech feedback approaches. Or not stressing out everyone under the age of 17 all the time.

    Great post, BTW!

  2. March 18, 2014 at 10:20 am

    SEO kind of disgusts me, but I can understand that it’s a sort of prisoner’s dilemma. I call PageRank an ‘honest algorithm’ in this sense. If nobody tries to game it, everyone is better off because links really do form good proxies for page quality. But once one person tries to game it, everyone else is screwed unless they follow suit.

  3. CitizensArrest
    March 18, 2014 at 3:11 pm

    I found this part to be mostly on target, though we do know a lot about what in fact makes a good teacher: “Namely, it’s what I’ve been saying we should be doing to evaluate teachers, instead of using crappy and gameable standardized tests. We should go deeper and try to define what we actually think makes a good teacher,” BUT, you totally went off the rails here: “which will require sensors in the classroom to see if kids are paying attention and are participating and such.” SENSORS? Like Gates’s galvanic response bracelet idea??? In other words, more tech and measurement full time in the classroom? There is a much better and well-proven way of doing this using the best sensors out there, human beings. After all, it will be humans who will be viewing all the “data” collected by sensors, correct? Why spend money, bandwidth and maintenance time on tech when you can eliminate the unnecessary tech middleman and just send trained observers into the classroom like this article describes? http://www.nytimes.com/2011/06/06/education/06oneducation.html?pagewanted=1&_r=1&hp As the article shows, teachers and education professionals are fully capable of policing their own if you trust and empower them to do so. It may come as a shock to some folks, but teachers are professionals who deeply care about the quality of their work, and they do not want those who don’t share that commitment to be in their schools.

    • March 18, 2014 at 3:57 pm

      Agreed that trust would be even better. What I’m proposing is what they claim is required with respect to costs.

  4. Guest2
    March 18, 2014 at 3:44 pm

    Does this mean a search topology shift? I don’t know what the underlying topology is for the search algorithm (that is, what it looks like mathematically), but is this anticipated to change as well?

  5. March 19, 2014 at 3:15 pm

    lol “….google and other creepy tech companies….” … have you blogged on the nsa lately? gotta go look that up….
    but anyway yeah it’s an arms race, a darwinian race. it’s under continual cyberspatial evolutionary pressure.
    re matt ridley “red queen race”

  6. March 19, 2014 at 3:45 pm

    You wrote: “We should go deeper and try to define what we actually think makes a good teacher, which will require sensors in the classroom to see if kids are paying attention and are participating and such.” My thought: if we expect our graduates to be self-actualized learners instead of well-trained regurgitators we need to redefine teaching altogether and in doing so we would also have to redefine “what we actually think makes a good teacher”…. and in doing so I would contend we could abandon the idea of monitoring kids electronically. The last thing we need to do is add any more surveillance into the lives of our children. Already they believe metal detectors, locked doors, smothering adult oversight, and cameras are a part of everyone’s life. If the NRA gets its way we will be able to add “armed guards” to that list… and the last thing we should add is “…sensors to see if kids are paying attention”.

  7. March 20, 2014 at 1:21 am

    The world becomes a mixture of 1984, THX1138, and Soylent Green. Robinson Crusoe must seem the only sane approach.
