The Audacity Of Poke

These are interesting times, made even more interesting by the unveiling of Facebook's new Open Graph Protocol. To say that this has not been in the works would be an omission of a grievous nature. It has been coming for a while, even if the pundits have been screaming something else from the rooftops: that Facebook – because of its closed nature – is a threat to Google. There has never been a greater misunderstanding of how both Facebook and Google work than that claim.

That said, what I want to talk about is something else, but, before we can get to that, it is important to understand why Facebook is doing this. Much like any profit-oriented company (like Google, Microsoft and many others), Facebook wants to exist in a world where demand for its product will keep increasing by any suitable measure of growth spread over time. A purely social networking product cannot accomplish this. You have seen this with Friendster, Myspace and Orkut. Being closed, walled off and limited reduces your value in the world, because you then present the users with only limited use cases. To accomplish growth (in roughly the same terms as Google), you need to be a generalist company and not a company that focuses on the specific niche of personal connections.

Remaining a walled garden diminishes the potential of the product called Facebook. What diminishes it even more is for Facebook to remain a product that is only a facilitator of connections or virtual farming. Sure, it is a nice thing to be able to reconnect with long-lost classmates, relatives and whatnot, but those are not things you will do on a daily basis once the fad dies down. To get where it wants to be, Facebook needs to be relevant to you on a near-constant basis. The range of possibilities for adding that kind of value within the previous Facebook ecosystem is rather limited. Yes, there is a lot of poring over photographs of newly overweight former friends and flames, but the intent behind that is of little value. In other words, it is brilliant as a time sink, but it sucks as a source of actionable/convertible/monetizable intent.

Contrary to what the pundits have been proclaiming for a long time now (that Facebook will make Google and others bow in front of it because of its walled garden), Facebook needs its walls to drop to increase usage and relevance. I have been saying for a while that a closed Facebook has limited potential; what you are seeing now is the opening up of those walls to address a far larger potential. Facebook, having tired of knowing 'you' within the walled garden, now wants to know 'you' outside of it. It is an audacious move, as mentioned in the title, for a few reasons, but we will tackle that later. Before that, we need to see how we got to where we are today.

If you look at it, there have been roughly three stages in the evolution of the internet:

  1. Few Deciding For The Many: In the first phase, very few could publish on the internet. In the pre-search engine era, both content producers and curators were few in number. The old school media publications and a few portals held all the power. Lack of choice meant that you had really nowhere to go, other than where the few influentials thought you should go to.
  2. Many Deciding On Their Own: In the next phase, the era of the search engines and the aggregators, both content production and curation became far cheaper and widespread. Aided by the search engines, discovery became easier too as a result. This led to an openness of information and an odd democracy of information online, leading to disenfranchisement of the former power players like media houses and portals.
  3. Many Deciding Through The Few: What we are stepping into now is phase three, where, as a reflection of the plight of humanity, too much choice is not actually treasured by most. People feel lost when given control of their own choices, paths and destinations in life. They'd rather have it chosen for them. Thus the need to be guided by peers, friends and groups that we can easily identify with wins out over the urge to experience the freedom of choice that is now possible. This is what Facebook is counting on and I am afraid they will succeed in it – though to what extent remains to be seen.

But, what I really wanted to talk about is something else. It is about the battle to collect, classify and identify information spread all over the internet. This is a battle that was won a while ago, with the undisputed victor being Google. Short of a closely-held (read patented) new computing paradigm that will enable a new player to crawl, index and classify the wealth of information out there at a much quicker pace, there is not much of an opportunity out there in search that won't be stillborn as a result of crashing headlong into the Google juggernaut. As Microsoft has experienced time and again, throwing more money at the search problem won't win this battle. It requires time to crawl, index and rank, and usage from the users, both being factors where Google has a massive head start – one that only grows with the passage of time.

There is, though, one area where you can attack Google on this front and that is to counter Google's machine-driven apparatus with human input. After all, even the best of algorithms can't match a human being interpreting the tone and tenor of a sentence, a conversation or a query. You have seen instances of this before with various products like Delicious (classifying and categorizing links), Mahalo (human-curated results for top search queries), Twitter lists (human-powered categorization of people), Wikipedia (human-powered distillation of and linkage to a vast array of information) and many others. But all these have suffered from either a low scale of adoption or a low degree of contribution. For things to work at web scale, you need to enable people to contribute with far less complexity. It needs to work behind the scenes, without requiring users to understand crawling, classifying, ranking or indexing of content, silently adding value with each bit of interaction.

I'll make it even simpler. This is how search works in the present day: A computer (or many computers) goes through all the possible web pages on the internet by picking up links on all those pages. This information is then separated from all the noise that is present in it and indexed to extract the relevant information from it. At the topmost level, when a human being enters a search query, a bit of programming code parses the query, matches it to the best of its ability with the best result in the index and shows it to the user. Google is the master of doing all three stages in the best and most cost-effective manner possible at the moment. The advantage Google has is derived from three things. First, they can extract the relevant information from pages better than anyone else. Second, they can parse your query better than anyone else because of their years of experience gained from looking at and matching trillions of queries. Third, there is the vast amount of data Google silently collects about the user from within its ecosystem (through its consumer-facing products) and outside it through products like Adwords and Analytics.
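For the programmers in the audience, the three stages can be sketched in a few lines of Python. This is a toy under loud assumptions: the "web" is a made-up in-memory dictionary (TOY_WEB and its example URLs are hypothetical), the index is a plain word-to-pages map, and there is no ranking at all, which is where the real difficulty and Google's advantage lie.

```python
from collections import deque

# Hypothetical three-page "web": url -> (page text, outgoing links).
TOY_WEB = {
    "a.example/": ("facebook opens its social graph", ["b.example/", "c.example/"]),
    "b.example/": ("google ranks pages with links", ["a.example/"]),
    "c.example/": ("open graph protocol for pages", []),
}


def crawl(seed, web):
    """Stage 1: discover pages by following links outward from a seed URL."""
    seen, queue = {seed}, deque([seed])
    pages = {}
    while queue:
        url = queue.popleft()
        text, links = web[url]
        pages[url] = text
        for link in links:
            if link in web and link not in seen:
                seen.add(link)
                queue.append(link)
    return pages


def build_index(pages):
    """Stage 2: map every word to the set of pages that contain it."""
    index = {}
    for url, text in pages.items():
        for word in text.lower().split():
            index.setdefault(word, set()).add(url)
    return index


def search(index, query):
    """Stage 3: parse the query and return pages containing every query word."""
    words = query.lower().split()
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results


index = build_index(crawl("a.example/", TOY_WEB))
print(search(index, "pages graph"))
```

Every one of these functions is trivial on three tidy pages. The expense comes from doing the same on billions of messy ones.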

The one place where a new entrant can make a dent is to address this problem in such a manner that it takes the wind out of Google's sails – which is to identify and classify information at a fraction of the trouble that Google goes to. When you have content publishers pushing out self-identifying data (using the Open Graph Protocol), you no longer need to spend millions of dollars crawling HTML, dealing with the resulting tag soup and employing multitudes of PhDs trying to extract signal from all that noise. If, in a 128 KB document, four lines explain in a structured manner what the rest of the document is all about, it will enable others to compete with Google in the information game, because it can nullify Google's significant advantage of being able to make sense of pages stuffed with unstructured data, IF enough publishers use the Open Graph Protocol in the first place.
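To see how little work those four lines leave for the reader of the page, here is a minimal sketch using only Python's standard library. The four properties shown (og:title, og:type, og:url, og:image) are the basic ones the protocol asks publishers to supply; the sample page, its example.com URLs and the helper names are made up for illustration.

```python
from html.parser import HTMLParser


class OpenGraphParser(HTMLParser):
    """Collects <meta property="og:..." content="..."> pairs and ignores everything else."""

    def __init__(self):
        super().__init__()
        self.properties = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        prop = attributes.get("property") or ""
        content = attributes.get("content")
        if prop.startswith("og:") and content is not None:
            self.properties[prop] = content


def extract_open_graph(html):
    """Return a dict of the Open Graph properties declared in an HTML document."""
    parser = OpenGraphParser()
    parser.feed(html)
    return parser.properties


# Hypothetical page: four structured lines, then the usual unstructured markup.
SAMPLE_PAGE = """<html><head>
<title>The Rock (1996) - some movie site</title>
<meta property="og:title" content="The Rock" />
<meta property="og:type" content="movie" />
<meta property="og:url" content="http://www.example.com/title/the-rock/" />
<meta property="og:image" content="http://www.example.com/images/the-rock.jpg" />
</head><body><div><p>Kilobytes of tag soup would go here...</p></div></body></html>"""

for name, value in extract_open_graph(SAMPLE_PAGE).items():
    print(name, "=", value)
```

No link analysis, no language models, no PhDs: the publisher has already said what the page is about, and the consumer only has to read it.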

Now, this is where the story gets interesting. So far, nothing has prevented anyone from doing what Facebook has done with the Open Graph Protocol. But, as I have mentioned, all of this has to do with the business case it represents. Till Google started getting enough scale in terms of referral traffic for content publishers, nobody gave two hoots about search engine optimization or how discoverable their content was. Facebook with its hundreds of millions of users is taking a leaf out of Google's own playbook. They are bringing massive scale into the playing field and asking content publishers to adhere to this new specification for the benefit of getting a lot of traffic from the Facebook ecosystem. The publishers, who also have their respective business cases to follow, will fall over each other to adhere to it and thus we will start on the third phase of the internet.

What will complete the picture is for Facebook to publish a framework for push notification of new content. The good news for them is that there are various formats already available to do that, including Google's own PubSubHubbub or even RSS Cloud. This will save Facebook the pain of polling for updates to all known pages and will get pages to push updates to it. At web scale, putting the push feature and the Open Graph Protocol together will enable Facebook to index content at a fraction of the cost that Google incurs in doing the same at the moment. This is also one of the reasons why Facebook has no opt-out for the Social Plugins. While the technology behind it is the time-tested IFRAME tag, what it gives Facebook (once again via its users) is the ability to have a list of active web pages pushed to it rather than having to go crawling to seek them out.
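The difference between polling and push is easy to show with a toy publish/subscribe hub. To be clear about the assumptions: this is not the PubSubHubbub protocol itself, which runs over HTTP with subscription verification and callback URLs. ToyHub, its methods and the feed URL are hypothetical names that only illustrate the idea that the subscriber does nothing until the publisher has something new.

```python
class ToyHub:
    """In-memory stand-in for a hub: publishers notify it, it notifies subscribers."""

    def __init__(self):
        self._subscribers = {}

    def subscribe(self, topic_url, callback):
        """Register a callback to be invoked whenever topic_url has new content."""
        self._subscribers.setdefault(topic_url, []).append(callback)

    def publish(self, topic_url, entry):
        """Push a new entry to every subscriber; return how many were notified."""
        callbacks = self._subscribers.get(topic_url, [])
        for callback in callbacks:
            callback(topic_url, entry)
        return len(callbacks)


# The indexer never polls: it simply records whatever gets pushed to it.
received = []
hub = ToyHub()
hub.subscribe(
    "http://blog.example/feed",
    lambda topic, entry: received.append((topic, entry)),
)
notified = hub.publish("http://blog.example/feed", "New post: The Audacity Of Poke")
print(notified, received)
```

With polling, the cost grows with the number of pages you know about, whether or not they have changed. With push, it grows only with the number of actual updates.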

If you have been reading closely enough, you will see that there is a catch in all of this. Which is that Facebook is not really about search. It is not about relevance of search. Wait, it is not even about the open web. That is something Zuck and co are looking to change on Facebook, leading to the recent push to make 'open' the new 'closed'. As I said before, if they are left to their original intent – of making and sustaining connections – Facebook will have limited utility/value in the longer term. What they have is replaceable (not that easily, but it is not impossible either if you remember either Friendster or Myspace). What you are seeing is Facebook's play for your identity and relevance – which is a switch from being the connector. This switch can't happen if it is presented as a choice to the users, which is why the changes are being pushed down your throat. This is why you are being told that privacy does not exist on the internet and that you are better off keeping things more open than closed.

The tragedy in all of it is that they want you to innately accept that the internet that matters to you is ONLY that part of it discovered through your friends and connections. It will fail if Facebook can't consider any member on Facebook as a connection of yours, which can only happen if everyone is forced to share a vast part of information about themselves by default. The content/pages thus discovered will retain the trail you took to reach those destinations once you abdicate discovery to Facebook. The internet will always be more than Facebook, but for those on Facebook, the company's plan is to make the internet only what can be discovered through Facebook. I already know a lot of friends and acquaintances for whom a vast majority of their internet experience is limited to what they do the whole day on Facebook. It is a sad move towards a deeper descent into the plight of the modern age – where knowing less and being ignorant is something to be celebrated and cherished.

There is a larger story in all of this regarding the arbiters of information. Google has been that for a long while now. They have been far from perfect and it is a place where we need more competition, not less. In the world that we live in, those who control the narrative wield tremendous power. The audacity of poke is just another chapter in that story. Google has not been entirely free of wrongdoing in how it has handled this power, but it has handled it much better than most others I could think of. With the world becoming more and more interconnected, the power held by the people who man the gateways to information will only increase. I am not too optimistic about where all this will eventually wind up, but for now I have decided to use Facebook only in safe browsing mode on all my browsers.

Never mind.