Encryption Is Not The Answer

The strangest reaction to the entire privacy mess that is unraveling is the quest for even stronger encryption. If you ask me, that is trying to solve the problem at the wrong end. We can, in theory, go in for unbreakable-grade encryption and hope to keep everything away from prying eyes. That would have been fine if the problem we are dealing with were limited to the expectation that your communications will be private by default.

The question we need to ask is whether all this snooping actually delivers the results that we are looking for. It is a particularly tricky one to answer, as a certain amount of force, once applied, will produce at least a minimum level of results. Any sort of enforcement, once deployed, will have at least some impact on crime anywhere in the world. So, yes, you will catch at least a few bad guys by doing things to catch the bad guys.

Thus, things turn a bit more nuanced than the binary “will it” or “won’t it”. It also becomes a question of efficiency and effectiveness. And this is where the tenuous contract between the state and its subjects comes into play. Historically, this has always been a clearly understood tradeoff. In exchange for giving up absolute freedoms and an absolute right to privacy, the state provides you a stable and secure socio-economic environment.

You Are With Us, But The Data Says You Are Against Us

The efficiency and effectiveness of any system is not always determined by how wide a coverage the system aims for. A good example of this is Prohibition. That system worked by outlawing the production of alcoholic beverages. The coverage was complete, yet it was hardly foolproof and it led to other major problems. In instances like these, the contract is greatly strained and, barring the exceptions of war or episodes of tyrannical rule, it inevitably breaks.

The power of any state, especially a democratic one, is drawn heavily from allowing the majority of the population to feel that the state looks after their best interests. This keeps the state and its subjects on the same side of the divide, even though the state has always been more powerful than the individual. This works well only when the system assumes that the majority of the participants are good people, with a reasonable margin for error.

The same tradeoff, in free societies, allows you to keep knives at home without being suspected of being a killer, even though some people (albeit a small number) have killed others using a knife. If, one fine morning, the state starts treating anyone who has a knife as a potential killer, the system will eventually break down. A state’s power may be considerable, but it is still a power granted by the majority of its subjects. The moment a state makes almost all of its subjects suspects in crimes that may or may not happen, the contract breaks and it breaks for good.

If you concern yourself with systems (their design or their study), one thing that will stand out before long is that there is no perfect system or law. The best ones are the ones that aim to get it wrong the least number of times, with allowances for fair redress, rather than the ones that aim to get it right all the time and try to be absolute. In a healthy system, the subjects don’t have an expectation that the state will always be right and the state does not have an expectation that the subjects are always wrong. This is what keeps the tradeoff a viable option for both parties and, like any good bargain, it requires both parties to behave within expected lines.

A healthy system is less likely to punish the innocent, even at the cost of letting more of the guilty escape punishment.

The breakdown aside, there is the question of efficiency. Systems that try to examine every interaction will always provide the initial rounds of success. Over time, though, the participants in any evolving system (consciously or subconsciously) adapt to the examination. Soon you have a system that tracks everything yet catches nothing, as you have now given the majority of the population an incentive to be evasive (for fear of wrongful prosecution). It is easier to find 50 bad apples in a batch of 200 than it is to find them in a batch of 200,000.

In one fell swoop, you have made every subject a potentially bad person, leaving the utterly distasteful task of proving the negative as the default. Even if you ignore the issue of false positives, such systems are impossible to sustain over longer periods, as they keep getting more expensive while becoming less efficient.

Role Of Computing

Major developments in computing in the new millennium can be broken down into two things. First is the ability to capture vast amounts of data. Second is the ability to process them in parallel and find patterns in them. Collectively, we have come to call this “big data” inside and outside tech these days.

We have always had the ability to capture data. The concept of accounting itself is almost as old as human civilization. Data has always been part of our lives; it is only the extent of the data captured that has grown over time. Given enough resources, you can capture pretty much everything, but data itself is worthless if you can’t process it. This is the reason why we never thought much of big data until now.

One of the greatest silent revolutions of computing in the past decade has been the shift from identification through the specific to identification through patterns. In the late 1990s, when the internet was taking its baby steps to becoming the giant it is today, the identification of you, as an individual, was dependent on what you chose to declare about yourself.

There were other subtle hints that were used, but most of anyone’s idea of who you were depended on what you chose to disclose. Over time, that changed to observing everything you did and figuring out who you were really likely to be, based on a known group of people whose actions match yours, even if what they have declared about themselves has nothing in common with what the system has decided they are about.

In daily life, you see this in action in contextual advertising and recommendation systems. In fact, almost the entire sub-industry of predictive analysis depends on making inferences such as these. This, aided by the vast amount of public data that we produce these days, has meant that profiling a person as being of a particular type (provided a large body of already-profiled data exists) can now be done in seconds, compared to the weeks or months it took earlier.

“If he looks like a traitor, walks like a traitor, and talks like a traitor, then he probably is a traitor”

The above line could easily fit how any overly suspicious state thinks of its subjects, but it is just an adaptation of the most famous example of inductive reasoning, the ‘Duck Test’. The earlier point about knives in societies makes a lot more sense when seen in the light of this test and big data.

Even in earlier times, we could collect all the information about every knife made and sold in a country, but mining useful intelligence out of it was a hard job, and getting it done at a reasonable speed was even harder. After all, there was no point in finding out now that Person A, who bought a knife six months ago, was likely to commit murder, which he in fact did four months ago.

The advances in computing now enable us to predict who is likely to buy a knife in the next four months. Given the activity profiles of the murderers in our records, we can also predict which of the knife buyers from the last three months are likely to commit murder in the coming months, at what time of the day and on which day of the week.

That has to be a good thing, right?

Not really.

How Wrong Does Wrong Have To Be To Be Really Wrong?

If you are smart, the truth that you quickly learn from dealing with large amounts of data is that it is an imperfect science. The science is good enough to build an advertising business that will wrongly recommend tampons to someone who is very much male, or wrongly suggest an ex-husband as a potential mate on a social networking site; but it is nowhere close to being good enough to identify potentially bad people based on patterns and inferences.

If we go back to the earlier point about what constitutes a good system, something that gets it wrong the least number of times, then systems that are built on aggregating data (or metadata) are terrible ones. It is not that these systems don’t get it right; they do, probably even 70-80% of the time, but they also get it terribly wrong the other 20-30% of the time. When you get an advertising or recommendation system wrong, it causes a bit of embarrassment and maybe much ire. When you get a surveillance system wrong, you wind up putting way too many innocent people behind bars and destroying their lives.
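To put rough numbers on that, here is a minimal back-of-the-envelope sketch. It borrows the 50 bad apples in a batch of 200,000 from earlier and assumes, purely for illustration, a system that is right 80% of the time about guilty and innocent people alike. The function name and the figures are hypothetical; no real system is being described.

```python
def flagged_counts(population, bad_actors, accuracy):
    """Estimate who gets flagged by a system with the given accuracy.

    Assumes (for illustration only) that the system is equally accurate
    on guilty and innocent people. Returns a tuple of
    (guilty people flagged, innocent people flagged, share of flags that are guilty).
    """
    innocents = population - bad_actors
    true_positives = round(bad_actors * accuracy)        # guilty and flagged
    false_positives = round(innocents * (1 - accuracy))  # innocent but flagged
    precision = true_positives / (true_positives + false_positives)
    return true_positives, false_positives, precision


# Hypothetical figures: 50 bad actors among 200,000 people, 80% accuracy.
tp, fp, precision = flagged_counts(200_000, 50, 0.8)
print(f"guilty flagged:   {tp}")
print(f"innocent flagged: {fp}")
print(f"share of flagged people who are guilty: {precision:.4%}")
```

Under these assumed figures, the system flags 40 of the 50 bad actors along with 39,990 innocent people, so only about one flagged person in a thousand is actually guilty. The exact numbers matter less than the shape of the result: when the people you are looking for are rare, even a mostly-right system buries them under wrongly flagged innocents.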

People who work with big data in advertising and other online operations will be the first ones to tell you that these systems need constant tweaking and that they are always prone to known and unknown biases arising from how the data is sampled and collected. In working with big data sets, the first assumption you make is that you are probably seeing what you want to see, as what you are collecting often has the bias of your desired outcome built into it.

The Sordid Tale Of Bad Outcomes Born Of Good Intentions

With all of these flaws, why are the intelligence and law enforcement communities so strongly drawn to wholly embracing these technologies? The answer lies in how the nature of conflict has changed in the 21st century.

Once upon a time, wars were simple affairs. A strong army would, more often than not, decimate a weak one, take over the lands, wealth and people of the defeated and expand its kingdom. These used to be pretty isolated and straightforward affairs.

Modern warfare bears little resemblance to any of that. For one, absolute might has become less relevant in these times. The fear of a lone bomber these days causes more invisible damage than an actual bomb that kills many. This asymmetry has brought about a substantial shift towards placing absolute importance on prevention rather than retaliation.

The good intention is prevention. The bad outcome is all the snooping and data collection.

Enforcement and intelligence, anywhere, love preventive measures. The fine balancing act of imprisoning 20 innocents to catch two who are really guilty, in order to save 20 million, has always been a debate that rarely finds a conclusion agreeable to everyone.

What makes the outcome so dangerous is that such profiling is based on actions that are performed by the majority of the population who have absolutely nothing in common with a person looking to blow up something.

The problem is that drawing such inferences gives enforcement and intelligence a magical shortcut to identifying subsets of people who can be further investigated on the basis of their belonging to the same bucket of people. Given how the inferences are made, it is easy to be bucketed in the same group if you have the same usage profile on a handful of harmless websites as a known suspect has.
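A toy sketch shows how little it takes to land in the same bucket. It assumes, hypothetically, that profiles are just sets of visited sites compared with Jaccard similarity, and that anything at or above an arbitrary threshold counts as a match. The site names, the threshold and the function names are all made up for illustration; actual profiling systems are not public and are certainly more elaborate.

```python
def jaccard(a, b):
    """Jaccard similarity of two collections: shared items / all items."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def same_bucket(profile, suspect_profile, threshold=0.5):
    """True if the profile overlaps enough with the suspect's (assumed threshold)."""
    return jaccard(profile, suspect_profile) >= threshold


# Hypothetical browsing profiles; only one site differs between the two.
suspect = {"news.example", "forum.example", "maps.example", "chem-supplies.example"}
commuter = {"news.example", "forum.example", "maps.example", "recipes.example"}

print(jaccard(suspect, commuter))
print(same_bucket(suspect, commuter))
```

In this sketch the commuter shares three harmless sites with the suspect, which is enough overlap to cross the assumed threshold, even though the one site that made the suspect interesting is nowhere in the commuter's history.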

And given that pretty much everyone has done something that is not entirely right at some point in their lives, this also opens up a vast avenue for abuse by an overactive arm of enforcement, based purely on suspicion rather than any actual fact.

More Encryption Is Not The Answer

Coming back to where we started, the fact is that encrypting anything and everything does not keep you safe from any of this. In fact, using so much encryption will probably mark you as suspicious from the outset, and that suspicion can be used to procure permission that will force either you or the organizations that act as intermediaries (ISPs, web hosts, the list is endless) to cooperate.

Another reason why encryption fails is this: even on a fully encrypted loop, if the other party you are communicating with is susceptible to pressure, all that is required is for that party to silently cooperate with whoever is investigating them. That requires no brute forcing or any other fancy tech. It just requires a single weak link in the chain and, unfortunately, the chain has many weak links.

In conclusion, the problem at hand is not a quandary that is technical in nature. It is one that is about the relationship between the state and its subjects. In a rather strange twist of fate, this is exactly what the terrorists wanted the modern state to become — one that lives in fear and lets that fear oppress its subjects.

Once we reach that point, it is an endless slide down the rabbit hole, and I am afraid we won’t realize the extent of that slide before a lot more damage is done.