Data: surprisingly reliable, or reliably surprising?

Posted on: Friday 20th of July 2012

I’ve been catching up a little on underlying theories of information. Here’s what I found.

James Gleick, who also wrote Chaos, is a brilliant writer. His book The Information is a fascinating read, but it is ultimately disappointing because everything he says is based on the theories of Claude Shannon.

Shannon was a brilliant engineer who did much to make today’s information age possible. As an engineer he was focused on getting a message from A to B as accurately and efficiently as possible. He didn’t care what it said. So his theory of information said that to measure information all you need to do is count the bits (ones or zeroes) it contains. Thus, according to Shannon, Shakespeare’s Hamlet and a monkey battering away on a typewriter produce exactly the same amount of information if they produce the same number of words/bits. Not very enlightening.

I have to admit, I really struggled to get my head around Decoding Reality: The Universe as Quantum Information by Vlatko Vedral. Its basic argument is that everything, from physics and biology to sociology and economics, can be, and is being, rewritten from the point of view of information processing. Here are some quotes from the book:

“Any problem in nature can be reduced to a search for the correct answer amongst several (or a few million) incorrect answers. Searches are so ubiquitous they range from you searching for a file on your computer to a plant searching for a molecule to convert the sun’s energy to useful work.”

Everything as search … Wow! Here’s another:

“Anything that exists in this Universe, anything to which you can attribute any kind of reality only exists by virtue of the mutual information it shares with other objects in the Universe.”

OMG! I share information, therefore I am!

My third attempt was Information: The New Language of Science by Hans Christian von Baeyer. This is much more interesting. Von Baeyer is also interested in quantum information (which, I have to admit, I still don’t understand). Apparently, according to the theory of quantum information, Shannon’s approach to information is just one of an infinite number of different ways of measuring information. There is no single, correct answer. That’s intriguing.

He then focuses on meaning. To make a bet, a bookie needs a measure of the quality of the information he’s getting. High quality information is reliable. The bookie is concerned with the probability that any piece of information he gets is correct. A tip that is 100 percent sure is worth betting everything on. A tip that’s 50 percent sure is worth nothing (no better than a coin toss). Strangely, a tip that’s 100 percent sure to be wrong is also worth a lot: you simply bet the opposite way, so its reliability still gives you a basis for making a decision. So that’s one way of measuring the value of information: by how reliable it is; how far you can trust it to be accurate.
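One way to put rough numbers on the bookie’s intuition is to borrow a standard formula from Shannon’s own toolkit: one minus the binary entropy of the tip’s reliability. This is only a sketch, not von Baeyer’s own calculation. It assumes a simple yes/no tip, and the function names are made up for illustration.

```python
import math


def binary_entropy(p: float) -> float:
    """Uncertainty, in bits, of a yes/no outcome that is 'yes' with probability p."""
    if p in (0.0, 1.0):
        return 0.0  # a certain outcome leaves no uncertainty
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def tip_value(p_correct: float) -> float:
    """Illustrative 'worth' of a tip, in bits, given the chance that it is correct.

    Assumption: worth = 1 - binary entropy, i.e. how much of the one bit of
    yes/no uncertainty the tip removes.
    """
    if not 0.0 <= p_correct <= 1.0:
        raise ValueError("p_correct must be between 0 and 1")
    return 1.0 - binary_entropy(p_correct)


print(tip_value(1.0))  # always right: worth a full bit
print(tip_value(0.5))  # coin toss: worth nothing
print(tip_value(0.0))  # always wrong: worth a full bit again
```

On this scale a tip that is right 90 percent of the time and one that is wrong 90 percent of the time are worth the same, which is exactly the bookie’s point: what matters is how far the tip is from a coin toss, not which side of it the tip falls on.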

Another way of measuring the value of information is by what von Baeyer calls surprise. If you have a two-headed coin and you toss it, you have zero surprise if it comes up heads and 100 percent surprise if it comes up tails. Real coins are set at 50 percent – an equal chance of heads or tails.

In this context, the more surprising a piece of information is, the more valuable it is.
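Information theory has a standard measure for this kind of surprise, usually called surprisal: the less likely an outcome, the more bits of surprise it carries. Here is a minimal sketch, assuming that measure is a fair stand-in for the surprise described above (the function name is illustrative). Note that on this scale the impossible tails on a two-headed coin comes out as infinitely surprising rather than "100 percent".

```python
import math


def surprisal(p: float) -> float:
    """Surprise, in bits, of seeing an outcome that had probability p."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    if p == 0.0:
        return math.inf  # an 'impossible' outcome is infinitely surprising
    return math.log2(1 / p)


print(surprisal(1.0))  # heads on a two-headed coin: 0.0 bits, no surprise
print(surprisal(0.5))  # either face of a fair coin: 1.0 bit
print(surprisal(0.0))  # tails on a two-headed coin: inf
```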

Now, here we have two approaches to valuing information, and they lead us in opposite directions. Reliability creates value by eliminating surprise. Surprise, on the other hand, tells you something new; something that changes what you do.

Reliability and surprise.

These seem to me to be really powerful ways of understanding the value of information, especially when we consider them in combination.

You could say, for example, that ‘Big Data’ is all about the search for patterns and trends that you hadn’t seen before, which you can use as a basis for decision-making. Big Data is the search for surprisingly reliable information: you couldn’t see it before, but now you know you can trust what it’s telling you.

Volunteered Personal Information (VPI), on the other hand, is reliably surprising. VPI is always telling you new stuff, things that change your understanding of something; that change what you do. And it’s the fact that it is reliably surprising that makes it valuable.

So here’s a question: when it comes to the thorny issue of how to value information, are reliability and surprise (and their different combinations) a viable way forward?

Alan Mitchell