Thanks to Tim O'Reilly for this look at how Jeff Jonas Explores the Nature of Data
On a trip to Washington D.C., Jonas spoke with a counter-terrorism intelligence analyst at a governmental agency. "What do you wish you could have if you could have anything?" Jonas asked her. Answers to my questions faster, she said. "It sounds reasonable," Jonas told the audience, "but then I realized it was insane." Insane, because "What if the question was not a smart question today, but it's a smart question on Thursday?" Jonas says.
I can see at least three problems here that Jeff can't solve, and they continue to give me hope that Big Brother will drown in his own data.
The point is, we cannot assume that the data needed to answer the query existed and had been recorded before the query was asked. In other words, it's a timing problem. "I said, 'What are the chances you could have every smart question, every day?'"
[...] Jonas related an example of a financial scam at a bank. An outside perpetrator is arrested, but investigators suspect he may have been working with somebody inside the bank. Six months later, one of the employees changes their home address in the payroll system to the same address as in the case. How would they know that occurred, Jonas asked. "They wouldn't know. There's not a company out there that would have known, unless they're playing the game of data finds data and the relevance finds the user."
This led Jonas to expound his first principle. "If you do not treat new data in your enterprise as part of a question, you will never know the patterns, unless someone asks."
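The "data finds data" idea in the bank example can be made concrete with a small sketch. This is only an illustration under stated assumptions, not Jonas's actual system: the `PersistentContext` class, its `observe` method, the record ids, and the `address` field are all hypothetical names, and matching is done on exact attribute values. Each incoming record is first used as a query against everything already stored, and only then stored itself.

```python
from collections import defaultdict


class PersistentContext:
    """Hypothetical store in which every new record is also a question."""

    def __init__(self):
        # (field, value) -> ids of records that have carried that value
        self._index = defaultdict(list)

    def observe(self, record_id, record):
        """Return ids of earlier records sharing any attribute value,
        then index the new record so later data can find it."""
        related = set()
        for field, value in record.items():
            related.update(self._index.get((field, value), []))
        related.discard(record_id)  # a record is not related to itself

        for field, value in record.items():
            ids = self._index[(field, value)]
            if record_id not in ids:
                ids.append(record_id)  # old values stay indexed as history
        return sorted(related)


ctx = PersistentContext()
# The fraud case is recorded first; nothing relates to it yet.
first = ctx.observe("case-001", {"address": "12 Elm St"})
# An employee record with an unrelated address.
ctx.observe("employee-42", {"address": "9 Oak Ave"})
# Six months later the payroll address changes: the update itself is the query.
alert = ctx.observe("employee-42", {"address": "12 Elm St"})
print(alert)
```

Nobody has to ask "does any employee live at the suspect's address?"; the match surfaces at the moment the payroll change arrives. A real system would need fuzzy matching of names and addresses, which this sketch leaves out.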
[...] Getting smarter by asking questions with every new piece of data is the same as putting a picture puzzle together, Jonas said. This is something that Jonas calls persistent context. "You find one piece that's simply blades of grass, but this is the piece that connects the windmill scene to the alligator scene," he says. "Without this one piece that you asked about, you'd have no way of knowing these two scenes are connected."
[...] But large numbers can also work against you. At another federal agency (he wouldn't say which), Jonas got to thinking: What if they had a very large data warehouse in the basement with 4 exabytes (EB) of data, and it was expanding at the rate of 5 TB per minute? "You sit there and you realize you don't get to Friday night and run a batch job to answer the question of what does it all mean," he says. "You could use all the computing power and energy on Earth and you wouldn't be able to do it." The "it" he is referring to, of course, is seeing how each new piece of data affects all the other pieces of data.
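To get a feel for the figures in that thought experiment, here is a rough back-of-the-envelope calculation. It assumes decimal units (1 EB = 1,000,000 TB) and a constant growth rate, neither of which the article states, and the variable names are mine. It only shows how much new data lands between one weekly batch run and the next; it does not attempt to model the cost of relating every piece to every other piece.

```python
TB_PER_EB = 1_000_000          # assumption: decimal (SI) units

warehouse_tb = 4 * TB_PER_EB   # 4 EB already stored
growth_tb_per_min = 5          # 5 TB arriving every minute

minutes_per_week = 60 * 24 * 7
weekly_growth_tb = growth_tb_per_min * minutes_per_week

# New data per week as a share of what is already in the warehouse
weekly_growth_pct = 100 * weekly_growth_tb / warehouse_tb

# Days until the warehouse doubles, if the rate stayed constant
days_to_double = warehouse_tb / growth_tb_per_min / (60 * 24)

print(f"{weekly_growth_tb:,} TB of new data per week")
print(f"{weekly_growth_pct:.2f}% of the existing warehouse")
print(f"about {days_to_double:.0f} days to double")
```

Under these assumptions a weekly batch job would face roughly 50 PB of data it had never seen each time it ran.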
"What's happening is data volumes are growing at this pace, yet an organization's ability to make sense of them isn't keeping up," Jonas said. "Today, say you can make sense of 7 percent of what's available, and in a few years it might be 4 percent, and in a few years after that it might be one percent. So the percentage of what's knowable is on the decline."
[...] "I think the only way forward is going from applying algorithms to individual transactions, to first placing information in context--pixels to pictures--and only applying algorithms after one sees how the transaction relates to the other data," he said. "It's the only way that I can see that it's going to close this sense-making gap."
- The counterterrorism expert's problem is not that she can't get her questions answered, nor, as he asserts, that she needs to know what questions to ask and when to ask them. Her problem is that even when she has the right answers to the right questions, they won't be actioned unless they help with “fixing” the data around the policy.
- His approach essentially requires that the organisation and its management continuously reconstruct their view of reality and constantly work without a context into which to fit the data because, by definition, the next bit of data could collapse and reconstruct that picture as something completely new. The level of self-confidence that would take, for people whose lives depend on structure, order, process, CYA best practice and, above all, control, is wholly at odds with the kind of people you recruit to fill those roles. Oh, and
- You still can't know everything. The data you have can only tell you so much, but the data you are not collecting is what will kill you.