The "Mosaic Effect"...
A few months ago I had the privilege to sit in on a presentation by an architect at the data.gov website. It was quite an interesting presentation and I was very glad I took some time out of my schedule to see what they were doing, where they started and what was next. During the question and answer portion of the presentation, almost every single question put to the presenter was about either the quality of the data available on the site or the security implications of making all this data available to anyone with an Internet connection.
The presenter’s answers to the data quality questions centered almost exclusively on the fact that the data available on the site was provided by the individual agencies, and the individual agencies needed to be responsible for its quality. In short, data.gov was just making the data available; if you had a problem with the data, go see the folks who created it. Makes some sense, I guess. If they had gotten hung up on worrying about all the data content they would never have gotten this far. Not sure I agree with it, but I guess it makes sense. I’ve mentioned in previous posts that I wished there was some way to provide feedback and voilà, they’ve recently added the ability to “rate” the data sets on four measures (overall utility, data utility, usefulness and ease of access).
Answers to the security related questions were certainly more thoughtful. The presenter noted that all the data sets available on data.gov were already readily available at the different U.S. Government agency websites; all data.gov was doing was making them much more easily accessible. It was clear, however, that security was on the mind of the presenter, and the tone of the conversation became much more serious during this part of the discussion. What was the primary security concern of the presenter? It wasn’t that an agency was going to post a single data set which would compromise security. He didn’t think anyone was that foolish (this was pre-WikiLeaks). The primary concern was that someone could merge multiple data sets together, piece them together if you will, to build a new data set which would in fact compromise security. The presenter called this scenario the “Mosaic Effect” and defined it as follows (paraphrased): “The Mosaic Effect is when seemingly innocent (the presenter used the word innocuous) bits of data, which by themselves are not a security concern, reveal secure information when combined.”
In 2004, ComputerWorld defined the Mosaic Effect as (paraphrased): “How a combination of seemingly innocuous bits of data can create a privacy breach when combined.”
In a statement on the CIO.gov website in early 2010, Vivek Kundra stated: “Individual pieces of data when released independently may not reveal sensitive information but when combined, this ‘mosaic effect’ could be used to derive personal information or information vital to national security.”
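To make those definitions a bit more concrete, here is a minimal Python sketch of how the piecing-together might work. Everything in it is made up for illustration: the two data sets, the names, the field names (`zip`, `birth_year`, `income_bracket`) and the `combine` function are my own assumptions, not anything the presenter showed or anything published on data.gov. One set has names but nothing sensitive, the other has sensitive values but no names, and the fields they share are enough to tie the two together.

```python
from collections import defaultdict

# Hypothetical data set A: a public directory (names, nothing financial).
directory = [
    {"name": "Alex Doe", "zip": "12345", "birth_year": 1970},
    {"name": "Sam Roe", "zip": "12345", "birth_year": 1982},
    {"name": "Pat Poe", "zip": "67890", "birth_year": 1970},
]

# Hypothetical data set B: a "de-identified" survey (financial, no names).
survey = [
    {"zip": "12345", "birth_year": 1970, "income_bracket": "75-100k"},
    {"zip": "67890", "birth_year": 1970, "income_bracket": "40-50k"},
    {"zip": "12345", "birth_year": 1982, "income_bracket": "100-150k"},
]


def combine(directory, survey):
    """Join the two sets on the fields they share (zip, birth_year)."""
    # Index the survey rows by the shared fields.
    by_key = defaultdict(list)
    for row in survey:
        by_key[(row["zip"], row["birth_year"])].append(row)

    mosaic = {}
    for person in directory:
        matches = by_key.get((person["zip"], person["birth_year"]), [])
        # Only a unique match ties one survey row to one named person.
        if len(matches) == 1:
            mosaic[person["name"]] = matches[0]["income_bracket"]
    return mosaic


mosaic = combine(directory, survey)
for name, bracket in mosaic.items():
    print(name, "->", bracket)
```

Neither list is much of a concern on its own, but the combined result attaches an income bracket to a name. Real data is messier than this toy example, and a real match would rarely be this clean, but the basic idea is the same.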
The “Mosaic Effect” hit home recently when a friend sent us a link to an online phonebook which aggregates data about people and makes it available via an online search engine. There are probably dozens of sites now which do this, but this particular site provided the following bits of information on my name: address, approximate age, approximate household income, approximate home value, hobbies and other household members. YIKES... Apparently they knew even more about me, which they were willing to share if I was willing to cough up $36 a year for full access to their database and search capabilities.
I’m not a privacy nut, but we’re moving into an age where just about any information you want to find out about a person can be found with an Internet browser. Data aggregation websites are exploiting this “Mosaic Effect”, pulling data together from social networks, online auctions, online real estate and tax databases and wherever else they can grab information about people. Like it or not, sooner or later you won’t have to ask someone “boxers or briefs”; you’ll pay $36 to some random data aggregator and you’ll find out the person’s waist size too.
Until next time...Rich