Keren Gu's Picazón

Sharing and Scratching Itches Everywhere

Sharing Personal Data with No Fear


This post is inspired by a NYTimes article.

Something worth pondering is a method that would let companies like Facebook share their data without revealing anyone’s private information.

There are problems with both sharing and not sharing the data held by Facebook and similar companies. As users, we don’t want people looking into this data and finding things that pertain to us individually. As scientists, we want transparency and the release of research data for verification. For the sake of science.

While working at the Mobile Experience Lab (@MIT Media Lab) with Avea, a Turkish mobile company, on methods of identifying “power users” in their mobile network, our research was severely limited by the lack of real data to analyze. Our collaboration also became very tedious. The workflow included generating random data in order to test our algorithm. Yet we had nothing to look at and no patterns to find once our algorithm was run.

However, from a purely theoretical point of view, there needs to be a way for scientists to analyze user data without being able to pin down a particular individual.

At a very high level, I see two different ways to do this. One is to eliminate information such as name, address, and phone number before the data is sent for analysis. The other is a method that prevents researchers from analyzing individual records and only allows experiments to run on batches of data. Any attempt to home in on a specific person or a small group of people would yield inaccurate data.

As for the first suggestion, there is a clear problem. For most of us, our name, address, and phone number are what identify us. But others may work at a company of one or a few people, so their employer becomes another identifier. This also extends to their network and their selection of “likes”. Their statuses may reveal things about them. Following this argument, we soon eliminate almost all information about a person, and therefore share no data. It is clear that this will not work.
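To make the problem concrete, here is a minimal sketch in Python. The records, the field names (`employer`, `city`, `likes`), and the helper `singled_out` are all made up for illustration. Names, addresses, and phone numbers are already gone, yet some rows are still the only ones with their particular combination of values.

```python
from collections import Counter

# Hypothetical records with names, addresses, and phone numbers already removed.
FIELDS = ("employer", "city", "likes")

records = [
    {"employer": "BigCorp", "city": "Boston", "likes": "jazz"},
    {"employer": "BigCorp", "city": "Boston", "likes": "jazz"},
    {"employer": "BigCorp", "city": "Boston", "likes": "jazz"},
    {"employer": "TinyStartup", "city": "Boston", "likes": "jazz"},
    {"employer": "BigCorp", "city": "Istanbul", "likes": "chess"},
]


def singled_out(rows, fields):
    """Return the rows whose combination of `fields` is shared with nobody else."""
    sizes = Counter(tuple(row[f] for f in fields) for row in rows)
    return [row for row in rows if sizes[tuple(row[f] for f in fields)] == 1]


# Anyone who knows a person works at TinyStartup can find that person's row.
for row in singled_out(records, FIELDS):
    print(row)
```

Removing `employer` as well only moves the problem: in this toy data the `city` or `likes` column alone still isolates one row, which is the slippery slope described above.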

As for the second suggestion, there could be hope. If we can add a virtual layer between the data and the researchers, a layer that encrypts or smudges the data so that it’s impossible for us or our programs to line up individual records and find anything particular about a single person or a small group of people, then we will be close to resolving the tension between companies, users, and scientists.
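One possible shape for such a layer, sketched in Python under my own assumptions: researchers never see rows, they can only ask for counts, and every count comes back with random noise added. The noise here is Laplace noise, the kind used in differential privacy; that is a technique I am borrowing for illustration, not something worked out in this post. The class name `AggregateOnlyLayer`, the `epsilon` parameter, and the sample data are hypothetical.

```python
import random


class AggregateOnlyLayer:
    """Hypothetical layer that sits between the raw records and the researchers."""

    def __init__(self, rows, epsilon=0.5, seed=None):
        self._rows = list(rows)        # raw rows are never handed back to callers
        self._epsilon = epsilon        # assumption: smaller epsilon means more noise
        self._rng = random.Random(seed)

    def _laplace(self):
        # The difference of two exponential draws with rate epsilon is
        # Laplace noise with scale 1/epsilon.
        return (self._rng.expovariate(self._epsilon)
                - self._rng.expovariate(self._epsilon))

    def noisy_count(self, predicate):
        """Return how many rows match `predicate`, plus random noise."""
        true_count = sum(1 for row in self._rows if predicate(row))
        return true_count + self._laplace()


# Made-up data: 1000 users, 400 of whom like jazz, one of whom works at TinyStartup.
rows = [{"likes_jazz": i < 400, "employer": "TinyStartup" if i == 0 else "BigCorp"}
        for i in range(1000)]
layer = AggregateOnlyLayer(rows, epsilon=0.5, seed=1)

print(layer.noisy_count(lambda r: r["likes_jazz"]))                 # close to 400
print(layer.noisy_count(lambda r: r["employer"] == "TinyStartup"))  # noise swamps the true 1
```

The same few units of noise barely matter for a batch of 400 people but drown out a group of one, which is the behavior I was hoping for. A real system would also have to limit how many questions can be asked, since repeating the same query and averaging the answers would wash the noise out.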

I haven’t thought enough to figure out if this is possible or not. I could totally be bluffing.


2 thoughts on “Sharing Personal Data with No Fear”

  1. In a large set of data, you wouldn’t need names or phone numbers, so they would have to be stripped off the data to begin with. The second idea you mentioned would become the only way of doing the research. With a small set of data, what could work is that if social networks allow a person to completely hide themselves so that even if one person has liked a certain fan page and he is the only one who liked it, he is able to hide himself so that information is not shown. That way data on a small scale can be used. What do you think?

    • Thanks for the comment! I was thinking that on a small scale, there could be a set of fake data that doesn’t match to any real person. We did something like this while working with Avea. This will help debugging algo, not for drawing any kind of conclusion. =)
