Columbia, Google Data Researchers Find Your Secret Web Identity Isn’t as Safe as You Think

May 09, 2016

Researchers at Columbia University's Data Science Institute built an online tool—You Are Where You Go—to let people audit their social media trail.

In cyberspace, anonymity is increasingly hard to come by. In a study likely to raise privacy concerns, Columbia researchers are the latest to show how easy it is to unmask someone registered under a fake name on social media. Working with Google, the researchers demonstrated that location-tagged posts on just two social media apps are enough to identify accounts held by the same person. The team presented its results at the World Wide Web conference in Montreal in April.

Previous research has shown that individual online shoppers, taxicab riders and Netflix subscribers are all easily identified in data sets that have been stripped of people’s names. The new research takes those findings a step further by showing that any data set with location information, such as credit card purchases, Foursquare check-ins or Instagram posts, can be easily linked to other data sets to identify someone.

“Many people choose not to identify themselves online,” said study coauthor Augustin Chaintreau, an assistant professor of computer science at Columbia Engineering and a member of the Data Science Institute. “If I now tell you that your location data makes you recognizable across all of your accounts, how does that change your behavior? This is a question we now have to answer.”

The team’s algorithm compares geotagged Twitter posts with posts on Instagram and Foursquare to link accounts held by the same person, even if the accounts have been stripped of biographical information. It calculates the probability that one person posting at a given time could also be posting in a second app at another time and place. Using a similar method, the researchers could also identify shoppers by matching anonymous credit card purchases against logs of their mobile phones pinging the nearest cell tower.
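For readers curious how such a comparison might work in practice, here is a minimal sketch in Python. It is not the researchers' algorithm. It swaps in a much cruder test: for every pair of posts made close together in time on two accounts, it asks whether one person could plausibly have traveled between the two places. The function names, the 120 km/h speed ceiling, the one-hour comparison window and the 1 km allowance for geotag error are all illustrative assumptions.

```python
import math
from itertools import product

EARTH_RADIUS_KM = 6371.0
MAX_SPEED_KMH = 120.0  # assumed ceiling on plausible travel speed (illustrative)
TIME_WINDOW_H = 1.0    # only compare posts made within this many hours (illustrative)
GEOTAG_SLACK_KM = 1.0  # assumed allowance for imprecise geotags (illustrative)


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def link_score(posts_a, posts_b):
    """Fraction of near-simultaneous post pairs one person could have made.

    Each post is a tuple (hours, latitude, longitude).
    Returns None when no pair of posts falls inside the time window.
    """
    feasible = total = 0
    for (t1, lat1, lon1), (t2, lat2, lon2) in product(posts_a, posts_b):
        dt = abs(t1 - t2)
        if dt > TIME_WINDOW_H:
            continue  # posts too far apart in time to be informative here
        total += 1
        reachable_km = MAX_SPEED_KMH * dt + GEOTAG_SLACK_KM
        if haversine_km(lat1, lon1, lat2, lon2) <= reachable_km:
            feasible += 1
    return feasible / total if total else None


def best_match(account, candidates):
    """Name of the candidate account with the highest link score, or None."""
    scores = {name: link_score(account, posts)
              for name, posts in candidates.items()}
    scores = {name: s for name, s in scores.items() if s is not None}
    return max(scores, key=scores.get) if scores else None


# Made-up example: one account posting in New York, two candidate accounts.
twitter = [(0.0, 40.7128, -74.0060), (5.0, 40.7306, -73.9866)]
candidates = {
    "instagram_nyc": [(0.5, 40.7130, -74.0050), (5.2, 40.7300, -73.9900)],
    "instagram_la": [(0.5, 34.05, -118.24), (5.1, 34.05, -118.24)],
}
print(best_match(twitter, candidates))
```

On this made-up data the New York candidate is the only one whose posts are physically compatible with the first account, so it is selected. A real system would need a probabilistic model and far more data; the point is only that time and place alone carry enough signal to narrow the field.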

Of all the digital traces we leave in daily life, location metadata may be the most revealing. People’s real-world movements are so distinctive that many users can be identified from surprisingly few data points. Even when people adjust their privacy settings or fill in their profiles with a fake name or age to disguise their identity, their location information may give them away, the study concludes.

How much information is leaking out? The researchers devised a way to find out. Two undergraduates, Daniel Echikson at Columbia College and Stephanie Huang at Columbia Engineering, with study coauthor Chris Riederer, a graduate student in computer science at Engineering, built an online tool—You Are Where You Go—to let people audit their social media trail. With a few clicks, the tool retraces the user’s steps on Twitter, Instagram and Foursquare. A few simple algorithms then process this information to make relatively accurate inferences about the user’s age, ethnicity, income, and even whether or not they have children.

Location-tracking data is now embedded in phones and many apps precisely because it’s so useful. It’s what allows users to get accurate directions instantly, learn that a friend is unexpectedly nearby, or find out that a store in the neighborhood is offering a promotion. These perks, however, come with large privacy risks that remain poorly understood.

“People are now sharing their location on a growing number of apps, often without realizing it,” cautions Riederer. “Companies no longer have to be very sophisticated to access this data and use it for their own purposes.”

— By Kim Martineau, Data Science Institute