  [33]Tech by VICE

Researchers Find 'Anonymized' Data Is Even Less Anonymous Than We Thought

Corporations love to pretend that 'anonymization' of the data they collect
protects consumers. Studies keep showing that’s not really true.

  by [34]Karl Bode
  Feb 3 2020, 3:24pm

  Image: Cathryn Virginia

  Last fall, AdBlock Plus creator Wladimir Palant revealed that Avast was
  using its popular antivirus software to [38]collect and sell user data.
  While the effort was eventually [39]shuttered, Avast CEO Ondrej Vlcek
  first downplayed the scandal, assuring the public the collected data
  had been “anonymized”—or stripped of any obvious identifiers like names
  or phone numbers.

  “We absolutely do not allow any advertisers or any third party...to get
  any access through Avast or any data that would allow the third party
  to target that specific individual,” [40]Vlcek said.

  But analysis from students at Harvard University shows that
  anonymization isn’t the magic bullet companies like to pretend it is.
  For a class paper they have yet to publish, Dasha Metropolitansky and
  Kian Attari, two students at the [41]Harvard John A. Paulson School of
  Engineering and Applied Sciences, recently built a tool that combs
  through vast troves of consumer data exposed in breaches.

  “The program takes in a list of personally identifiable information,
  such as a list of emails or usernames, and searches across the leaks
  for all the credential data it can find for each person,” [42]Attari
  said in a press release.

  They told Motherboard their tool analyzed [43]thousands of datasets
  from data scandals ranging from the [44]2015 hack of Experian to the
  hacks and breaches that have plagued services from [45]MyHeritage to
  [46]porn websites. Although many of these datasets contain
  “anonymized” data, the students say that identifying actual users
  wasn’t all that difficult.

  “An individual leak is like a puzzle piece,” Harvard researcher Dasha
  Metropolitansky told Motherboard. “On its own, it isn’t particularly
  powerful, but when multiple leaks are brought together, they form a
  surprisingly clear picture of our identities. People may move on from
  these leaks, but hackers have long memories.”

  For example, one company might store only usernames, passwords,
  email addresses, and other basic account information, while another
  may have stored your browsing or location data. Independently, these
  datasets may not identify you, but collectively they reveal numerous
  intimate details that even your closest friends and family may not
  know.

  “We showed that an ‘anonymized’ dataset from one place can easily be
  linked to a non-anonymized dataset from somewhere else via a column
  that appears in both datasets,” Metropolitansky said. “So we shouldn’t
  assume that our personal information is safe just because a company
  claims to limit how much they collect and store.”
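
  The kind of linkage Metropolitansky describes can be sketched as a
  simple join. The Python below is a minimal illustration, not the
  students' tool: the records, the `device_id` column, and the
  `link_records` helper are all hypothetical, invented here to show how
  one shared column can reattach an email address to "anonymized" rows.

```python
# Hypothetical toy data: every value below is invented for illustration.
# An "anonymized" dataset: no names or emails, but it keeps a device ID.
anonymized_locations = [
    {"device_id": "d-101", "last_seen": "clinic.example"},
    {"device_id": "d-202", "last_seen": "gym.example"},
]

# A separate, non-anonymized leak that happens to include the same column.
leaked_accounts = [
    {"email": "alice@example.com", "device_id": "d-101"},
    {"email": "bob@example.com", "device_id": "d-202"},
]


def link_records(anon_rows, known_rows, shared_columns):
    """Join rows whose values match on every shared column."""
    # Index the identified rows by their shared-column values.
    index = {}
    for row in known_rows:
        key = tuple(row[col] for col in shared_columns)
        index.setdefault(key, []).append(row)

    # Look up each anonymized row and merge it with any match.
    linked = []
    for row in anon_rows:
        key = tuple(row[col] for col in shared_columns)
        for match in index.get(key, []):
            linked.append({**match, **row})
    return linked


linked = link_records(anonymized_locations, leaked_accounts, ["device_id"])
for person in linked:
    print(person["email"], "->", person["last_seen"])
```

  In this toy case, neither dataset alone ties an email address to a
  location. Joined on the shared column, both rows are re-identified.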

  The students told Motherboard they were “astonished” by the sheer
  volume of data now available online and on the dark web.
  Metropolitansky and Attari said that even with privacy scandals now a
  weekly occurrence, the public is dramatically underestimating the
  cumulative impact these leaks, hacks, and breaches have on privacy
  and security.

  Previous studies have shown that even within a single anonymized
  dataset, identifying users isn’t all that difficult.

  In one [47]2019 UK study, researchers were able to develop a machine
  learning model capable of correctly identifying 99.98 percent of
  Americans in any anonymized dataset using just 15 characteristics. A
  different [48]MIT study of anonymized credit card data found that users
  could be identified 90 percent of the time using just four relatively
  vague points of information.
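
  One way to see why a handful of attributes can be so identifying is
  to count how many records share each combination of values. The
  sketch below is only an illustration with invented records and a
  hypothetical `unique_fraction` helper. It does not reproduce the
  machine learning model from the 2019 study or the method of the
  credit card study.

```python
from collections import Counter

# Hypothetical toy records: all values are invented for illustration.
records = [
    {"zip": "02138", "birth_year": 1990, "gender": "F"},
    {"zip": "02138", "birth_year": 1990, "gender": "M"},
    {"zip": "02138", "birth_year": 1985, "gender": "F"},
    {"zip": "10001", "birth_year": 1990, "gender": "F"},
    {"zip": "10001", "birth_year": 1990, "gender": "F"},
]


def unique_fraction(rows, attributes):
    """Share of rows whose combination of attribute values is one of a kind."""
    combos = [tuple(row[a] for a in attributes) for row in rows]
    counts = Counter(combos)
    singled_out = sum(1 for combo in combos if counts[combo] == 1)
    return singled_out / len(rows)


print(unique_fraction(records, ["zip"]))                          # 0.0
print(unique_fraction(records, ["zip", "birth_year"]))            # 0.2
print(unique_fraction(records, ["zip", "birth_year", "gender"]))  # 0.6
```

  Each added attribute splits the records into smaller groups, so the
  share of people who stand alone rises even though no name appears
  anywhere in the data.
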
  Another [49]German study looking at anonymized user vehicle data found
  that 15 minutes’ worth of data from brake pedal use could let
  researchers identify the right driver, out of 15 options, roughly 90
  percent of the time. Another [50]2017 Stanford and Princeton study
  showed that deanonymizing user social networking data was also
  relatively simple.

  Individually, these data breaches are problematic; cumulatively,
  they’re a bit of a nightmare.

  Metropolitansky and Attari also found that despite repeated warnings,
  the public still isn’t using unique passwords or password managers. Of
  the 96,000 passwords contained in one of the program’s output
  datasets, just 26,000 were unique.
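
  For a sense of scale, the two figures above work out to roughly 27
  percent. The short Python sketch below does that arithmetic and shows
  one way such a count could be taken. It assumes "unique" means
  distinct values, which the article does not specify, and the
  `distinct_count` helper and sample list are hypothetical.

```python
# Figures reported above for one of the program's output datasets.
total_passwords = 96_000
unique_passwords = 26_000

unique_share = unique_passwords / total_passwords
print(f"{unique_share:.1%} of passwords were unique")  # 27.1%


def distinct_count(passwords):
    """Count distinct values (assumes "unique" means distinct)."""
    return len(set(passwords))


# Invented sample, only to show the counting step.
sample = ["hunter2", "123456", "hunter2", "letmein", "123456"]
print(distinct_count(sample), "distinct out of", len(sample))  # 3 distinct out of 5
```
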
  The problem is compounded by the fact that the United States still
  doesn’t have even a basic privacy law for the internet era, thanks in
  part to relentless lobbying from a [51]cross-industry coalition of
  corporations eager to keep this profitable status quo intact. As a
  result, penalties for data breaches and lax security are often [52]too
  pathetic to drive meaningful change.

  Harvard’s researchers told Motherboard there are several measures a
  meaningful U.S. privacy law could require to potentially mitigate the
  harm, including barring unauthorized employees from accessing data,
  maintaining better records on data collection and retention, and
  decentralizing data storage (not keeping corporate and consumer data on
  the same server).

  Until then, we’re left relying on the promises of corporations that
  have repeatedly proven their privacy promises aren’t worth all that
  much.

  Tagged:
         [53]data

References

  Visible links
  1. https://www.vice.com/en_us/rss
  2. https://www.vice.com/en_us/article/dygy8k/researchers-find-anonymized-data-is-even-less-anonymous-than-we-thought
  3. https://www.vice.com/en_ca/article/dygy8k/researchers-find-anonymized-data-is-even-less-anonymous-than-we-thought
  4. https://www.vice.com/en_asia/article/dygy8k/researchers-find-anonymized-data-is-even-less-anonymous-than-we-thought
  5. https://www.vice.com/en_us/article/3a8k79/do-ring-cameras-violate-wiretapping-laws-new-hampshire-is-about-to-find-out
  6. https://www.vice.com/en_us/article/7kzxzy/senator-mark-warner-ftc-not-doing-enough-on-browsing-data-avast-antivirus
  7. https://www.googletagmanager.com/ns.html?id=GTM-MSM4HQ4
  8. https://www.vice.com/en_us/article/dygy8k/researchers-find-anonymized-data-is-even-less-anonymous-than-we-thought#main-content
  9. https://www.vice.com/en_us
 10. https://www.viceland.com/en_us?_ga=2.122564107.1244600859.1568037773-1243485207.1550599999
 11. https://i-d.vice.com/en_us
 12. https://impact.vice.com/en_us
 13. https://www.refinery29.com/
 14. https://video.vice.com/en_us/
 15. https://vice.com/en_us/page/podcasts
 16. https://news.vice.com/en_us
 17. https://www.vice.com/en_us/section/tech
 18. https://www.vice.com/en_us/section/music
 19. https://www.vice.com/en_us/section/food
 20. https://www.vice.com/en_us/section/health
 21. https://www.vice.com/en_us/section/money
 22. https://www.vice.com/en_us/section/drugs
 23. https://www.vice.com/en_us/topic/uncommitted-iowa-2020
 24. https://www.vice.com/en_us/topic/2020
 25. https://www.vice.com/en_us/section/identity
 26. https://www.vice.com/en_us/section/games
 27. https://www.vice.com/en_us/section/entertainment
 28. https://www.vice.com/en_us/section/environment
 29. https://www.vice.com/en_us/section/travel
 30. https://www.vice.com/en_us/astroguide
 31. https://www.vice.com/en_us/section/sex
 32. https://www.vice.com/en_us/topic/vice-magazine
 33. https://www.vice.com/en_us/section/tech
 34. https://www.vice.com/en_us/contributor/karl-bode
 35. https://www.vice.com/en_us/article/dygy8k/researchers-find-anonymized-data-is-even-less-anonymous-than-we-thought#javascript
 36. https://www.vice.com/en_us/article/dygy8k/researchers-find-anonymized-data-is-even-less-anonymous-than-we-thought#javascript
 37. https://www.vice.com/en_us/article/dygy8k/researchers-find-anonymized-data-is-even-less-anonymous-than-we-thought#javascript
 38. https://palant.de/2019/10/28/avast-online-security-and-avast-secure-browser-are-spying-on-you/
 39. https://www.vice.com/en_us/article/wxejbb/avast-antivirus-is-shutting-down-jumpshot-data-collection-arm-effective-immediately
 40. https://www.forbes.com/sites/thomasbrewster/2019/12/09/are-you-one-of-avasts-400-million-users-this-is-why-it-collects-and-sells-your-web-habits/
 41. https://www.seas.harvard.edu/
 42. https://www.seas.harvard.edu/news/2020/01/imperiled-information
 43. https://docs.google.com/spreadsheets/d/1A7y6Y5cgObJvoq3sIK-6K9PJ-XAaZ8QR99cD_Og-0RY/edit#gid=1989660935
 44. https://www.theguardian.com/business/2015/oct/01/experian-hack-t-mobile-credit-checks-personal-information
 45. https://www.vice.com/en_us/article/vbqyvx/myheritage-hacked-data-breach-92-million
 46. https://www.vice.com/en_us/article/78k849/hacker-breaches-porn-network-advertises-user-data-on-dark-web
 47. https://www.nature.com/articles/s41467-019-10933-3
 48. http://news.mit.edu/2018/privacy-risks-mobility-data-1207
 49. http://www.autosec.org/pubs/fingerprint.pdf
 50. https://www.cs.princeton.edu/~arvindn/publications/browsing-history-deanonymization.pdf
 51. https://www.eff.org/deeplinks/2017/10/how-silicon-valleys-dirty-tricks-helped-stall-broadband-privacy-california
 52. https://www.vice.com/en_us/article/d3agv7/the-equifax-settlement-is-a-cruel-joke
 53. https://www.vice.com/en_us/topic/data
