Differential Privacy – 2006 AD

Cynthia Dwork (b. 1958), Frank McSherry (b. 1976), Kobbi Nissim (b. 1965), Adam Smith (b. 1977)

“Differential privacy was conceived in 2006 by Cynthia Dwork and Frank McSherry, both at Microsoft Research; Kobbi Nissim at Ben-Gurion University in Israel; and Adam Smith at Israel’s Weizmann Institute of Science to solve a common problem in the information age: how to use and publish statistics based on information about individuals, without infringing on those individuals’ privacy.

Differential privacy provides a mathematical framework for understanding the privacy loss that results from data publications. Starting with a mathematical definition of privacy—the first ever—it provides information custodians with a formula for determining the amount of privacy loss that might result to an individual as a consequence of a proposed data release. Building on that definition, the inventors created mechanisms that allow statistics about a dataset to be published while retaining some amount of privacy for those in the dataset. How much privacy is retained depends on the accuracy of the intended data release: differential privacy gives data holders a mathematical knob they can use to decide the balance between accuracy and privacy.

For example, using differential privacy, a hypothetical town could publish “privatized” statistics that were mathematically guaranteed to protect individual privacy, while still producing aggregate statistics that could be used for traffic planning.

In the years following the discovery, there were a number of high-profile incidents in which data and statistics were published that were supposedly aggregated or deidentified, but for which the data contributed by specific individuals could be disaggregated and reidentified. These cases, combined with undeniable mathematical proofs about the ease of recovering individual data from aggregate releases, sparked interest in differential privacy among businesses and governments. In 2017, the US Census Bureau announced that it would use differential privacy to publish the statistical results of the 2020 census of population and households.”
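The mathematical definition mentioned in the excerpt is usually stated as the inequality below. The symbols are notation introduced for this note and do not appear in the excerpt: M is the randomized release mechanism, D and D' are two datasets that differ in one individual's record, S is any set of possible outputs, and ε is the privacy-loss parameter.

```latex
\Pr[\,M(D) \in S\,] \;\le\; e^{\varepsilon} \cdot \Pr[\,M(D') \in S\,]
\quad \text{for all neighboring } D, D' \text{ and all output sets } S
```

A smaller ε means the published output reveals less about any one person, at the cost of noisier statistics. In that sense ε plays the role of the "knob" between accuracy and privacy.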

SEE ALSO Public Key Cryptography (1976), Zero-Knowledge Proofs (1985)

Differential privacy addresses how to maintain the privacy of individuals while using and publishing statistics based on their data.
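As a rough illustration of the accuracy-versus-privacy trade-off described in the excerpt, the sketch below adds Laplace noise to a single count, which is one standard way to build a differentially private release. The function names, the town figure, and the parameter name epsilon are assumptions made for this example and are not taken from the excerpt. The sketch is not hardened for real use: a production release would need a vetted library and care around floating-point noise generation.

```python
import random


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) noise.

    The difference of two independent exponential draws with mean `scale`
    follows a Laplace distribution centred on zero with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(true_count, epsilon, rng=None):
    """Return a noisy count calibrated for epsilon-differential privacy.

    Assumes adding or removing one person changes the count by at most 1
    (sensitivity 1), so the Laplace noise scale is 1 / epsilon.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = rng or random.Random()
    sensitivity = 1.0
    return true_count + laplace_noise(sensitivity / epsilon, rng)


if __name__ == "__main__":
    # Hypothetical figure: 1,200 residents commute through one intersection.
    rng = random.Random(2006)
    for epsilon in (0.1, 1.0, 10.0):
        print(epsilon, round(private_count(1200, epsilon, rng), 1))
```

With a small epsilon the published count can be off by tens of commuters, which protects individuals more strongly. With a large epsilon the count is nearly exact, which is more useful for traffic planning but offers weaker protection.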

Fair Use Sources: B07C2NQSPV

2006, Differential Privacy

Dwork, Cynthia, and Aaron Roth. The Algorithmic Foundations of Differential Privacy. Breda, Netherlands: Now Publishers, 2014.