This lesson is still being designed and assembled (Pre-Alpha version)

Aggregation & Re-identification

Overview

Teaching: 10 min
Exercises: 10 min
Questions
  • If de-identified data can be re-identified, and anonymization is hard to guarantee, how can we protect our patrons’ privacy?

Objectives
  • Summarize and aggregate data about individual patrons into data about the population

  • Evaluate the risk of re-identification by looking at the size of the smallest sub-populations described
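
One way to approach the second objective is to count how many records share each combination of attributes and then look at the smallest groups. The sketch below is illustrative only: the records, the field names (`age_band`, `branch`), the helper functions, and the threshold of 3 are all assumptions made for this example, not values prescribed by the lesson.

```python
from collections import Counter

# Hypothetical de-identified records: names and card numbers are already
# removed, but age band and branch remain as grouping attributes.
records = [
    {"age_band": "18-29", "branch": "Central"},
    {"age_band": "18-29", "branch": "Central"},
    {"age_band": "18-29", "branch": "Central"},
    {"age_band": "30-49", "branch": "Central"},
    {"age_band": "30-49", "branch": "Central"},
    {"age_band": "30-49", "branch": "Eastside"},
    {"age_band": "65+", "branch": "Eastside"},
]


def group_sizes(rows, keys):
    """Count how many records share each combination of the given keys."""
    return Counter(tuple(row[k] for k in keys) for row in rows)


def smallest_groups(rows, keys, threshold):
    """Return (group, size) pairs with fewer than `threshold` members, smallest first."""
    sizes = group_sizes(rows, keys)
    small = [(group, n) for group, n in sizes.items() if n < threshold]
    return sorted(small, key=lambda item: item[1])


# Fine-grained grouping: age band and branch together.
sizes = group_sizes(records, ["age_band", "branch"])
min_size = min(sizes.values())

# The threshold of 3 is an arbitrary value chosen for illustration.
risky = smallest_groups(records, ["age_band", "branch"], threshold=3)

# Coarser grouping: branch only, which merges the small groups.
coarse_sizes = group_sizes(records, ["branch"])

print("Smallest group size:", min_size)
print("Groups below threshold:", risky)
print("Sizes by branch only:", dict(coarse_sizes))
```

A group of size 1 describes exactly one patron, so it offers no protection even though the record carries no name. Grouping by fewer or broader attributes makes the smallest group larger, at the cost of less detail.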

Key Points

  • Data aggregation combines records so that the result no longer refers to specific individuals and instead reveals insights about groups within the population.

  • Data that is both de-identified and aggregated can still be valuable for analysis while posing less risk to our patrons’ privacy.
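
As a minimal sketch of what aggregation can look like in practice, the example below turns per-patron rows into per-branch summaries and drops the patron identifier along the way. The data, the field names (`patron_id`, `branch`, `checkouts`), and the `aggregate_by` helper are hypothetical and exist only for illustration.

```python
from collections import defaultdict

# Hypothetical per-patron records (illustrative data only).
patrons = [
    {"patron_id": "P001", "branch": "Central", "checkouts": 12},
    {"patron_id": "P002", "branch": "Central", "checkouts": 4},
    {"patron_id": "P003", "branch": "Central", "checkouts": 8},
    {"patron_id": "P004", "branch": "Eastside", "checkouts": 3},
    {"patron_id": "P005", "branch": "Eastside", "checkouts": 7},
]


def aggregate_by(rows, group_key, value_key):
    """Summarize rows per group: patron count, total, and mean of value_key.

    Only the grouping attribute and the summary statistics are kept, so
    individual identifiers such as patron_id never reach the output.
    """
    totals = defaultdict(lambda: {"patrons": 0, "total": 0})
    for row in rows:
        bucket = totals[row[group_key]]
        bucket["patrons"] += 1
        bucket["total"] += row[value_key]
    return {
        group: {
            "patrons": b["patrons"],
            "total": b["total"],
            "mean": b["total"] / b["patrons"],
        }
        for group, b in totals.items()
    }


summary = aggregate_by(patrons, "branch", "checkouts")
for branch, stats in summary.items():
    print(branch, stats)
```

The summary still answers questions such as which branch sees more checkouts per patron, while no output row corresponds to a single person. The `patrons` count in each row also shows how small each described group is.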