Protecting Patron Privacy Pythonically: Glossary

Key Points

Introduction	First key point. Brief Answer to questions. (FIXME)
Importing Data with Pandas	Use `import` to load a library and make it available to your own code. Use the `help()` function to see the built-in documentation for a library. Import data into Python with the pandas library. The `info()` method of a DataFrame displays a summary of your imported data.
Working With Data	Data imported using the pandas library is organized into a powerful structure called a DataFrame. DataFrames have many useful features for putting data to use.
PII and Other Risky Data	Personally Identifiable Information (PII) comes in two types. In a library context, PII 1 is information about a patron (e.g. name, date of birth, library barcode). PII 2 is information about a patron's activities, along with other information that can be linked back to that patron (e.g. search history, circulation records, access to electronic resources). By making connections within a pool of data, it is possible to identify specific patrons and their activities. Limiting the data we collect and how long we keep it can help mitigate these risks.
Parsing Data with Functions	Write functions to efficiently run code you want to reuse. Functions can make use of other functions, both those you import from libraries and those you write yourself. Well-written and tested functions can reliably do things that might be hard to accomplish by hand.
De-identification	De-identification is the process of removing or obscuring PII, such that the remaining information does not identify an individual. De-identified information can be re-identified, given access to the right information (e.g. the algorithm or pseudonym used for de-identification or sufficient data from other sources about the patrons in the original data). Anonymization is the process of de-identifying information in such a way that it cannot be re-identified, usually by means of statistical disclosure limitation techniques. Due to continuous advances in computation technology, full anonymity is difficult (some would say impossible) to guarantee.
Aggregation & Re-identification	Data aggregation is the process of combining data in such a way that it no longer refers to specific individuals, but rather reveals insight about groups within the population. Data which is both de-identified and aggregated can still be valuable for analysis while posing less risk to the privacy of our patrons.
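
The de-identification point above can be illustrated with a minimal sketch. It assumes a small, made-up table of circulation records; the column names (`patron_name`, `barcode`, `item_type`), the `SECRET_KEY` value, and the helper functions `pseudonymize()` and `deidentify()` are all hypothetical and not part of the lesson's data. A keyed hash is only one possible way to produce pseudonyms, and anyone who holds the key can re-identify the records, which is why this counts as de-identification and not anonymization.

```python
import hashlib
import hmac

import pandas as pd

# Hypothetical secret key; in practice it would be stored apart from the data.
SECRET_KEY = b"replace-with-a-real-secret"


def pseudonymize(value, key=SECRET_KEY):
    """Return a keyed SHA-256 pseudonym (first 16 hex characters) for one identifier."""
    digest = hmac.new(key, str(value).encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def deidentify(df, direct_identifiers, pseudonym_column):
    """Drop direct identifiers and replace one column with pseudonyms."""
    # drop() returns a new DataFrame, so the original data is left untouched.
    result = df.drop(columns=direct_identifiers)
    result[pseudonym_column] = result[pseudonym_column].map(pseudonymize)
    return result


# Made-up circulation records, for illustration only.
checkouts = pd.DataFrame(
    {
        "patron_name": ["A. Reader", "B. Borrower", "A. Reader"],
        "barcode": ["2001", "2002", "2001"],
        "item_type": ["book", "dvd", "book"],
    }
)

deidentified = deidentify(checkouts, ["patron_name"], "barcode")
print(deidentified)
```

The same barcode always maps to the same pseudonym, so checkouts by one patron can still be counted together without the name or the real barcode appearing in the output.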

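Aggregation, as described in the key points above, can be sketched with a pandas `groupby()`. The records below are invented for illustration, and the column names (`patron_id`, `branch`, `item_type`) are assumptions, not fields from the lesson's dataset. The result reports counts per branch and no longer contains a row for any individual patron.

```python
import pandas as pd

# Made-up, already de-identified checkout records.
checkouts = pd.DataFrame(
    {
        "patron_id": ["p1", "p2", "p1", "p3", "p4", "p4"],
        "branch": ["Main", "Main", "Main", "East", "East", "East"],
        "item_type": ["book", "dvd", "book", "book", "book", "dvd"],
    }
)

# One row per branch: total checkouts and number of distinct patrons.
summary = (
    checkouts.groupby("branch")
    .agg(
        checkouts=("item_type", "count"),
        patrons=("patron_id", "nunique"),
    )
    .reset_index()
)

print(summary)
```

Very small groups can still point to individuals, so how coarse the grouping needs to be depends on the data at hand.
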
Glossary

FIXME