What Is Personally Identifiable Information (PII)?

Personally identifiable information (PII) is information that, when used alone or with other relevant data, can identify an individual. PII may contain direct identifiers (e.g. passport information) that can identify a person uniquely, or quasi-identifiers (e.g. race) that can be combined with other quasi-identifiers (e.g. date of birth) to successfully recognize an individual.

BREAKING DOWN Personally Identifiable Information (PII)

New technology platforms have changed the way businesses operate, governments legislate and individuals relate to one another. With digital tools like cell phones, the internet, e-commerce, and social media, there has been an explosion in the supply of data of all kinds. Big data, as it is called, is being collected, analyzed, and processed by businesses and shared with other companies. The wealth of information provided by big data has enabled companies to gain insight into how to better interact with customers. However, the emergence of big data has also increased the number of data breaches and cyberattacks by entities that realize the value of this information. This has raised concerns over how companies handle the sensitive information of their consumers. Regulatory bodies are seeking new laws to protect the data of consumers, while users are looking for more anonymous ways to stay online.

Sensitive vs. Non-Sensitive PII

Personally identifiable information (PII) can be sensitive or non-sensitive. Sensitive personal information includes details such as full name, Social Security Number (SSN), driver’s license number, mailing address, credit card information, passport information and financial information. This list is by no means exhaustive. Companies that share data about their clients normally use anonymization techniques to encrypt and obfuscate the PII so that it is received in a non-personally identifiable form. For example, an insurance company that shares its clients’ information with a marketing company will mask the sensitive PII included in the data and leave only the information relevant to the marketing company’s goal.
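The masking step described above can be sketched in a few lines of Python. This is a minimal illustration, not a complete anonymization scheme: the field names, the mask_record function and the secret key are all hypothetical, and replacing the SSN with a keyed hash is pseudonymization, which only protects the data as long as the key stays secret.

```python
import hashlib
import hmac

# Hypothetical field names for illustration; a real schema will differ.
SENSITIVE_FIELDS = {
    "full_name", "ssn", "drivers_license",
    "mailing_address", "credit_card", "passport_number",
}


def mask_record(record, secret_key):
    """Return a copy of record without sensitive fields.

    If the record has an SSN, a keyed hash of it is added as a
    pseudonym so the recipient can still tell rows apart without
    ever seeing the SSN itself.
    """
    shared = {k: v for k, v in record.items() if k not in SENSITIVE_FIELDS}
    if "ssn" in record:
        digest = hmac.new(secret_key, record["ssn"].encode("utf-8"), hashlib.sha256)
        shared["pseudonym"] = digest.hexdigest()[:16]
    return shared


# Made-up client record, as an insurer might hold it.
client = {
    "full_name": "Jane Doe",
    "ssn": "000-00-0000",
    "mailing_address": "1 Example Street",
    "zip_code": "10001",
    "policy_type": "auto",
}

masked = mask_record(client, secret_key=b"keep-this-key-private")
print(sorted(masked))  # field names that would be shared
```

Note that the shared record still contains a zip code, which is a quasi-identifier, so masking the sensitive fields alone does not guarantee that the record cannot be linked back to a person.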

Non-sensitive or indirect PII is easily accessible from sources like phonebooks, the internet and corporate directories. Zip code, race, gender and date of birth are all quasi-identifiers and examples of non-sensitive information that can be released to the public. This type of information cannot be used alone to determine an individual’s identity. Non-sensitive information, although not delicate, is linkable. This means that non-sensitive data, when combined with other linkable personal information, can reveal the identity of an individual. De-anonymization and re-identification techniques tend to succeed when multiple sets of quasi-identifiers are pieced together and used to distinguish one person from another.
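A small Python sketch can make the linkability point concrete. It measures what fraction of records is singled out by a given combination of quasi-identifiers. The records are invented for illustration, and the function names group_sizes and unique_fraction are hypothetical, not taken from any standard library.

```python
from collections import Counter


def group_sizes(records, quasi_identifiers):
    """Count how many records share each combination of quasi-identifier values."""
    return Counter(tuple(r[q] for q in quasi_identifiers) for r in records)


def unique_fraction(records, quasi_identifiers):
    """Fraction of records whose quasi-identifier combination appears only once."""
    if not records:
        return 0.0
    sizes = group_sizes(records, quasi_identifiers)
    unique = sum(
        1 for r in records
        if sizes[tuple(r[q] for q in quasi_identifiers)] == 1
    )
    return unique / len(records)


# Invented sample data; no real individuals are represented.
records = [
    {"zip_code": "10001", "gender": "F", "birth_date": "1980-02-14"},
    {"zip_code": "10001", "gender": "F", "birth_date": "1991-07-30"},
    {"zip_code": "10001", "gender": "M", "birth_date": "1980-02-14"},
    {"zip_code": "10002", "gender": "M", "birth_date": "1975-11-02"},
]

print(unique_fraction(records, ["gender"]))                            # 0.0
print(unique_fraction(records, ["zip_code", "gender"]))                # 0.5
print(unique_fraction(records, ["zip_code", "gender", "birth_date"]))  # 1.0
```

In this toy data set, gender alone singles out no one, but adding zip code and date of birth makes every record unique, which is the condition under which re-identification becomes feasible.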

Safeguarding PII

Various countries have adopted data protection laws that set guidelines for companies that gather, store, and share the personal information of clients. The basic principles outlined by these laws include the following: some sensitive information should not be collected except in extreme situations; data should be deleted once it is no longer needed for its stated purpose; and personal information should not be shared with parties that cannot guarantee its protection.

Cybercriminals breach data systems to access PII, which is then sold to willing buyers in underground digital marketplaces. For example, in 2015 the IRS suffered a data breach leading to the theft of more than a hundred thousand taxpayers’ PII. Using quasi-identifying information stolen from multiple sources, the perpetrators were able to access an IRS website application by answering personal verification questions whose answers should have been known only to the taxpayers.

PII Around the World

The definition of what comprises PII differs depending on which part of the world you're in. In the United States, the government defined "personally identifiable" in 2007 as anything that can "be used to distinguish or trace an individual's identity," such as name, SSN or biometric information, either alone or with other identifiers such as date of birth or place of birth.

In the EU, the definition expands to include quasi-identifiers. These data sets are subject to the General Data Protection Regulation (GDPR), which came into effect in May 2018.