A guide to data classification: confidential data vs. sensitive data vs. public information

Learn why it's important to classify your data, understand four standard data classifications, and see how automation can make it easier to keep your company's data safe and compliant.

Written by

Paula Smith

Reviewed by

Adam Roberts
July 12, 2022

In the modern world, regulators expect businesses to know where their customer data is and to take action to protect it. Consumers and citizens, too, demand businesses act as stewards of the sensitive data to which they are given access.

Meeting these obligations is harder than ever when business data is spread over a growing number of structured and unstructured data sources, not to mention the constant threat of data breaches.

To meet these challenges, businesses need to first understand the data they have, and to classify it according to business need and sensitivity.

Data classification is the process of grouping information according to needs. By classifying data, we can search more effectively, manage high-value content and protect it from unauthorized disclosure. In this post, learn why it’s important to classify your data, understand four standard data classifications, and learn how automation can make it easier to keep your company’s data safe and compliant.

What is the difference between data classification and data categorization?

Although the words are often used interchangeably, data classification and data categorization are two distinct processes related to the organization of data. Data classification groups information according to needs, along dimensions such as sensitivity. Data categorization makes data easier to use by assigning labels or identifiers to each category so that it can be distinguished from other types of data.

Both processes are relevant to an organization's governance efforts, but this post will focus on data classification and its role in your organization’s security and compliance efforts. 

Why classify your data?

Classifying your data is essential for several reasons. First, it helps you determine what kind of information you are storing, the value that information has to your organization, its criticality to your business processes, and how it could be misused if lost or stolen. Second, once you know what type of data you are dealing with, you can make more informed decisions about where to store it and which controls each classification requires. Finally, it makes life easier when dealing with compliance regimes such as financial regulations and audit requirements. Information like credit card numbers and personal information (especially sensitive information) must be handled differently, with additional controls applied.

Data classification definitions

When it comes to classifying data, every organization is unique, with a particular taxonomy or ontology related to its organizational purpose, structure, and industry. These definitions are also often referred to as a file plan. However, while each organization is different, there are some commonalities.

Particular industries will have granular standards and definitions for their data types. For example, in government and highly regulated industries (finance, banking, healthcare) there are often five levels: Top Secret, Secret, Confidential, Sensitive, and Unclassified.

In addition, almost any collection of data can contain items classified as sensitive. Credit card numbers, bank account numbers, and driver’s license numbers are all examples of sensitive data. So are birth dates and email addresses. We'll talk more about sensitive data below.

Let's look at some common high-level ways to classify data:

  • Public data poses little-to-no risk if disclosed, as anyone can easily access it. For example, school directories, the White Pages, or your business’s consumer prices would be classified as public information. This data is not considered sensitive.
  • Internal data isn’t intended for public release, though it may be accessed under the Freedom of Information Act (FOIA) or similar legislative regimes. Internal data should be assessed to gauge the potential harm of disclosure, though this is likely to be minimal. One example of internal data is your business’s organizational chart.
  • Confidential data must remain private and protected accordingly. Leaking of this kind of data (which could include Social Security numbers, medical records, bank account numbers or employment contracts) could cause serious financial, legal, or regulatory consequences.
  • Restricted data could have serious financial, legal, or regulatory consequences for your business if revealed. This classification requires additional controls and is likely subject to additional security standards. Protecting data like law enforcement records and data relating to mergers and acquisitions should be taken seriously.
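
As a rough illustration, the four levels above can be modeled as an ordered set of labels so that software can compare them and look up handling rules. This is a minimal sketch in Python, not a standard: the names `Classification`, `HANDLING`, `strictest`, and `controls_for`, and the specific controls listed, are assumptions made for the example.

```python
from enum import IntEnum


class Classification(IntEnum):
    """The four levels described above, ordered from least to most sensitive."""
    PUBLIC = 1
    INTERNAL = 2
    CONFIDENTIAL = 3
    RESTRICTED = 4


# Hypothetical handling rules; a real policy would come from your own
# security and compliance requirements.
HANDLING = {
    Classification.PUBLIC: {"encrypt_at_rest": False, "access": "anyone"},
    Classification.INTERNAL: {"encrypt_at_rest": False, "access": "employees"},
    Classification.CONFIDENTIAL: {"encrypt_at_rest": True, "access": "need-to-know"},
    Classification.RESTRICTED: {"encrypt_at_rest": True, "access": "named individuals"},
}


def strictest(labels):
    """Return the most sensitive label in a collection of labels."""
    return max(labels)


def controls_for(label):
    """Look up the illustrative handling rules for a label."""
    return HANDLING[label]
```

For instance, a folder holding both an organizational chart (internal) and an employment contract (confidential) would take the stricter label: `strictest([Classification.INTERNAL, Classification.CONFIDENTIAL])` returns `Classification.CONFIDENTIAL`.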

What is sensitive data?

Sensitive data is usually described as any private information you must protect from loss, or information that, if released, could cause damage to your organization’s reputation or operations. This includes physical and digital formats like documents, photographs, videos, or audio. Most businesses hold sensitive data on their networks and must comply with the laws that govern it.

For this purpose, we can broadly define sensitive information as anything that can cause harm, embarrassment, inconvenience, or unfairness to an individual or business if it is exposed or gets into the wrong hands.

Examples of sensitive data

Sensitive information comes in many forms, but the most common are personally identifiable information (PII) and non-public personal information. We have another post on the site offering a rundown of how different privacy regulations define and protect Personal Data, Personal Information, Personally Identifiable Information and Sensitive information. PII can be used to identify an individual and is protected under both state and federal law. In many jurisdictions, its handling is also carefully monitored and non-compliance is punished. Examples include names, addresses, Social Security numbers, driver’s license numbers, and credit card numbers. Medical information, bank account numbers, and passport numbers are also considered sensitive data.

Protected health information (PHI) is considered sensitive data because it can be used to identify individuals and their medical conditions. The unauthorized release of PHI can harm the individuals involved, for example by causing them to lose their insurance coverage or suffer discrimination in employment or other areas. Therefore, organizations that handle PHI are required to take steps to protect the information from unauthorized access or use.

It is crucial to protect sensitive data from unauthorized access or use. If your organization is subject to compliance regimes, you will need a security plan and appropriate controls to ensure compliance and protect the information your customers or employees have entrusted you with. You can do this by implementing security measures such as firewalls, password protection, encryption, and regular penetration testing, by ensuring your cloud applications have the appropriate certifications (e.g. SOC 2), and, importantly, by training your employees to handle sensitive data securely. You should also back up your data regularly to ensure its safety in case of a hack or other cyberthreat.
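
Some of the examples in this section, such as Social Security numbers and credit card numbers, follow recognizable formats, which makes a first pass at spotting them fairly mechanical. The Python sketch below is one hedged illustration: the patterns and the names `SSN_PATTERN`, `CARD_PATTERN`, `luhn_valid`, and `find_sensitive` are assumptions for the example, cover only US-style SSNs written with hyphens and 13 to 19 digit card numbers, and will miss or mislabel data in many real documents.

```python
import re

# Illustrative patterns only: US Social Security numbers written as
# 123-45-6789, and 13-19 digit card numbers with optional spaces or hyphens.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
CARD_PATTERN = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def luhn_valid(number):
    """Check a digit string against the Luhn checksum used by payment cards."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    total = 0
    # Walk from the rightmost digit, doubling every second one.
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


def find_sensitive(text):
    """Return a sorted list of the sensitive data types detected in text."""
    found = set()
    if SSN_PATTERN.search(text):
        found.add("ssn")
    for match in CARD_PATTERN.finditer(text):
        # The checksum filters out digit runs that merely look like card numbers.
        if luhn_valid(match.group()):
            found.add("credit_card")
    return sorted(found)
```

The Luhn check is there to cut false positives such as order numbers or phone numbers that happen to be the right length. Pattern matching of this kind is a starting point only; names, addresses, and medical details need other techniques.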

What is confidential information?

Confidential information is data that is not accessible to everyone. It’s often private, but not always. Confidential information can be anything from your credit card number to patent applications.

Your business might define confidential information as any information, knowledge, or data related to the operation of your business that is not in the public domain or otherwise publicly available. You decide what is designated confidential. Here are some examples:

  • Proprietary Information
  • Financial Information
  • Medical Information
  • Trade Secrets
  • Personal Information

An overview of data privacy laws

Data privacy laws are the rules and regulations that determine how a company may use, store, and share personal information. These laws are designed to protect against identity theft, fraud, and other crimes, ensure confidentiality among and between teams, and keep privacy worries at bay. They also shape how businesses interact with customers in general by dictating how businesses may collect information about them and how that collected data can be used.

Gartner predicts that by 2024, 75% of the global population will have its personal data covered under privacy regulations. Let's look at two prominent examples to see how such regulation may function.

The General Data Protection Regulation (GDPR)

The GDPR is a European Union data protection law that gives individuals in the EU more control over their personal data. It applies to any company that offers goods or services to individuals in the EU, or monitors their behavior, regardless of where the company is located. This means that if you have employees based in the EU, or you provide a product or service to people in the EU, the GDPR likely applies to your business.

Businesses subject to the GDPR must have a lawful basis, such as the individual's consent, before collecting, using, or sharing personal data. They must also provide individuals with clear and concise information about their rights under the GDPR, and ensure that individuals can easily exercise those rights.

HIPAA protects health information in the United States

The Health Insurance Portability and Accountability Act of 1996 (HIPAA), which applies to organizations in the United States, defines protected health information (PHI) as individually identifiable health information. PHI includes demographic information about the individual, such as name, age, and sex; information about the individual’s health history, including mental health conditions; and the results of any tests or examinations performed on the individual.

Auto-classification means more actionable insights

Organizations of all sizes are adopting more data sources and therefore collecting more data than ever before. This influx can make it harder for your business to understand its data well enough to make the right business decisions and maintain compliance. RecordPoint can help.

RecordPoint offers automated data classification that enables you to classify personally identifiable information (PII) and payment card industry (PCI) information consistently and at scale, with greater speed and accuracy.
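
To give a feel for what applying labels consistently and at scale means, here is a deliberately simple rule-based sketch in Python. It is not how RecordPoint's product works: the keyword lists, the `RULES` table, the `classify` and `summarize` functions, and the "unlabeled" fallback are all assumptions invented for this example, and real auto-classification tools typically rely on far richer signals than keywords.

```python
from collections import Counter

# Hypothetical keyword rules, checked from most to least sensitive.
RULES = [
    ("restricted", ("merger", "acquisition", "law enforcement")),
    ("confidential", ("social security", "medical record", "bank account")),
    ("internal", ("org chart", "internal memo")),
]


def classify(text):
    """Return the first (most sensitive) label whose keywords appear in text."""
    lowered = text.lower()
    for label, keywords in RULES:
        if any(keyword in lowered for keyword in keywords):
            return label
    # Nothing matched: leave the document for a person to review rather
    # than assuming it is safe to publish.
    return "unlabeled"


def summarize(documents):
    """Count how many documents fall under each label."""
    return Counter(classify(doc) for doc in documents)
```

Because the same rules run over every document, two similar files always receive the same label, which is the main advantage over manual tagging. Falling back to "unlabeled" instead of "public" is a cautious design choice, so that unknown content is never treated as safe to share by default.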
