Why data minimization matters

Holding on to data you're not required to keep can have serious implications for your organization. Learn how data minimization can help you reduce data risk.

Written by Paula Smith
September 5, 2022

In a world of cheap, effectively infinite cloud storage, it can feel like there’s simply no need for the delete key. After all, we can keep everything forever, pay next to nothing per item, and never worry about it again, right? Well, not exactly.  

Data has costs that go beyond the infrastructure bill, and retaining it forever can affect your organization in unexpected ways. The cost/benefit calculation isn’t as straightforward as you might think.

In this article, we’ll outline the hidden (or not so hidden) costs of retaining data. We’ll also explore how a focus on data minimization can both reduce these costs and enable your team to get more out of the data you do keep.

But first, let’s take a good look at ROT: the data we don’t want to hold on to.

What is ROT?

ROT stands for Redundant, Obsolete, or Trivial: three classifications for data that organizations don’t need to keep but continue to retain. This data accumulates naturally over time in normal business operations. Employees create ROT when they save multiple copies of the same information (Redundant data), retain out-of-date information (Obsolete data), or save irrelevant or personal information to their work devices or drives (Trivial data).

A solid data minimization process can help your business mitigate the risks that come with holding on to ROT data.  
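
To make the three ROT categories concrete, here is a minimal Python sketch of how files might be labeled. Everything in it is an assumption for illustration: the `FileRecord` type, the seven-year `OBSOLETE_AFTER` cutoff, and the `TRIVIAL_EXTENSIONS` list are hypothetical, and real rules would come from your own retention policy. It treats the oldest copy of identical content as the original and flags later copies as redundant.

```python
import hashlib
import os
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class FileRecord:
    path: str
    content: bytes
    last_modified: date


# Hypothetical rules; replace with values from your retention policy.
OBSOLETE_AFTER = timedelta(days=7 * 365)
TRIVIAL_EXTENSIONS = {".mp3", ".mp4", ".tmp"}


def classify_rot(files, today):
    """Return {path: label}, where label is redundant, trivial, obsolete, or keep."""
    seen_hashes = set()
    labels = {}
    # Process oldest first so the earliest copy counts as the original.
    for f in sorted(files, key=lambda rec: rec.last_modified):
        digest = hashlib.sha256(f.content).hexdigest()
        extension = os.path.splitext(f.path)[1].lower()
        if digest in seen_hashes:
            labels[f.path] = "redundant"   # identical content already seen
        elif extension in TRIVIAL_EXTENSIONS:
            labels[f.path] = "trivial"     # file type unrelated to the business
        elif today - f.last_modified > OBSOLETE_AFTER:
            labels[f.path] = "obsolete"    # older than the retention cutoff
        else:
            labels[f.path] = "keep"
        seen_hashes.add(digest)
    return labels
```

In practice, file extension and age are crude proxies. Deciding whether something is truly obsolete or trivial usually also depends on the purpose the data was collected for.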

Calculating risk

The argument for ‘keeping everything’ usually rests on the idea that ‘we might need that one day’. The reality, however, is that keeping information too long exposes your business to several types of risk:

  • Regulatory risks – Most jurisdictions with privacy regulations have a provision that personal information is not to be kept any longer than is necessary for the purpose for which it was obtained. Keeping personal information on your systems ‘just in case’ is not a defense here and can lead to significant financial penalties and reputational damage.
  • Security risks – Keeping outdated file formats can open the organization to additional data security risks. Because those formats, and the applications that open them, are often no longer maintained or patched, they may enable malicious actors to orchestrate a data breach, or open the organization up to more general system failures.
  • Process risks – The more content you have, the more inefficient your business becomes and the greater the risk that you’re basing decisions on outdated or incorrect information. This can cause legal, financial, and reputational damage.

Calculating cost

We often think of cost as a relatively simple $X per TB ratio. However, when calculating the true cost of storage, you also need to look at other dimensions, including:

  • Replication costs – Some organizations have clear requirements to replicate data across regions, typically to improve availability or disaster recovery processes. If you’re replicating critical datasets to a second geographical region, you’re effectively doubling the amount of data stored. You should expect to incur additional storage costs as a result, possibly up to 100% more.
  • Transfer or egress costs – If you transfer data out of your storage provider, either as part of a business process or as part of a larger data migration, you can incur additional costs.
  • Management costs – These could be charged by your provider or could come as indirect costs that your team incurs as they carry out data management tasks. They cover a broad range of items, including transferring data across tiers of storage (for better cost effectiveness), cloud monitoring to ensure data integrity, and security activities (encryption, penetration testing, security architecture changes, etc.).

While you may be paying around $0.023 per GB of data per month (roughly $23 per TB), that number may pale in comparison to the direct and indirect costs above. Even worse, this investment may be going toward the management of content that you don’t even need (or want) to keep and which, if retained, could cause your organization harm.
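
As a rough illustration of how these dimensions add up, the sketch below estimates a monthly bill from a headline per-TB price plus replication, egress, and management overhead. The function name, its parameters, and every figure in the example are hypothetical. Actual pricing varies by provider, region, and storage tier, and management cost is modeled here simply as a fraction of the storage charge.

```python
def monthly_storage_cost(stored_tb, price_per_tb, copies=1,
                         egress_tb=0.0, egress_price_per_tb=0.0,
                         management_overhead=0.0):
    """Estimate monthly cost; all inputs are assumptions supplied by the caller.

    copies: total number of copies kept (2 = one extra replicated region).
    management_overhead: management cost as a fraction of the storage charge.
    """
    storage = stored_tb * price_per_tb * copies
    egress = egress_tb * egress_price_per_tb
    management = storage * management_overhead
    return {
        "storage": storage,
        "egress": egress,
        "management": management,
        "total": storage + egress + management,
    }


# Illustrative numbers only: 10 TB at $23/TB, replicated to one extra region,
# 0.5 TB of egress at $90/TB, and management overhead of 25% of storage.
headline = monthly_storage_cost(stored_tb=10, price_per_tb=23.0)
realistic = monthly_storage_cost(stored_tb=10, price_per_tb=23.0, copies=2,
                                 egress_tb=0.5, egress_price_per_tb=90.0,
                                 management_overhead=0.25)
```

With these made-up inputs, the headline figure is $230 per month, while the fuller estimate comes to $620. If a share of that data is ROT, the same share of the bill is spent on content nobody needs.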

Overcoming the fear of hitting delete

So, what can you do?

At its heart, the solution is data minimization: developing a better understanding of your data so you can separate the wheat from the chaff.

Only once you have that understanding can you appropriately minimize the amount of data you keep on an ongoing basis. You need to remove the ROT.

A guide to removing ROT

Now that you understand the issues ROT can cause, you can start the process of removing it.  

Here’s a high-level guide to removing ROT so your organization can begin achieving its business objectives.  

  1. Understand your data. What do you have across your data corpus? What formats and date ranges exist? Can you determine the purpose or process it was collected for?
  2. Understand its value. Not all “old” data is without value, but you may have a significant quantity of data that has little or no value. What does value mean to your organization, and how do you measure it?
  3. Understand its risk. Different data sets have different levels of risk; personal information, for example, may carry a higher level of risk, depending on your jurisdiction. It’s important to map out the risk level per data set and determine the acceptable level of risk in your unique context.
  4. Profile and action. With the information assessed against a risk/value matrix, you’ll be able to better profile the information and then action it accordingly. Get rid of the information that you don’t need and that could cause undue risk to the organization.

Once you have the profile completed, you’ll have a systematic set of rules against which you can analyze all your data shares and start to clean up your data corpus, disposing of data that isn’t required. This will save you storage space and budget, while making your team members’ lives easier. They won’t have to wade through pages of irrelevant and outdated search results, saving them valuable time and effort. It’s a rare, true win/win.  
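
One way to picture such a set of rules is a small risk/value matrix expressed in code. The sketch below is only an illustration: the `DataSet` type, the 1 to 5 scoring scale, the `THRESHOLD` of 3, and the four action labels are all assumptions rather than a prescribed method, and your own matrix may have more levels or different actions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataSet:
    name: str
    value: int  # 1 (negligible) to 5 (critical), as scored by the business
    risk: int   # 1 (minimal) to 5 (severe), e.g. higher for personal information


THRESHOLD = 3  # hypothetical cut-off between "low" and "high"


def recommend_action(ds):
    """Map a data set onto one quadrant of a simple 2x2 risk/value matrix."""
    high_value = ds.value >= THRESHOLD
    high_risk = ds.risk >= THRESHOLD
    if high_value and high_risk:
        return "retain and protect"
    if high_value:
        return "retain"
    if high_risk:
        return "dispose first"  # low value, high risk: the clearest ROT
    return "dispose"


# Example data sets with made-up scores.
shares = [
    DataSet("active customer contracts", value=5, risk=4),
    DataSet("product documentation", value=4, risk=1),
    DataSet("old applicant CVs", value=1, risk=5),
    DataSet("duplicate meeting notes", value=1, risk=1),
]
plan = {ds.name: recommend_action(ds) for ds in shares}
```

Keeping the thresholds and actions in one place makes the rules easy to review with legal and records teams before anything is deleted.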

Remember, as we automate processes it’s vital that the most accurate, up-to-date information is used within those processes. Having too much ROT can seriously hamper this goal and cause harm if decisions are made based on incorrect information.  

Reducing the amount of ROT you hold is central to data minimization, because it’s the most direct way to reduce the risks that ROT creates.

Introducing File Analysis

If you’re worried about the impact of ROT on your organization, but you lack the time and resources to conduct the review manually, RecordPoint could be the solution.  

As part of our commitment to high-quality information governance outcomes, we developed File Analysis. File Analysis is a simple and efficient way to discover what's in your file share data, identify high-value (and low-value) data, and give you the intelligence to understand what you can do with it next. Empowered by your customized File Analysis report, your organization can make cost-saving, risk-reducing, and quality migration decisions about that data.

File Analysis ensures only the right information is retained, with ROT data clearly identified for easy deletion. 
