ML 101: How machine learning powers RecordPoint’s Classification Intelligence

Learn more about machine learning and AI, and how this technology powers modern records management solutions like RecordPoint’s Classification Intelligence.

September 12, 2022

Artificial Intelligence (AI) and Machine Learning (ML) are at the core of a modern records management solution like RecordPoint, allowing records managers and others to overcome the challenges that come from increased data volumes and organizations adopting a rapidly increasing number of data sources.

But these terms are increasingly deployed as buzzwords and marketing hype, handwavy ways to explain or promote anything from a simple statistical algorithm to a hugely impressive deep learning model.

In this blog post, we’ll learn a bit more about the history of ML and AI, the technologies behind the RecordPoint solution, and how they power features such as Records365’s Classification Intelligence. Let’s start by defining these terms and looking at their origins.

How is Machine Learning different to Artificial Intelligence?

Although the two terms are often used interchangeably, AI and ML mean quite different things. Artificial Intelligence is the overarching field, founded in the 1950s, focused on making computers behave more like people.

Machine Learning is a sub-discipline of AI grounded in mathematics and statistics. ML involves training a statistical model or a neural network to classify or predict values based on examples of earlier data.

Like any technology field, AI/ML has gone through several cycles of boom and bust, corresponding with influxes and withdrawals of funding and innovation.

The history of AI & ML

In the late 1950s, there was a split in the AI discipline. Some researchers argued that AI needed to be about probability and mathematical models, while others thought it should be based on symbolic reasoning, where a user or researcher must set explicit rules for the system, which then performs an action in response (such as classifying a record).

The rise of “Expert Systems”

In the early 1980s, due to the computing limitations of the time, the symbolist, expert system approach came to dominate the AI field. But this approach has limitations. Systems built this way can be unforgiving and rigid: if your data does not exactly match the rules, the system will not be able to help. In addition, you need an expert on hand who can build and manage the rules. Changes to your information architecture are also challenging, as the rule-based models need to be rebuilt to match.
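
To make that rigidity concrete, here is a minimal Python sketch of a rule-based classifier. The rules, category names and function name are hypothetical, invented purely for illustration, and do not come from any real product.

```python
from typing import Optional

# Hypothetical keyword rules, written by hand by an "expert".
RULES = {
    "invoice": "Finance",
    "employment contract": "Human Resources",
}


def classify_by_rules(title: str) -> Optional[str]:
    """Return the category of the first rule whose keyword appears in the title."""
    lowered = title.lower()
    for keyword, category in RULES.items():
        if keyword in lowered:
            return category
    return None  # no rule matched, so the system cannot help


print(classify_by_rules("Invoice 2291 - Acme Pty Ltd"))  # Finance
print(classify_by_rules("Bill for services, March"))     # None
```

The second record is clearly a financial document, but because it never uses the word "invoice", no rule fires. Someone has to notice the gap and write another rule.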

The AI winter of 1987-1993

As a result, there was a prolonged period in the 1980s and ’90s when the promise of AI was not matched by the reality. Computer scientists were making bold promises about what AI and machine learning would be able to deliver, but there was neither the computing power nor the data needed to train large models.

Over time, technological advances made ML more feasible. In a virtuous cycle, technical successes in the field inevitably translated to more funding and more advances. These advances coincided with the cloud computing revolution, leading us to where we are today with ML occupying a central place in the technology landscape, and in SaaS solutions like RecordPoint.

The present day: AI & ML harmony

In the current landscape, AI and ML are complementary technologies. There’s a new emphasis on AI enrichment, or the integration of ML with symbolic AI approaches, to find new qualities that can then be used as the basis for rules. In the context of records, this means finding new metadata based on the record content itself.

There has been a renewed emphasis on making ML more transparent. Solutions will not reveal exactly how the algorithm works but will provide metrics that explain the performance of the algorithm, an estimate of the quality of its output, and details on the raw data used to feed it.
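
As an illustration of the kind of metric a transparent solution might report, the sketch below computes precision and recall for a single category. These two metrics are our choice of example, not a statement of what any particular product exposes, and the sample labels are made up.

```python
def precision_recall(actual, predicted, category):
    """Compute precision and recall for one category from paired label lists."""
    pairs = list(zip(actual, predicted))
    true_pos = sum(1 for a, p in pairs if a == category and p == category)
    false_pos = sum(1 for a, p in pairs if a != category and p == category)
    false_neg = sum(1 for a, p in pairs if a == category and p != category)
    # Guard against division by zero when a category never appears.
    precision = true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0
    recall = true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0
    return precision, recall


# Made-up labels: what a reviewer decided vs. what the model suggested.
actual = ["Finance", "Finance", "HR", "HR", "Finance"]
predicted = ["Finance", "HR", "HR", "Finance", "Finance"]

precision, recall = precision_recall(actual, predicted, "Finance")
print(f"precision={precision:.2f} recall={recall:.2f}")  # precision=0.67 recall=0.67
```

Precision answers "when the model said Finance, how often was it right?", while recall answers "of the real Finance records, how many did the model find?".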

An example of transparent, powerful machine learning can be seen in RecordPoint’s platform. Keeping in mind the concepts we have discussed, let’s look at how we incorporate them into a highly intuitive system for records classification called Classification Intelligence.

What is Classification Intelligence?

Classification Intelligence (CI) is the RecordPoint ML engine, providing automated records classification with a high level of accuracy. CI is built to help customers classify and govern information at scale, considering the content and context of the information relating to the record. Rather than relying on manual effort to classify content and manage compliance, CI allows records managers and end-users to work as productively and collaboratively as possible.

How does it work?

Content is ingested into RecordPoint from a variety of data sources, thanks to our Connector suite. The platform then processes the content in three stages.

The first stage of processing reviews the metadata, a static component of the content that is usually set at creation and rarely updated. If the metadata is not sufficient, which is often true for data sources like Microsoft Exchange or Microsoft Teams, or the rules have not been configured in a way that allows metadata to be used, the platform hands the record over to the ML model.
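
The "metadata first, ML as fallback" flow can be sketched in a few lines of Python. This is a simplified illustration under our own assumptions, not RecordPoint’s implementation: the metadata rules, field names, record layout and the stub model are all hypothetical.

```python
from typing import Callable, Optional, Tuple

# Hypothetical metadata rules: (field, expected value) -> category.
METADATA_RULES = {
    ("content_type", "Invoice"): "Finance",
    ("department", "HR"): "Human Resources",
}


def classify_from_metadata(metadata: dict) -> Optional[str]:
    """Return a category if any metadata rule matches, otherwise None."""
    for (field, value), category in METADATA_RULES.items():
        if metadata.get(field) == value:
            return category
    return None


def classify(record: dict, ml_model: Callable[[str], str]) -> Tuple[str, str]:
    """Try metadata rules first; fall back to the ML model on the text content."""
    category = classify_from_metadata(record.get("metadata", {}))
    if category is not None:
        return category, "metadata"
    return ml_model(record.get("text", "")), "ml"


def stub_model(text: str) -> str:
    """Stand-in for a trained model; it always predicts the same category."""
    return "Correspondence"


invoice = {"metadata": {"content_type": "Invoice"}, "text": "Amount due: $400"}
chat = {"metadata": {}, "text": "Can we move the meeting to Friday?"}

print(classify(invoice, stub_model))  # ('Finance', 'metadata')
print(classify(chat, stub_model))     # ('Correspondence', 'ml')
```

The chat message carries no useful metadata, so only the model, which looks at the text itself, can suggest a category for it.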

This is where CI is used. At a high level, the process is as follows.

The model will first index the document’s text content.
Following this appraisal, the model will provide a prediction about the type of record the content corresponds to. This prediction maps directly to your Business Classification Schema (BCS)/File Plan.
At this point, the records have been appraised and RecordPoint has generated a prediction.
The last step comes when the records manager reviews the policy suggestion and either accepts or corrects it.
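
The steps above can be sketched with a toy example. The classifier below simply counts word overlap and is far simpler than a production model. The class, the function and the idea that every reviewed decision is fed back as a training example are our assumptions for illustration, not a description of CI’s internals.

```python
from collections import Counter, defaultdict


class TinyTextClassifier:
    """Toy bag-of-words classifier: scores each category by word overlap."""

    def __init__(self):
        # category -> how often each word was seen in that category
        self.word_counts = defaultdict(Counter)

    def train(self, text, category):
        """Index the text by adding its words to the category's counts."""
        self.word_counts[category].update(text.lower().split())

    def predict(self, text):
        """Return the category whose known words best match the text."""
        words = text.lower().split()
        scores = {
            category: sum(counts[word] for word in words)
            for category, counts in self.word_counts.items()
        }
        return max(scores, key=scores.get)


def review(model, text, reviewer_category):
    """Get a suggestion, compare it with the reviewer's decision, and learn from it."""
    suggestion = model.predict(text)
    accepted = suggestion == reviewer_category
    # Whether accepted or corrected, the reviewer's label becomes a new example.
    model.train(text, reviewer_category)
    return suggestion, accepted


model = TinyTextClassifier()
model.train("invoice payment due amount", "Finance")
model.train("leave request annual employee", "Human Resources")

print(review(model, "payment reminder for invoice", "Finance"))
# ('Finance', True): the suggestion was accepted
print(review(model, "employee payment query", "Human Resources"))
# ('Finance', False): the reviewer corrected it
print(model.predict("employee payment query"))
# Human Resources: the correction changed the next prediction
```

Each correction gives the model another labelled example, which is why the number of corrections needed tends to fall over time.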

How accurate can an ML model be?

Imagine you have hired a new person in your records team and they are not particularly familiar with your file plan just yet. You have asked them to classify a physical record, they have made a prediction and have now asked, “Is this correct?”

At the start, there are going to be a few corrections required, but as they continue to do it, you are going to be able to trust them to correctly identify and classify records. It is the same with CI, which has the option to “auto-apply” classifications once you trust the quality of the model.
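
One way to picture the decision to switch on auto-apply is as a threshold on how often reviewers have accepted the model’s suggestions. The sketch below is hypothetical: the function names and the thresholds (at least 50 reviews, 95% accepted) are illustrative values we picked, not product settings.

```python
def acceptance_rate(reviews):
    """reviews: list of booleans, True where the reviewer accepted the suggestion."""
    return sum(reviews) / len(reviews) if reviews else 0.0


def should_auto_apply(reviews, min_reviews=50, min_rate=0.95):
    """Trust the model only after enough reviews and a high enough acceptance rate."""
    return len(reviews) >= min_reviews and acceptance_rate(reviews) >= min_rate


early_days = [True] * 8 + [False] * 2       # too few reviews to judge
well_trained = [True] * 97 + [False] * 3    # 97% accepted over 100 reviews

print(should_auto_apply(early_days))    # False
print(should_auto_apply(well_trained))  # True
```

Requiring a minimum number of reviews stops a lucky early streak from being mistaken for a trustworthy model.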

The difference is that CI is accurate, consistent and works at scale. Even the most well-trained, engaged users will miss things occasionally. We all have bad days, and classifying records isn’t their only task.

With AI/ML, the risk that you will miss records decreases. Not only will you have highly accurate results, you’ll get them for everything (thanks to our manage in-place model and Connectors), not just the records your employees decide are important.

Once the model is trained, you can trust it to classify data accurately, consistently, and at scale. This improves your organization’s compliance and improves employee efficiency.

Surely you need to be technical to use it, right?

This sounds like a technical process, but we’ve worked hard on making it simple for the user. You just need to be the Subject Matter Expert and understand the Information Architecture (i.e. what goes into each category, and the retention and disposal rules). Then you just need to train the model as described above. Once you have trained the machine, the ROI is instant.