RecordPoint Operations Records Management Cloud Service

Inside Records365 – Operating A Record Management Cloud Service

At RecordPoint we take the availability and security of our cloud offering very seriously. This article explains the investments and key operational elements our teams use to keep your data secure, keep your service available, roll out updates, and make changes to the production environments hosting our cloud service.

RecordPoint Keeps Your Data Secure

One of the challenges of running a cloud-based software-as-a-service offering is keeping data secure and operating pre-production and production environments in a secure manner.

Records365 takes a multi-layered, defense-in-depth approach to security, leveraging some key features of Microsoft Azure, our underlying infrastructure-as-a-service platform.

Here is a breakdown of what we deploy to keep things running securely.

  • Endpoint Protection: This is one of the essential pieces of software deployed onto every virtual machine that we run today. Modern endpoint protection offerings differ from traditional anti-virus offerings through their tight integration with the underlying platform and their ability to quickly provide relevant notifications and alerts in a unified way. The main benefit of this is that we can view security alerts through a single pane of glass, and resolve them in place, without having to scramble through many logs, in many places.
  • Security Portal: The Records365 Security Portal is the single pane of glass mentioned above. It provides a central view of the health of our cloud estate across applications, networking, and infrastructure. Some of the key elements that we leverage and monitor are:
    • Intelligent threat detection and prevention: Azure actively monitors, aggregates, and analyzes system logs from our infrastructure. State-of-the-art machine learning and behavioral analytics are used to identify potential attacks or exploits. Similarly, Azure continuously monitors network traffic from external and internal sources and analyzes it to identify suspicious activity.
    • Recommendations: Azure rolls all the information it gathers from our application, infrastructure, and networks into actionable recommendations, which we review on a regular basis to ensure that we are operating our cloud service as securely as possible.
  • Network Security: Network security groups provide us with the foundational capability to control network traffic that flows in and out of our production environments. Importantly, our Network Security platform provides DDoS and flooding protection for our cloud infrastructure.
  • Strong TLS Encryption: Sensitive and confidential information is encrypted in transit via TLSv1.2. We use TLS by default, so all connections run over a strongly encrypted, secure channel.
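
As an illustration of the TLSv1.2 requirement from the client side, here is a minimal Python sketch of a connection context that refuses any protocol version older than TLS 1.2. The helper name make_tls12_context is hypothetical, and this shows how a client could enforce the floor, not how Records365 itself is configured.

```python
import ssl


def make_tls12_context() -> ssl.SSLContext:
    """Build a client-side context that rejects anything older than TLS 1.2."""
    # create_default_context() enables certificate and hostname verification.
    context = ssl.create_default_context()
    # Refuse to negotiate SSLv3, TLS 1.0, or TLS 1.1.
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


context = make_tls12_context()
```

A context built this way can be passed to standard library clients such as http.client.HTTPSConnection through their context argument.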

Securely Operating Records365

RecordPoint has also developed a comprehensive set of procedures and policies to ensure that production environments undergo change in a controlled and secure manner. Firewall rule audits are performed regularly at both the virtual machine level and the Network Security Group level. Any deviations from the documented configuration standard are reported and remediated in a timely manner. This keeps the attack surface for Records365 as small as possible.
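
Conceptually, a firewall rule audit is a comparison between the documented standard and what is actually deployed. The sketch below shows that comparison in Python. The FirewallRule fields, the sample rules, and the audit_rules function are all hypothetical, chosen only for illustration. They are not RecordPoint's tooling or the Azure API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FirewallRule:
    # Illustrative fields only; real rules carry more attributes.
    name: str
    direction: str  # "Inbound" or "Outbound"
    port: int
    source: str
    action: str  # "Allow" or "Deny"


def audit_rules(documented, deployed):
    """Return the rules that deviate from the documented standard."""
    documented_set, deployed_set = set(documented), set(deployed)
    return {
        # Deployed but never documented: candidates for removal.
        "unexpected": sorted(deployed_set - documented_set, key=lambda r: r.name),
        # Documented but not deployed: configuration drift.
        "missing": sorted(documented_set - deployed_set, key=lambda r: r.name),
    }


https_rule = FirewallRule("allow-https", "Inbound", 443, "Internet", "Allow")
rdp_rule = FirewallRule("allow-rdp", "Inbound", 3389, "Internet", "Allow")

report = audit_rules(documented=[https_rule], deployed=[https_rule, rdp_rule])
```

In this example the undocumented allow-rdp rule is reported under "unexpected", which is the kind of deviation an audit would flag for remediation.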

In addition to firewall rule audits, we perform regular access audits to ensure that only personnel with a legitimate need for production access have access. Finally, we perform regular patching of our cloud-based infrastructure to ensure that the latest security, application, and operating system patches are applied.

Monitoring the Records365 Cloud Service

Monitoring is a key element of operating a cloud service. As such, we have deployed several monitoring tools to ensure that our infrastructure is running within desired operational boundaries. Our infrastructure monitoring tool alerts us to utilization or usage issues that may crop up on our platform from time to time.

Alerts are typically routed to our digital operations management platform, which ensures that our Site Reliability Engineering (SRE) team is notified in a timely manner when an incident occurs.

Our SRE team is distributed across the globe so that Records365 keeps operating at desired service levels around the clock. Critical issues and maintenance tasks are handed off in a “follow-the-sun” model. This helps ensure that issues and maintenance activities are completed in a timely manner, preventing possible service outages.
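
A follow-the-sun handoff can be pictured as a lookup from the current UTC hour to the region whose team is on shift. The sketch below is illustrative only: the three region names and the eight-hour shift boundaries are assumptions, not RecordPoint's actual rota.

```python
from datetime import datetime, timezone

# Hypothetical shifts as (start hour, end hour, region), in UTC.
SHIFTS = [
    (0, 8, "APAC"),
    (8, 16, "EMEA"),
    (16, 24, "Americas"),
]


def on_call_region(now: datetime) -> str:
    """Return the region on shift at a given timezone-aware moment."""
    hour = now.astimezone(timezone.utc).hour
    for start, end, region in SHIFTS:
        if start <= hour < end:
            return region
    raise ValueError(f"no shift covers hour {hour}")


region = on_call_region(datetime(2024, 1, 1, 3, 0, tzinfo=timezone.utc))
```

Because the shifts cover all 24 hours without gaps, every alert has exactly one owning region at any moment.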

This brings us to the topic of incidents and how we handle these at RecordPoint.

Whenever an incident is raised, whether by our monitoring systems or through other channels, an on-call SRE is notified, takes ownership immediately, and begins to analyze and remediate the issue. Once the incident has been resolved, we draft a post-incident report, which is reviewed during a post-incident review meeting.

Our post-incident review process encourages organizational learning and continuous improvement through a blame-free environment. Members of the SRE team, as well as relevant stakeholders from our product, engineering, and customer success teams, come together to review the incident and discuss ways to mitigate the issue or prevent it from recurring. In a post-incident review we focus on:

  • Detailed timeline of events, including remediation steps, during the incident
  • Analysis of the root cause that triggered the incident
  • Resolution of incident
  • Impact of the incident
  • What went well
  • What didn’t go well
  • Action items such as bug fixes, process improvements or system enhancements
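
The review agenda above maps naturally onto a structured report. The following Python sketch is one possible shape for such a record. The class and field names are hypothetical and simply mirror the bullet points. It is not RecordPoint's actual report template.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class TimelineEntry:
    at: datetime
    description: str


@dataclass
class PostIncidentReport:
    title: str
    timeline: list[TimelineEntry] = field(default_factory=list)
    root_cause: str = ""
    resolution: str = ""
    impact: str = ""
    went_well: list[str] = field(default_factory=list)
    went_poorly: list[str] = field(default_factory=list)
    action_items: list[str] = field(default_factory=list)

    def missing_sections(self) -> list[str]:
        """List the sections still empty, in agenda order."""
        sections = [
            "timeline", "root_cause", "resolution", "impact",
            "went_well", "went_poorly", "action_items",
        ]
        return [name for name in sections if not getattr(self, name)]


draft = PostIncidentReport(title="Example incident", root_cause="Example cause")
```

A check like missing_sections() makes it easy to see which parts of a draft still need attention before the review meeting.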

Again, the emphasis throughout this process is continuous learning so we can improve how we operate, monitor and manage the Records365 service.

Making Changes to Records365

Any proposed change to the Records365 production infrastructure must be approved by the internal RecordPoint change control board, which meets weekly to review and approve changes.

The board has representatives from each relevant internal department to ensure that changes are reviewed thoroughly from an engineering, product management, support, and operations perspective. Changes require proof of thorough testing in our pre-production environments before they are approved for production rollout.
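
The approval criteria described above can be thought of as a simple gate. The sketch below assumes, purely for illustration, that a change needs a sign-off from each of the four perspectives plus proof of pre-production testing. The function name and the exact rule are hypothetical and do not describe RecordPoint's real workflow.

```python
# The four review perspectives named in the text.
REQUIRED_PERSPECTIVES = ("engineering", "product management", "support", "operations")


def approval_blockers(approvals: set[str], tested_in_preprod: bool) -> list[str]:
    """Return the reasons a change is not yet ready for production rollout."""
    blockers = []
    if not tested_in_preprod:
        blockers.append("no proof of pre-production testing")
    for perspective in REQUIRED_PERSPECTIVES:
        if perspective not in approvals:
            blockers.append(f"missing review: {perspective}")
    return blockers


blockers = approval_blockers({"engineering", "operations"}, tested_in_preprod=True)
```

An empty list means the change has cleared every check in this simplified model.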

As an example, our upgrades to the Records365 service are tested multiple times in multiple geographies before we roll out the upgrade in production. Other engineering artifacts that are reviewed as part of the change control process are automated test results from our unit and integration test suites.

Once a change is approved for production rollout, a deployment schedule is put in place that minimizes its impact. For example, SRE team members in one geography may perform maintenance tasks during the nighttime hours of another geography. The change control board is notified when a change starts and again when it completes, along with the result of the change.

This type of communication keeps all internal stakeholders up to date on the status of recently approved changes that are being implemented in production.

At the end of the day, everything we have just talked about is to ensure that our Records365 customers get the best possible experience when using our cloud service. We are concerned not just with delivering a world-class cloud-based records management experience, but also with how available, secure, and confidential the service and its associated data are.

And finally, we will soon be announcing our SOC 2 Type 2 attestation. This demonstrates our commitment to maintaining the strongest possible security posture and supports the trust our customers have placed in our service and organization.
