Phases to Implement Operational Suppression

By: RedLegg Blog

10/24/18 06:57 PM

Most of the alerts we receive from a customer SIEM fall into two groups: Security Alarms and Operational Alarms. Historically, we have processed both types using the same workflow and simply applied a separate SLA to each. As we roll out our new case management system and introduce new staffing models to scale our services more efficiently, we are modifying how we handle operational alarms. The service goal is to split operational alarms into in-scope alarms, processed by RedLegg, and out-of-scope alarms, processed by the customer. This setup ensures that operational alarms are investigated by the most appropriate team and allows security alarms to be addressed by a team solely focused on handling security incidents.

To make this setup work for everyone, we will modify settings on operational alarms, route the alarms appropriately, and focus our attention on operational alarms that fall within the scope of our services. We will accomplish this in five phases, implementing a new tracking/case management system.


Phase 1 – Operational Alarm Suppression

Operational alarms, like security alarms, are configured to automatically send alerts to RedLegg at regular intervals. However, not all operational alarms require notification at the same frequency. We have identified our most frequent operational alarms and will modify their suppression settings to match the urgency of the issue, with suppression periods ranging from 30 minutes to 24 hours. These changes greatly reduce the number of duplicate alarms we receive and allow us to better investigate operational issues and work with clients on fixing them.
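As an illustration of the idea, here is a minimal Python sketch of per-alarm suppression. It is not the SIEM's actual configuration: the alarm names, the specific windows assigned to them, and the `Suppressor` class are all hypothetical, and only the 30-minute to 24-hour range comes from the description above.

```python
from datetime import datetime, timedelta

# Hypothetical alarm names and windows. Only the 30-minute to 24-hour
# range is taken from the post; the individual values are assumptions.
SUPPRESSION_WINDOWS = {
    "collector_heartbeat_missed": timedelta(minutes=30),
    "log_source_silent": timedelta(hours=4),
    "disk_usage_warning": timedelta(hours=24),
}
DEFAULT_WINDOW = timedelta(minutes=30)


class Suppressor:
    """Drops repeat notifications for the same alarm and source within a window."""

    def __init__(self, windows, default=DEFAULT_WINDOW):
        self.windows = windows
        self.default = default
        self._last_sent = {}  # (alarm_name, source) -> time of last notification

    def should_notify(self, alarm_name, source, now):
        key = (alarm_name, source)
        window = self.windows.get(alarm_name, self.default)
        last = self._last_sent.get(key)
        if last is not None and now - last < window:
            return False  # still inside the suppression window: duplicate
        self._last_sent[key] = now
        return True


suppressor = Suppressor(SUPPRESSION_WINDOWS)
start = datetime(2018, 10, 24, 12, 0)
first = suppressor.should_notify("disk_usage_warning", "host-a", start)
repeat = suppressor.should_notify("disk_usage_warning", "host-a", start + timedelta(hours=1))
```

In this sketch `first` is `True` and `repeat` is `False`: the second firing arrives one hour into a 24-hour window and is treated as a duplicate. A less urgent alarm gets a longer window, so it produces fewer repeat notifications.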

[Figure: Phase 1 – Operational Alarm Suppression]

Phase 2 – Workflow Split

In order to split the workflow and ensure that the correct teams are handling the correct alarm types, RedLegg has created separate users with notification aliases on the client’s SIEM. Security alerts will be sent directly to our Watchtower security analysis platform, while operational alerts will continue to be sent directly to our ticketing system for processing. This housekeeping task simply routes the alerts properly. Security tickets will continue to be sent to our ticketing system as a safeguard for false negatives until all phases are in place. The dotted line in the diagram below represents this link: security tickets will be filtered out of active alert queues in our ticketing system.
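The routing described above can be sketched in a few lines of Python. This is an illustrative model only, not how the SIEM or Watchtower is actually configured: the `Delivery` type, the `route` function, and the destination labels are hypothetical names introduced for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Delivery:
    destination: str    # hypothetical labels: "watchtower" or "ticketing"
    active_queue: bool  # False means the copy is filtered out of active alert queues


def route(alarm_type, safeguard_enabled=True):
    """Return where an alarm is delivered under the split workflow."""
    if alarm_type == "security":
        deliveries = [Delivery("watchtower", True)]
        if safeguard_enabled:
            # Duplicate copy kept as a safeguard against false negatives,
            # but hidden from the active queues in the ticketing system.
            deliveries.append(Delivery("ticketing", False))
        return deliveries
    if alarm_type == "operational":
        return [Delivery("ticketing", True)]
    raise ValueError(f"unknown alarm type: {alarm_type!r}")
```

Calling `route("security", safeguard_enabled=False)` models the end state reached in Phase 5, when the duplicate security notifications are removed from the ticketing system.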

[Figure: Phase 2 – Workflow Split]

Phase 3 – Scope Identification

While our number one goal is to optimize your security posture, we also provide support when it comes to operational alarms. The amount of support we are able to provide is limited by the scope and accessibility of the resources; operational alarms from systems within the scope of our contract are our primary focus. Alarms for systems we cannot access contribute to operational noise that may slow us down when investigating legitimate issues. For this reason, Phase 3 seeks to limit RedLegg’s exposure to operational alarms from systems that are out of scope. Until now, RedLegg has received alarms for out-of-scope devices, confirmed that they are out of scope, and then forwarded them to the client for internal investigation. Beginning with Phase 3, alarms from out-of-scope systems will be sent directly to the client to handle. This configuration will improve our focus on devices in scope, reduce the amount of alert routing, and get out-of-scope alerts to the right people much faster.
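A minimal sketch of the scope check, assuming scope is tracked as a list of host names, might look like the following. The host names, the `IN_SCOPE_HOSTS` set, and the `operational_recipient` function are hypothetical; how scope is actually recorded depends on the contract and the SIEM.

```python
# Hypothetical inventory of systems covered by the service contract.
IN_SCOPE_HOSTS = frozenset({"siem-collector-01", "siem-collector-02"})


def operational_recipient(host, in_scope=IN_SCOPE_HOSTS):
    """Decide who receives an operational alarm for the given host."""
    return "redlegg" if host.lower() in in_scope else "client"


def partition_alarms(hosts, in_scope=IN_SCOPE_HOSTS):
    """Group alarm source hosts by the team that will handle them."""
    grouped = {"redlegg": [], "client": []}
    for host in hosts:
        grouped[operational_recipient(host, in_scope)].append(host)
    return grouped
```

The point of the sketch is that the decision happens before delivery: an out-of-scope alarm never enters the RedLegg queue, so nobody has to confirm its scope and forward it by hand.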

[Figure: Phase 3 – Scope Identification]

Phase 4 – Component Error Segmentation

Phase 4 includes housekeeping to improve alarm efficiency (tuning). Alarms for Component Successive/Excessive Errors/Warnings can be ambiguous because a single alarm covers several distinct conditions. Breaking the currently configured Component Excessive/Successive alerts into individual alarms that more closely match their intended purpose allows us to tune the suppression on each one separately. Better tuning translates into more efficient ticket handling and more useful escalations. This phase will also split agent-based alarms according to whether or not they originate from a collector, as collectors have a higher criticality.
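To make the segmentation concrete, the sketch below enumerates the individual alarms that one combined Component Successive/Excessive Errors/Warnings alarm could be split into, and separates agent alarms by origin. The alarm names and the "high"/"normal" criticality labels are hypothetical; the post states only that collectors have a higher criticality.

```python
from itertools import product

PATTERNS = ("excessive", "successive")
LEVELS = ("error", "warning")


def component_alarm_name(pattern, level):
    """Name of the individual alarm for one pattern/level combination."""
    if pattern not in PATTERNS or level not in LEVELS:
        raise ValueError(f"unsupported combination: {pattern!r}, {level!r}")
    return f"component_{pattern}_{level}s"  # hypothetical naming scheme


def agent_alarm(is_collector):
    """Split agent-based alarms by origin; the labels are assumptions."""
    if is_collector:
        return {"name": "agent_alarm_collector", "criticality": "high"}
    return {"name": "agent_alarm_non_collector", "criticality": "normal"}


# One combined alarm becomes four, each of which can be tuned on its own.
ALL_COMPONENT_ALARMS = [
    component_alarm_name(pattern, level)
    for pattern, level in product(PATTERNS, LEVELS)
]
```

Each of the four resulting alarms can then carry its own suppression period instead of sharing one setting that fits none of them well.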

Phase 5 – Remove Duplicate Security Alias

Once all of the new tuning in the previous phases is complete and the Watchtower platform is fully operational, we will remove the duplicate notifications in the ticketing system and the extraneous security filter. At this point, operational alarms will be fully tuned, and security alarms will be fully routed to Watchtower.

[Figure: Phase 5 – Remove Duplicate Security Alias]