Six steps for building a robust incident response function

Introduction: This is the decade of incident response

Organizations globally realize that working only to prevent and detect cyberattacks will not protect them against cyber security threats. That is why IBM Resilient® was developed: to arm security teams with a platform for managing, coordinating, and streamlining incident response (IR) processes.

IBM Security has had the privilege of working with organizations of all sizes and across all industries as they implement Resilient solutions to develop more sophisticated and robust incident response functions. These organizations build IR processes that are consistent, repeatable, and measurable, rather than ad hoc. They make communication, coordination, and collaboration an organization-wide priority. They leverage technology that empowers the response team to do their job faster and more accurately.

But there are challenges to building and managing a more robust IR program. Three challenges in particular stand out:

  1. The volume of cyber security incidents is increasing. Forty-two percent of cyber security professionals say their organization ignores a significant number of security alerts because they can’t keep up with the volume, according to Enterprise Strategy Group [1].
  2. The cyber security skills gap persists. Even organizations that have the right tools and technology struggle to find enough skilled people to manage the deluge of incidents.
  3. Organizations are too complex and underprepared for effective response. Insufficient planning and preparedness and the complexity of IT and business processes are the top barriers to responding to cyberattacks [3].

To solve these challenges, many IBM Resilient customers are striving to align their people, process, and technology so that IR analysts understand who is responsible for which tasks, when tasks need to be done, and how to do them. This emerging concept is known as incident response orchestration.

Incident response orchestration empowers security analysts by putting IR processes and tools right at their fingertips. They can access important incident information in an instant, make accurate decisions, and take decisive action. It leverages automation to increase the productivity of security analysts and technologies—alleviating the skills gap and the volume of alerts.

But IR orchestration is a process, not a product. It requires strong foundational blocks: trained people, proven processes, and integrated technologies. Orchestration is built on these core elements, and the effectiveness of an organization’s orchestration efforts depends entirely on the quality of these fundamental pieces.

Mapping your IR maturity

Over the years, IBM Resilient customers have increased their IR sophistication at various levels across a spectrum of maturity. Maturity levels are often necessitated by industry, available resources, or experience, but most IBM Resilient customers continually look to evolve their IR function into a more advanced phase.

This model maps the journey from an ad hoc and insufficient incident response function to one that is fully coordinated, integrated, and primed for continuous improvement and optimization.

The road to orchestrated incident response starts with developing people, process, and technology. That is the purpose of this guide: to show you the key steps in building a robust IR function.

Step 1: Understand threats, both external and internal

Every organization faces a unique threat landscape, and the first step in building out your incident response function is to develop a detailed understanding of this landscape.

Part of your threat landscape is the nature of the cyberattacks your organization will contend with. That may include specific threats that your organization has addressed in the past (for example, malware infections or phishing attacks), as well as threats that are known to affect your industry broadly (such as ransomware attacks on healthcare organizations, or DDoS attacks on internet infrastructure companies).

Additionally, a robust threat model should consider all possible actors and incidents. For example, a recent survey of a dozen healthcare organizations found that many struggle with an “inadequate threat model” and focus “almost exclusively on the protection of patient health records” [4]. The survey found that rather than developing a holistic view of their IT environment and possible threats, staff at healthcare organizations rarely venture beyond the narrow focus of regulations like the US HIPAA law. More serious threats that didn’t directly affect patient health information, such as ransomware that targets healthcare devices, lurked in organizational blind spots.

The spectrum of possible cyber incidents your organization may face is broad, and each will warrant its own IR process. To get started, you might ask questions such as:

  • What kinds of attacks or adverse incidents has our organization experienced in the past?
  • Have we sustained a malware infection in the recent past? If so, what kind of malware (botnet, theft of data, ransom)? When and for how long did the incident last and how was it resolved?
  • Have our employees been the victims of targeted phishing email scams designed to steal employee credentials? If so, which employees?
  • Has our organization been the subject of criticism in popular online forums or by hacktivist groups or other online personalities?
  • Has our organization been specifically targeted by a denial-of-service attack or other form of intentional online disruption?

In attempting to understand the threats facing your organization, consider what types of attacks your competitors, business partners, and peer companies have encountered. Have you seen similar attacks?

Preparing for privacy breaches

While cyberattacks themselves can be enormously damaging, the potential for regulatory fines can be equally if not more damaging to an organization. It’s essential for security teams to assess which regulations will apply to them in the event of a breach (based on their industry and the data they hold that may be targeted) and how they can best prepare to ensure compliance. Questions to ask include:

  • What are your privacy obligations—including industry regulations, state/federal data breach laws, and contractual agreements?
  • When do you need to provide notification of privacy breaches (factors often include breach size and whether the data was encrypted—but vary across geographies and industries)?
  • Who needs to be notified, and how (customers, attorney general’s office, others)?
  • What is the time limit for notification?

Privacy obligations are already a major concern for security and privacy professionals, and that concern is likely to increase with the EU’s incoming General Data Protection Regulation, or GDPR, which goes into effect in May 2018.

The GDPR is a globally focused privacy law that introduces sweeping changes. It applies to any organization globally that does business with EU citizens or organizations, includes a 72-hour window for data breach notification (which is much tighter than most current laws in the US), and can impose potentially enormous fines for non-compliance (up to 20 million euros or four percent of an organization’s annual global revenue, whichever is higher). Organizations should set roles, responsibilities, and processes for complying with GDPR now.
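To make the 72-hour window concrete, here is a minimal sketch of how a team might track the notification deadline once a breach is confirmed. The function names and the example timestamp are hypothetical; only the 72-hour figure comes from the text above, and your counsel should confirm when the clock actually starts for your organization.

```python
from datetime import datetime, timedelta, timezone

# 72-hour notification window, as cited in the text above.
NOTIFICATION_WINDOW = timedelta(hours=72)


def notification_deadline(became_aware_at: datetime) -> datetime:
    """Latest time by which the breach notification should be sent."""
    return became_aware_at + NOTIFICATION_WINDOW


def hours_remaining(became_aware_at: datetime, now: datetime) -> float:
    """Hours left before the deadline; negative once it has passed."""
    remaining = notification_deadline(became_aware_at) - now
    return remaining.total_seconds() / 3600


# Hypothetical example: the team confirms a breach at 09:30 UTC.
aware = datetime(2018, 5, 28, 9, 30, tzinfo=timezone.utc)
deadline = notification_deadline(aware)
one_day_later = aware + timedelta(hours=24)
left = hours_remaining(aware, one_day_later)
```

Using timezone-aware timestamps avoids ambiguity when response teams in different regions work the same incident.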

Assessing your organization

Additionally, your threat landscape is not just the external factors and risks that may impact you, but also your internal challenges and shortcomings. As described earlier, the cybersecurity skills gap looks to be a challenge that our industry will need to manage for the foreseeable future—and organizations should assess how it impacts them today and work to manage it.

To identify your internal skills gap, evaluate the current skills you have versus the skills you’ll need to effectively combat and manage the external threats you face. Performance metrics such as time-to-completion on individual tasks and workload balance are good indicators of the skills you have today and where the gaps are. And by using tabletop exercises and analysis, you can further validate your assessment and find additional gaps you may have overlooked.

Finally, your threat landscape—the attacks you face, the regulations you’re beholden to, and your organizational skills shortage—is a continually evolving assessment. As the cybercrime market, privacy regulations, and other industry trends shift, your landscape will too. Be sure to set regular intervals to review and update your threat landscape accordingly.

Case Study: Top 10 European Bank

One IBM Resilient customer faced a unique challenge: they had three security teams around the world who managed incidents with their own specific processes. This led to valuable threat information becoming siloed, a lack of central management and oversight, and no dependable way to test and improve IR processes.

The organization’s security leadership knew it had to standardize IR plans across the organization and enable centralized incident management and oversight.

The plan: the security leadership team brought the groups together to collectively develop combined, standardized response plans for specific incident types—incorporating the most effective and proven processes from across the three groups. Additionally, the organization implemented a single incident response platform (IRP) for the three groups to:

  • Centrally manage incidents across the organization
  • Enable better context gathering and collaboration
  • Provide better visibility to management
  • Create a feedback loop that ensures new IR plans, tests, and improvements are shared across the organization

With this new strategy, the organization’s security teams can continually gain value from the organization’s experience and intelligence collectively.

Step 2: Build a standardized, documented, and repeatable incident response plan

Surveys indicate that insufficient planning and preparedness is still the single biggest barrier to cyber resilience today. It is, perhaps, not surprising then that most organizations don’t have a proper incident response plan in place. According to the 2016 Cyber Resilient Organization study from the Ponemon Institute, only 25 percent of organizations have a cyber security incident response plan (CSIRP) in place and applied consistently across the organization. The remaining 75 percent either don’t have a plan at all, follow informal, ad hoc processes, or don’t have their plan applied across the organization.

As a result, many IR functions are slow, inefficient, and ineffective, which increases the likelihood of a costly, damaging cyberattack, increases employee dissatisfaction and burnout, and puts security leadership’s jobs at risk. However, having a standardized, documented, and repeatable IR plan addresses these risks and ensures your team knows exactly what to do, and when and how to do it. It also provides a platform for continual improvement, enabling your organization to stay ahead of ever-evolving cyber threats.

The challenge: creating a proper IR plan is time-consuming and requires a dedicated, organization-wide effort. To that end, security leadership needs to work to make incident planning a priority. An incident response planning workshop can ensure that all your team’s stakeholders come together to develop consistent, documented, and standardized response plans.

Your team should engage with executives and even the board of directors to ensure they understand the risks and let other relevant leaders know that they’ll be expected to contribute. This includes marketing, HR, legal, IT, and other business units.

During the workshop, your teams (with security leadership’s guidance) can come together to walk through specific incident scenarios and:

  • Map out specific steps that need to be taken to resolve an incident throughout its lifecycle.
  • Determine roles and responsibilities
  • Identify the key technologies and channels of communications to be leveraged during a response
  • Build processes around permissions and escalations

Resources like NIST, SANS, and CERT can provide great frameworks for these conversations and plans, but ultimately your IR plans will need to be specific to your organization. Therefore, it’s important to involve all contributors across the organization. You will need to tap the know-how and experience of your existing IT and security teams, key stakeholders within your organization, executives, and legal and compliance officers. External third parties like business partners and suppliers can also be part of the conversation.

By the end of these exercises and conversations, your team should have well-thought-out, repeatable, and documented plans that can be centralized, followed by anyone on your team, and continually improved upon over time.
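One way to keep such a plan documented and repeatable is to capture it as structured data rather than free text. The sketch below is illustrative only: the class names, fields, roles, and the phishing steps are all hypothetical and are not taken from any particular framework or product.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class PlanStep:
    phase: str                           # lifecycle phase, e.g. "contain"
    action: str                          # what needs to be done
    owner_role: str                      # who is responsible
    approver_role: Optional[str] = None  # who must sign off, if anyone


@dataclass
class ResponsePlan:
    incident_type: str
    comms_channel: str                   # agreed channel for this response
    steps: List[PlanStep] = field(default_factory=list)

    def tasks_for(self, role: str) -> List[str]:
        """Actions a given role is responsible for, in plan order."""
        return [s.action for s in self.steps if s.owner_role == role]

    def steps_needing_approval(self) -> List[PlanStep]:
        """Steps that require an escalation or permission before running."""
        return [s for s in self.steps if s.approver_role is not None]


# Hypothetical plan drafted during a planning workshop.
phishing_plan = ResponsePlan(
    incident_type="phishing",
    comms_channel="ir-bridge",
    steps=[
        PlanStep("identify", "Confirm the reported email is malicious", "SOC analyst"),
        PlanStep("contain", "Reset credentials for affected users", "IT help desk",
                 approver_role="IR lead"),
        PlanStep("recover", "Notify affected employees", "HR"),
        PlanStep("review", "Record lessons learned", "SOC analyst"),
    ],
)
```

A structure like this makes it easy to answer "who does what" for any role and to spot steps that have no owner or approver before an incident occurs.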

Case Study: Fortune 100 Technology Company

One IBM Resilient customer had made major technological investments in their SOC, and needed to ensure their people and processes were equally developed. Their plan: use simulations to test processes and develop SLAs and executive reporting.

This customer established regular, quarterly simulations that focused specifically on complex and unlikely events, ensuring they wouldn’t be caught off guard by the most severe threats. To gain organizational support, the security leadership developed incident response SLAs. These metrics were grouped by incident type and severity, and provided a standard for the incident response team to strive for. Additionally, the SLAs enabled the CISO to demonstrate performance to the board and set budget accordingly. Today, this customer continues to experience hundreds of incidents daily, but their well-trained team can manage and resolve them in a streamlined, effective manner.

Step 3: Proactively test and improve IR processes

Cyber adversaries are continually striving to gain new advantages. Cyber security teams need to make staying ahead a priority.

One of the most effective ways to keep IR capabilities driving forward is running simulations—and doing them in a dedicated, results-driven manner.

IR simulations provide a useful method for overcoming the “insufficient planning and preparation” barrier. Simulations ensure that your entire IR function (people, processes, and technology) is primed and ready for real-world incidents, while also uncovering opportunities for future improvements.

The key for security leaders is to make their simulations effective. There are specific steps your team can take to make improvements and make them stick.

To start, security leaders should plan upfront to make the simulation meaningful. Do you want to practice a commonly seen incident, or prepare for something unexpected? Both types are valid to explore.

Security leaders should also build specific, thoughtful simulations that include important details your analysts will need to search for. In other words, make your team think critically about the simulation and ensure it’s more than just a check-the-box exercise.

Additionally, make your simulations measurable. Set goals and track key metrics such as time-to-completion and level of completeness. And replay simulations to measure improvements (or regressions).
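As a rough illustration of measuring a replayed simulation, the sketch below compares two runs on the two metrics named above. The class, field names, and sample numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SimulationRun:
    label: str
    minutes_to_complete: float
    tasks_completed: int
    tasks_total: int

    @property
    def completeness(self) -> float:
        """Fraction of planned tasks the team actually finished."""
        return self.tasks_completed / self.tasks_total


def compare(baseline: SimulationRun, replay: SimulationRun) -> dict:
    """Positive values mean the replay improved on the baseline."""
    return {
        "minutes_saved": baseline.minutes_to_complete - replay.minutes_to_complete,
        "completeness_change": replay.completeness - baseline.completeness,
    }


# Hypothetical results from the same scenario run in two quarters.
first_run = SimulationRun("Q1 ransomware drill", 240, 15, 20)
replay_run = SimulationRun("Q2 ransomware drill", 180, 18, 20)
delta = compare(first_run, replay_run)
```

A negative value in either field would flag a regression worth raising in the post-mortem.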

Finally, make IR simulations an organization-wide event. Include participants from HR, legal, marketing, and other groups to ensure they will be ready to play their parts when a real incident hits. Similarly, share the results of your post-mortem analysis across the organization. This will help keep your team honest and educate leadership on where and what resources are needed.

Step 4: Leverage threat intelligence

Cyber criminals are working together—collaborating and sharing information across the dark web. Security professionals should be working together, too.

As part of the 2016 Cyber Resilient Organization study, the Ponemon Institute compared high-performing respondents (those whose cyber resilience had increased in the last year) to average organizations to identify key differences. One of the many findings: high-performing organizations are more likely to participate in a threat-sharing program (70 percent versus only 53 percent of average organizations).

The threat intelligence (TI) industry has seen increasing buzz in recent years, and for good reason: security teams are seeking better insight and awareness into the activity in their environments.

Leveraging threat intelligence is a big part of becoming more aware. But there are challenges to implementing it. Security teams often need to navigate countless feeds of varying quality, as well as manage the signal-to-noise problem.

Fortunately, many IBM Resilient customers have years of experience implementing and experimenting with a variety of threat intelligence feeds. Based on their combined experiences, here are three key ways to effectively leverage TI for better incident response:

  • Anchor threat intelligence in incident response plans. One IBM Resilient customer, a major media network, found their analysts spent far too much time investigating threat intelligence data. They were chasing issues that didn’t apply to them, which drained resources and severely limited their effectiveness. To fix this, the team grounded threat intelligence data in their existing incident response processes. Analysts escalate indicators of compromise (IoCs) into incidents, and they can access vital information about potential threats when needed, using the available intel when it is relevant to the circumstances they face. This led to huge improvements in time management and team effectiveness.
  • Use integrations and correlation to make threat intelligence actionable. By integrating threat intelligence with other data sources like SIEMs and EDR tools, analysts can gain fuller incident context, and the information becomes more actionable. They can refine and target the scope of the data by considering the context, severity, and patterns. This helps analysts better understand what they’re contending with and what would be best to do about it.
  • Track and measure the usefulness of your sources. There are plenty of intel feeds, and none are one-size-fits-all. Examples include open source feeds, closed communities, and commercial sources, and then there are the threat intelligence platforms. Record how often individual feeds provide information, as well as the quality and criticality of that information. You’ll soon discover whether certain feeds are redundant or need to be adjusted in any way.
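Tracking feed usefulness can start very simply. The following sketch assumes analysts mark each indicator a feed delivers as useful or not, and optionally as critical; the class name, method names, and feed names are hypothetical.

```python
from collections import defaultdict


class FeedScorecard:
    """Tallies how often each feed delivers useful or critical intel."""

    def __init__(self):
        self._counts = defaultdict(
            lambda: {"indicators": 0, "useful": 0, "critical": 0}
        )

    def record(self, feed: str, useful: bool, critical: bool = False) -> None:
        """Log one indicator from a feed along with the analyst's verdict."""
        row = self._counts[feed]
        row["indicators"] += 1
        row["useful"] += int(useful)
        row["critical"] += int(critical)

    def useful_rate(self, feed: str) -> float:
        """Share of a feed's indicators that analysts judged useful."""
        row = self._counts[feed]
        return row["useful"] / row["indicators"] if row["indicators"] else 0.0

    def ranked(self) -> list:
        """Feeds ordered from most to least useful."""
        return sorted(self._counts, key=self.useful_rate, reverse=True)


# Hypothetical verdicts logged over a review period.
card = FeedScorecard()
card.record("open-source-feed", useful=False)
card.record("open-source-feed", useful=True)
card.record("commercial-feed", useful=True, critical=True)
card.record("commercial-feed", useful=True)
ranking = card.ranked()
```

Feeds that stay at the bottom of the ranking over several review periods are candidates for tuning or retirement.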

As we’ll explore further in upcoming sections, incident response platforms (IRPs) can automate much of the manual portions of cyber incident investigation and response. Among other improvements, IRPs use data analysis and specialized logic in an approach called artifact visualization. This allows you to see how seemingly disparate incidents might be related by noting the commonalities between them—such as IT assets involved, malicious software used, malicious infrastructure communicated with, and so on.
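The core idea behind relating incidents through shared artifacts can be sketched in a few lines. This is not how any specific IRP implements it; the function name, incident IDs, and artifact strings are hypothetical, and the addresses come from reserved documentation ranges.

```python
from itertools import combinations


def related_incidents(incidents: dict) -> dict:
    """Map each pair of incident IDs to the artifacts they have in common.

    `incidents` maps an incident ID to a set of artifact strings.
    Pairs with no shared artifacts are omitted.
    """
    links = {}
    for first, second in combinations(sorted(incidents), 2):
        shared = incidents[first] & incidents[second]
        if shared:
            links[(first, second)] = shared
    return links


# Hypothetical incidents and their artifacts.
incidents = {
    "INC-101": {"ip:203.0.113.7", "host:laptop-12"},
    "INC-102": {"ip:203.0.113.7", "hash:d41d8cd9"},
    "INC-103": {"host:server-03"},
}
links = related_incidents(incidents)
```

Here two seemingly separate incidents turn out to share a malicious address, which is the kind of commonality an analyst would want surfaced automatically.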

Organizations that can identify incidents and grasp the disparate artifacts that make up the story of a breach will drive down response times from days or weeks to hours. This also helps to implement practical controls in areas like user access, data security, and communications that will prevent future incidents from occurring.

Step 5: Streamline incident investigation and response

As noted in the Verizon Data Breach Investigation Report, fewer than a quarter of all incidents Verizon reviewed were detected in “days or less,” while the majority took days, weeks, or months to detect [5]. With cyber incidents lasting undetected for weeks or months, malicious actors have the opportunity to establish a beachhead on compromised networks that can be difficult to remove.

One reason is that most organizations rely on ad hoc processes for investigating even straightforward cyber incidents like phishing attacks on employees. And because of the skills gap, organizations that have the right tools and technology may struggle to find enough resources to efficiently manage the deluge of incidents.

As organizations add integrated data and threat intelligence sources to their IR processes, the opportunities to orchestrate responses in a sophisticated way grow, starting with the automation of low-level tasks.

Automation is a useful method of streamlining menial, repetitive tasks and making your team faster and smarter. When used in a broader incident response orchestration strategy (learn more about orchestration in the next section), automation can empower your team to be strategic decision makers.

In the case of an outbreak of malware, for example, a suspicious sample detected on one endpoint can be automatically grabbed and fed to an endpoint agent or next-generation threat detection platform to observe and classify. Based on the outcome of that analysis, further automated and manual processes can be queued up: identifying other infected hosts on the network and requesting permission to quarantine them, identifying a vulnerability associated with that malware infection and scheduling emergency patches to vulnerable systems, or firing off requisite notifications to internal staff or external monitors, for example. And, at each stage, requests, responses, and actions can be documented for future reference.

To begin with automation, pinpoint the right processes to streamline. These are often time-consuming, menial, and inefficient tasks that take up inordinate amounts of analysts’ time, and can be safely and reliably automated. Security leaders should also analyze the risk and complexities of automating a process versus the potential efficiencies gained.

To ensure safe and reliable automation, test the processes’ fidelity. Script manual actions that keep human decision-making and approval involved. Once your team is comfortable that the process is right and the technology works properly, you can decide to fully automate.
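A minimal sketch of scripted actions that keep a human approval step in the loop might look like the following. Everything here is hypothetical: the step names, the `approve` callback, and the host names. In practice the callback would prompt an analyst rather than apply a fixed rule.

```python
from typing import Callable, List, Tuple

# Each step: (name, action to run, whether human approval is required).
Step = Tuple[str, Callable[[], None], bool]


def run_playbook(steps: List[Step], approve: Callable[[str], bool]) -> List[str]:
    """Run steps in order, pausing for approval where required.

    Returns a log of what was done or skipped, for future reference.
    """
    log = []
    for name, action, needs_approval in steps:
        if needs_approval and not approve(name):
            log.append(f"skipped: {name} (approval denied)")
            continue
        action()
        log.append(f"done: {name}")
    return log


# Hypothetical malware-outbreak steps; quarantine requires sign-off.
quarantined = []
steps = [
    ("collect sample", lambda: None, False),
    ("quarantine host-17", lambda: quarantined.append("host-17"), True),
    ("quarantine host-22", lambda: quarantined.append("host-22"), True),
]
# Stand-in for an analyst who approves only the first quarantine request.
log = run_playbook(steps, approve=lambda name: name.endswith("host-17"))
```

Once the team trusts a step, flipping its approval flag to `False` is all it takes to fully automate it, which mirrors the gradual approach described above.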

However, it’s important to note that while technology-based automation can save time, it’s only as strong as your overall IR function—and is most effectively leveraged in an orchestrated incident response strategy.

Step 6: Orchestrate across people, process, and technology

The promise of incident response orchestration (making response faster and more automated) has drawn the attention and interest of many security experts across the industry. But as referenced in the last section, successful and effective orchestration and automation require a strong overall IR function. Effective orchestration depends entirely on the quality of an organization’s IR fundamentals: people, process, and technology.

The earlier sections of this guide have been created to help you ensure these fundamental building blocks are well-thought-out, strong, and primed for future improvements. To refresh, here are essential questions to ask when assessing the strength of your IR foundation:

  • People: Have you ensured your IR team is well-coordinated and well-trained? Do they have the right skills to address all aspects of an incident’s lifecycle? Do they have means for collaboration and analysis?
  • Process: Do you have well-defined, repeatable, and consistent IR plans in place? Are they easy to update and refine? Are you regularly testing and measuring them?
  • Technology: Does your technology provide valuable insight and intelligence in a directed fashion? Does it enable your team to make smart decisions and quickly act on those decisions?

By addressing these questions, you can ensure your orchestration efforts will align these building blocks with real effect. If you haven’t developed this foundation, the benefits of orchestration will be marginal.

The goal of incident response orchestration is to empower your response team by ensuring the humans in the loop know exactly what to do when a security incident strikes, and have the processes and tools they need to act quickly, effectively, and correctly.

Orchestration and automation are both growing in popularity among cyber security professionals, but orchestration is different in that it supports and optimizes the human-centric elements of cyber security, such as understanding context and making decisions, and keeps people central to security operations.

This is a critical distinction because security threats are uncertain problems. Responding to a threat is hardly ever a cut-and-dried issue. Automation is a great tool for quickly and effectively executing specific tasks—but since threats are often evolving and adversaries are changing tactics, human decision-making is needed to step in for things like escalating issues or troubleshooting.

Automation is an effective tool in the broader orchestration process, but it’s the human element that makes orchestration the game-changer that it is.

Orchestration applies differently to each specific organization. It should map to your unique threat landscape, IT and security environments, and company priorities. But for a quick example, the following is a classic use case of how we see orchestration employed in many of the organizations we work with.

In the top left of the graphic, you can see that as an incident is escalated from a SIEM alert, a record is automatically created in the organization’s incident response platform (IRP). From there, in the bottom right, the platform automatically gathers and delivers valuable incident context from the built-in threat intelligence feeds and additional sources. As a result, the security analysts already have critical information when they step in and take control. These analysts can leverage additional integrations to manually take on any further tasks deemed necessary, including gathering more information about an incident from other security tools (such as endpoint security tools or web gateways) or starting to remediate the issue by alerting the IT help desk or using the identity management system to pull users off the network.
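The flow in this use case can be sketched as two automated steps followed by a human handoff. This is a simplified illustration, not the API of any incident response platform: the record fields, function names, alert format, and intel lookup are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class IncidentRecord:
    alert_id: str
    artifacts: List[str]
    context: Dict[str, str] = field(default_factory=dict)
    pending_tasks: List[str] = field(default_factory=list)


def escalate(alert: dict) -> IncidentRecord:
    """Automated step 1: create an incident record from a SIEM alert."""
    return IncidentRecord(alert_id=alert["id"], artifacts=list(alert["artifacts"]))


def enrich(incident: IncidentRecord, intel: Dict[str, str]) -> IncidentRecord:
    """Automated step 2: attach threat intelligence context to artifacts.

    If anything matches, queue a task for the analyst, who decides
    what happens next.
    """
    for artifact in incident.artifacts:
        if artifact in intel:
            incident.context[artifact] = intel[artifact]
    if incident.context:
        incident.pending_tasks.append(
            "Analyst review: confirm scope and decide on containment"
        )
    return incident


# Hypothetical intel lookup and alert.
intel_feed = {"198.51.100.23": "known command-and-control address"}
alert = {"id": "SIEM-1042", "artifacts": ["198.51.100.23", "laptop-07"]}
incident = enrich(escalate(alert), intel_feed)
```

The automation stops at gathering context; the remediation decision stays with the analyst, which is the distinction between orchestration and pure automation drawn earlier in this section.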

There are many different ways to orchestrate IR processes, but the goal is always the same: put your analysts in the best position to respond to threats.

As incident response processes mature, organizations enter a phase of proactive response, in which information gleaned from incident response becomes strategic to an organization. With proactive response, intelligence from the IR team can be fed back into a security and IT organization — shaping technology investments and acquisitions, sharpening employee skill sets, and broadening an organization’s understanding of risk to encompass a broader ecosystem of physical security assets and providers, threat intelligence providers, regulators and government agencies, and more.

While few companies — even within the Fortune 500 — have achieved this level of maturity, we expect the strategic application of incident response to become more common as more firms migrate to mature incident response platforms in the coming years.

Conclusion: Building a resilient, response-ready organization

It is tempting to imagine that technology advancements will soon turn incident response into a push button function that can be performed by even junior employees. The truth is that IR is, and will be, complicated and multifaceted and will require the attention of intelligent security analysts.

Mature incident response combines people, processes, and technology as part of a continuum. The job of technology isn’t to replace human analysts, but to empower them to do more: delivering better intelligence about specific threats, streamlining response processes, and making sure that security analysts are ready to respond.

Additionally, a mature cyber security incident response function can beget a larger, cultural transformation within your organization: integrating your security team more closely with IT operations and management, and enlisting them in the process of responding to cyber incidents in a comprehensive way.

The Total Economic Impact Of IBM Resilient

Executive Summary

IBM provides a security incident response (IR) solution called Resilient that helps its customers address security incidents quickly in an automated and orchestrated manner. IBM commissioned Forrester Consulting to conduct a Total Economic Impact™ (TEI) study and examine the potential return on investment (ROI) enterprises may realize by deploying Resilient. The purpose of this study is to provide readers with a framework to evaluate the potential financial impact of the Resilient platform on their organizations.

To better understand the benefits, costs, and risks associated with this investment, Forrester interviewed a Resilient customer with several years of experience using the solution. Forrester found that, as an incident response platform, the solution provides significant benefits by giving security professionals automation and orchestration that shorten the time-to-contain for security incidents. Security tools and devices across the enterprise are put into play sooner with dynamic playbooks that cut the analysis and triage time required of incident responders.

Prior to using Resilient, the interviewed customer leveraged a ticketing system that provided little in the way of automation. This system yielded limited success, leaving the customer with little intelligence due to a lack of integration to the security tool stack. These limitations led to the need for a significant army of security professionals who needed to be specialized in a wide variety of security areas to be able to identify and contain threats.

Key Findings

Quantified benefits. The interviewed organization experienced the following risk-adjusted present value (PV) quantified benefits:

  • Orchestration and automation saved 25 minutes per security analyst and over an hour in total per security incident. With over 350 cybersecurity incidents per week, the interviewed organization saved nearly 22,750 security analyst hours in the first year. Accounting for the rise in cybersecurity incidents over the years and the relatively high cost of security analysts, this translated to three-year savings worth $4.5 million in labor costs. The reduction in effort by the security analysts to handle incidents resulted in increased time for them to perform advanced analysis of threats and develop new countermeasures to further improve the organization’s security posture.
  • End users benefited from quicker incident response and improved uptime. While the Resilient platform did not offer direct improvement on the detection of incidents, it did allow incident responders to contain threats much more quickly after the initial detection. On a per-incident basis, business users saved half an hour due to the reduction in time-to-contain, as they no longer needed to wait as long for security analysts to investigate and perform remediation steps. Additionally, the quicker time-to-contain led to avoided image restores and wider-scale remediation action on the endpoints. In all, the organization saved between 11,830 and 15,645 hours per year with the Resilient platform.
  • Resilient, as the incident response platform, brought visibility to the efficacy of existing security tools, enabling security professionals to realize the full potential of the organization’s library of tools. With Resilient acting as the central dashboard orchestrating the response to security incidents, security professionals were able to centrally collect data and determine points in the security architecture that were less responsive. With the insight, security professionals could identify the exact point of failure and choose to either reconfigure the tool or substitute the tool with a more effective replacement. Security tools are expensive investments, and Resilient helps professionals reaffirm that these investments are working as advertised.

Unquantified benefits. The interviewed organization experienced the following benefits, which are not quantified for this study:

  • Resilient provides instant dashboarding to help expedite the audit process and reduce scrutiny from regulatory bodies. Most enterprises are audited on the security front numerous times a year and provide management reporting on security incidents at an even higher frequency. By being able to centralize security response data in the Resilient platform, the interviewed organization can provide internal and external auditors with data that reduces security professional effort and auditor effort.
  • The organization saw continual security posture improvement from the time newly freed up for security analysts. Whereas the interviewed organization was once constantly fighting fires, it is now doing deep analysis into threats to continually improve its processes and defenses. The value of this has not been calculated, but it certainly helps the organization’s security individuals sleep better at night knowing that they are in a better posture to prevent massive fallout from situations like recent, widely publicized security breaches.

Costs. The interviewed organization experienced the following risk-adjusted costs:

  • License and support costs amounted to $3,469,440 over three years. The license costs are both user licenses for the incident responders as well as the primary software licenses for production and development environments. Standard support and service has also been accounted for in this category.
  • Software integration and process build-outs are a low but ongoing cost. This cost category includes deployment, orchestration build-outs, and integration build-outs with existing security tools. Some APIs are included, but because the interviewed organization’s security architecture was complex and its tools numerous, custom build-out of these integrations was necessary and cost $266,745 over three years.

Forrester’s interview with an existing customer and subsequent financial analysis found that the interviewed organization experienced PV benefits of $7,610,015 over three years versus PV costs of $3,736,185, adding up to a net present value (NPV) of $3,873,830 and an ROI of 104%.
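The headline figures can be checked with a few lines of arithmetic. The sketch below assumes the conventional definitions (NPV as PV benefits minus PV costs, ROI as NPV divided by PV costs), which this section does not spell out; the dictionary keys and variable names are illustrative, and the values are the risk-adjusted PV figures reported in this study.

```python
# Risk-adjusted present values reported in this study (USD, three years).
pv_benefits = {
    "analyst_automation_savings": 4_502_964,
    "end_user_productivity": 1_346_720,
    "security_asset_value_realization": 1_760_331,
}
pv_costs = {
    "license_and_support": 3_469_440,
    "integration_build_outs": 266_745,
}

total_benefits = sum(pv_benefits.values())
total_costs = sum(pv_costs.values())

# Assumed definitions: NPV = benefits - costs; ROI = NPV / costs.
npv = total_benefits - total_costs
roi_percent = npv / total_costs * 100

print(f"PV benefits: ${total_benefits:,}")
print(f"PV costs:    ${total_costs:,}")
print(f"NPV:         ${npv:,}")
print(f"ROI:         {roi_percent:.0f}%")
```

Under these definitions the component figures add up to the totals quoted above, and the ROI rounds to 104%.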

TEI Framework And Methodology

From the information provided in the interview, Forrester has constructed a Total Economic Impact™ (TEI) framework for those organizations considering implementing IBM Resilient.

The objective of the framework is to identify the cost, benefit, flexibility, and risk factors that affect the investment decision. Forrester took a multistep approach to evaluate the impact that IBM Resilient can have on an organization:

  • DUE DILIGENCE: Interviewed IBM stakeholders and Forrester analysts to gather data relative to Resilient.
  • CUSTOMER INTERVIEW: Interviewed one organization using Resilient to obtain data with respect to costs, benefits, and risks.
  • FINANCIAL MODEL FRAMEWORK: Constructed a financial model representative of the interview using the TEI methodology and risk-adjusted the financial model based on issues and concerns of the interviewed organization.
  • CASE STUDY: Employed four fundamental elements of TEI in modeling IBM Resilient’s impact: benefits, costs, flexibility, and risks. Given the increasing sophistication that enterprises have regarding ROI analyses related to IT investments, Forrester’s TEI methodology serves to provide a complete picture of the total economic impact of purchase decisions. Please see Appendix A for additional information on the TEI methodology.


Readers should be aware of the following:

This study is commissioned by IBM and delivered by Forrester Consulting. It is not meant to be used as a competitive analysis.

Forrester makes no assumptions as to the potential ROI that other organizations will receive. Forrester strongly advises that readers use their own estimates within the framework provided in the report to determine the appropriateness of an investment in IBM Resilient.

IBM reviewed and provided feedback to Forrester, but Forrester maintains editorial control over the study and its findings and does not accept changes to the study that contradict Forrester’s findings or obscure the meaning of the study.

IBM provided the customer names for the interviews but did not participate in the interviews.

The Resilient Customer Journey


Interviewed Organization

For this study, Forrester interviewed an IBM Resilient customer with multiple years of experience using the platform:

  • This is a financial services organization with a worldwide footprint.
  • It employs more than 15,000 full-time equivalents (FTEs) and has revenues in the tens of billions.
  • It has a cyber defense team of approximately 150 security professionals.
  • This is an organization that is held accountable to multiple regulatory bodies; effective security posture and processes are instrumental to meeting the standards.

Key Challenges

Coming from an existing state of using a homebrew incident response plan that incorporated the use of an IT ticketing system, the security team at the organization felt that its needs were largely unmet. There was a clear lack of visibility and integration into various security tools, providing for weak documentation and a complete absence of automation. “We had a clear desire for so much more to improve our efficiency, and when we realized that the existing solution failed at 99% of our wants, it was time to move on,” said the VP of cyber defense. Further, “The messaging from the top was that we had these solutions already — but our own analysis suggested it [the old solution] was clearly incapable of doing what we needed to be effective.”

  • There was a lack of integration with various security tools: Lacking integration with the security stack resulted in very little documentation and metrics for consumption. Further still, the effort required to triage and actually drive to the root cause of the incidents was largely manual and time-consuming. The old system served as a way to mark issues but aided very little in actually feeding information to security professionals so that they could take proper action on containment.
  • A lack of playbooks meant that every situation was assessed manually when it could have been automated: Incidents arose in a variety of forms and attack vectors. Incident responders would manually go through the analysis process, pulling information from various tools to determine the proper course of action. Put simply, the security analysts needed to work out a containment process from scratch for every incident. The result was that different analysts performed containment and remediation in different ways, compounding the inefficiencies.
  • There was a clear disconnect on automation and orchestration: Without integration, there was no automation. No single centralized point of control was dictating the hundreds of remedial actions that had previously been seen. Again, these actions took manual labor, and remediation was left to the wildly varying methods of the individual incident responders.
  • Security professionals were a scarce commodity: Being in a constant firefight mode required a large force of incident responders who were each versed in a wide variety of security elements. As the need for these professionals grew, it was more and more costly to add to this cyber defense group. Intelligent automation was a clear solution to reduce the laborious effort of analysis and containment.

Decision To Use Resilient

After an extensive request for proposal (RFP) and business case process evaluating multiple vendors, the interviewed organization chose Resilient and began deployment:

  • The organization chose Resilient because of its dynamic playbooks — the ability to follow the path of incidents and act dynamically through the stages of breach from initial identification to internal network proliferation and widespread data corruption.
  • By the end of the bake-off proof of concept (POC), the organization had built a simple integration that translated into significant automation savings for all incident responders.
  • The Resilient solution was running and integrated with many of the organization’s mission-critical security tools within two weeks.

Key Results

The interview revealed that key results from the Resilient investment include:

  • Integration with existing security tool sets allowed for a dramatic automation improvement. By integrating with existing tools, Resilient took initiative to present the relevant data on issues to security analysts and then completed the required actions through the tools once approved by analysts. In short, orchestration and automation eliminated a large portion of investigative work from the detect, analyze, contain, and eradicate workflow.
  • Like security practitioners, business end users found greater productivity. As the time-to-contain shortened from automation, business users enjoyed higher levels of uptime at their workstations, directly feeding value back to the organization in productive output. Disruptions were reduced in scale; even IT help desk effort was reduced as reimage sessions or virtual machine (VM) recomposes were minimized by fast action to resolve incidents.
  • Having a capable incident response platform was the final piece of the security puzzle to tackle increasingly complex attacks. While detection and remediation were still largely left to the existing tools in the security group, the time to take action and contain threats had dramatically improved. Being without an IR platform capable of orchestration was like having the tools but having to wait to decide when and where to use which specific tool for the task.

Financial Analysis

Orchestration And Automation Savings For Incident Response

Following the deployment of IBM Resilient, the interviewed organization realized a significant gain in automation and, in turn, a reduction in security analyst effort. Whereas the existing solution offered very limited or no data from the relevant security components, Resilient, once integrated with the security stack, was able to provide vivid detail on security incidents and carry out containment and remediation actions with minimal input from security personnel. From the interview, Forrester determined:

  • Security experts can see from a centralized command center the initial point of detection and any further exploitation caused by the incident. Using dynamic playbooks, the Resilient platform visibly displays the actions required and can execute with a single click from the incident responders.
  • In the previous state where incidents morphed and affected multiple points across the network, incident responders would rely on multiple analysts to determine and contain these threats. With Resilient, the organization can identify these threats and reduce the number of actual personnel necessary to mitigate the issues.
  • The interviewee stated that the longest part of the incident response workflow was the analysis and triage on the incidents. Resilient effectively reduced the effort involved by over 80%. Accounting for three analysts who may have been involved in these incidents, their individual effort was reduced by nearly 25 minutes for analysis, resulting in a total of 1.25 hours saved per incident.
  • At an average rate of 350 incidents occurring on a weekly basis, we estimate that 22,750 hours were saved in the initial year by the security responders and analysts.

Calculations have been adjusted for an increase in efficiency through optimization of orchestrations and an increase in incidents that will occur over the ensuing years. Forrester estimates security incidents to increase by nearly 15% at financial services organizations on a year-over-year basis.

  • Incidents will grow in frequency by 15% year over year.
  • Tuning and optimization of the orchestration through further integration with security tools will increase the time saved by security professionals by 10% year over year.
  • At a rate of $110,000 per year, accounting for benefits, security professionals earn the equivalent of $66/hour.

With the time saved, the security analysts were not necessarily let go, especially as they are highly sought after. Instead, the interviewed organization redirected the time saved through automation to deep-level analysis, such as determining the advanced behavior of malware or optimizing rule sets and orchestration so that incidents are handled even faster in the future.

While Forrester believes the value of automation and orchestration to be undeniable, readers should be aware that these benefits may be harder to realize in full if an established IR plan is already in place. Organizations that are already very mature in incident response and security posture should weigh this risk, as that maturity may diminish the value cited in this category.

To account for this risk, Forrester adjusted this benefit downward by 5%, yielding a three-year risk-adjusted total PV of $4,502,964.
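The inputs stated in this section reproduce that figure once two further assumptions are made. The sketch below assumes a 10% annual discount rate applied at year end and 52 weeks of incidents per year, neither of which is stated here; it also assumes the 15% incident growth and the 10% efficiency gain compound together from Year 2 onward. The function and parameter names are illustrative.

```python
def analyst_savings_pv(
    incidents_per_week=350,
    analysts_per_incident=3,
    minutes_saved_per_analyst=25,
    hourly_rate=66.0,
    incident_growth=0.15,
    efficiency_growth=0.10,
    risk_adjustment=0.05,
    discount_rate=0.10,  # assumption: not stated in this section
    weeks_per_year=52,   # assumption: incidents occur every week of the year
    years=3,
):
    """Return (first-year hours saved, risk-adjusted PV of labor savings)."""
    hours_year1 = (
        incidents_per_week * weeks_per_year
        * analysts_per_incident * minutes_saved_per_analyst / 60
    )
    # Assumed: incident growth and efficiency gains multiply each year.
    yearly_growth = (1 + incident_growth) * (1 + efficiency_growth)

    pv = 0.0
    for year in range(1, years + 1):
        hours = hours_year1 * yearly_growth ** (year - 1)
        benefit = hours * hourly_rate * (1 - risk_adjustment)
        pv += benefit / (1 + discount_rate) ** year  # discount at year end
    return hours_year1, pv


hours_year1, pv = analyst_savings_pv()
print(f"Year 1 analyst hours saved: {hours_year1:,.0f}")
print(f"Three-year risk-adjusted PV: ${pv:,.0f}")
```

With the defaults, the first-year hours come to 22,750 and the PV rounds to $4,502,964, which suggests these assumptions are close to the ones used in the study. Changing `discount_rate` or either growth rate shows how sensitive the result is to each input.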

End User Productivity Recapture From Improved IR Capabilities

IBM Resilient does not enable quicker detection of malicious activity — this is a function of the existing security infrastructure. Likewise, Resilient does not perform the remediation. Instead, the Resilient solution accelerates the incident response workflow once an incident has been detected, leading to a significantly reduced time to enact the remediation and containment procedures — otherwise explained as the period of time between mean-time-to-detect (MTTD) and mean-time-to-contain (MTTC).

Incident responders previously required between 20 and 30 minutes to analyze and determine a proper containment approach, which has largely been eliminated due to Resilient’s automation and orchestration to carry out containment measures. End users who operate on the enterprise network would often find that their machines were locked out upon detection, resulting in a period of downtime until the endpoint was contained and remediated. With the reduction in the period between MTTD and MTTC, the end users are able to recover this time to be spent productively.

Additionally, a number of incidents often cause deeper and collateral damage as time passes. A hastened action to respond to the incident often reduces the need for deeper-level remediation/recovery techniques, such as a complete reimage.

For the interviewed organization, Forrester found that:

  • For each of the incidents occurring across the enterprise, end users are saving a minimum of 30 minutes per incident. They are repurposing those 30 minutes into productive output.
  • The percentage of incidents that ultimately may have necessitated full restores or VM recomposes without a hastened response is 10%.
  • The average time for reimage, recompose, or full remediation is estimated at 1.5 hours — time that would have been taken away from user productivity.
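These inputs can be combined into a per-year estimate of end-user hours recovered. The sketch below carries over the 350 incidents per week and 15% annual incident growth from the analyst-savings analysis and assumes 52 weeks per year; the variable names are illustrative.

```python
incidents_year1 = 350 * 52   # assumption: 52 weeks of incidents per year
direct_hours_saved = 0.5     # minimum end-user time saved per incident
restore_share = 0.10         # incidents that would have needed a full restore
restore_hours = 1.5          # average reimage/recompose time avoided
incident_growth = 0.15       # year-over-year growth in incidents

# Expected end-user hours recovered per incident:
# the direct saving plus the avoided restores, weighted by their share.
hours_per_incident = direct_hours_saved + restore_share * restore_hours

hours_by_year = [
    incidents_year1 * (1 + incident_growth) ** year * hours_per_incident
    for year in range(3)
]

for year, hours in enumerate(hours_by_year, start=1):
    print(f"Year {year}: {hours:,.0f} end-user hours recovered")
```

The first and third years come out at roughly 11,830 and 15,645 hours, consistent with the range cited in Key Findings. Converting these hours to dollars would require an end-user hourly rate, which this section does not provide.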

The amount of productivity recaptured can vary depending on:

  • The number of applications installed on the endpoint stations.
  • The time needed to reimage/recompose.
  • The detection efficacy of security measures already in place.

To account for these risks, Forrester adjusted this benefit downward by 5%, yielding a three-year risk-adjusted total PV of $1,346,720.

Existing Security Asset Value Realization Improvement

Enterprises today are rightfully concerned about their security posture and allocate increasing amounts to security budgets — especially given the number of high-profile breaches that frequent the news. With an assortment of tools, how do organizations determine the efficacy of these individual tools following POC and deployment? Forrester’s interview with the customer organization revealed that while POCs and bake-offs can be useful for a first impression, sometimes the solutions are not quite as effective as originally expected. With Resilient as a central point of orchestration and data collection, the interviewed organization gained visibility into its collective security stack and was better able to evaluate its existing investments.

  • Upon integration with Resilient, the security team was able to collect information as to which defense mechanisms were more effective, if effective at all, on detection or containment of malicious activity.
  • The interviewed organization was able to clearly delineate whether its browser sandboxing and database access management were working, as results were all reported back to Resilient.
  • The organization estimated that over $1.5 million of its investments were not properly configured or working to the standard promised, resulting in either reconfiguration or removal of those services.
  • Detection was primarily noted in the first year of Resilient deployment with smaller incremental gains in the years after.
  • Recognition of the points of failure saved additional labor, as the security team would otherwise have let false negatives pass or exerted additional effort on false positives.

Value recovered from the points of failure was estimated at a PV of $1,760,331 over the course of three years of usage.

Unquantified Benefits

Beyond the quantified benefits represented above, the customer organization identified the dramatically improved security posture now present. Time previously spent on remediating incidents is now spent to do advanced heuristics on malware and threats — understanding the underlying nature to prevent additional outbreaks in the present and future. What is the value in that? Forrester has determined the following on breaches:

  • No organization is immune to breaches. The size of an organization cannot determine the likelihood of attack or the accompanying potential damage, nor can any particular industry preclude an organization, as motivations behind breaches have evolved. While it is impossible to say what percentage of organizations are breached, we know that it is a matter of when, rather than if. Organizations that have a solid IR plan and perform deep-level analysis on different threat vectors stand a much-improved chance of minimizing damage.
  • Breaches that are not addressed immediately carry a number of financial ramifications in the short and long run. Lost revenues, legal settlements, regulatory fines from the likes of the Federal Financial Institutions Examination Council (FFIEC) and the Payment Card Industry (PCI), and long-term brand erosion should all be considered.

While prevention and detection are always important, incident response should be equally critical in the overall security scheme of the organization.


Flexibility

The value of flexibility is clearly unique to each customer, and the measure of its value varies from organization to organization. There are multiple scenarios in which a customer might choose to implement Resilient and later realize additional uses and business opportunities, including:

  • Resilient incident response is agnostic to the tools that it integrates with and orchestrates. As newer and more capable prevention, detection, logging, and remediation tools are introduced to the security ecosystem, Resilient can continue to serve as the central orchestration mechanism for these tools and keep increasing automation for security teams.

Flexibility would also be quantified when evaluated as part of a specific project (described in more detail in Appendix A).

License And Support Costs

From the interview, Forrester has determined that the majority of costs stem from the following items:

  • The customer purchased a base software license, along with the user seat licenses for individual incident responders. The licenses purchased by the interviewed organization are perpetual.
  • Support and service were a continued cost assumed on a yearly basis following the initial year of usage.
  • Lastly, a development environment for the Resilient platform was necessary to develop integrations into the organization’s 50-plus existing security tools.

Costs in this study are represented at near list pricing, reflecting only slight discounting. Purchases of other IBM security solutions may drive the cost of the Resilient solution down beyond what is reflected here. We encourage readers to explore the options with IBM or partners.

Compiling the costs of the licenses and service and support, the interviewed organization likely assumed PV costs of $3,469,440 after three years of usage.

Initial And Ongoing Orchestration, Process, And Integration Build-Outs

The IBM Resilient platform can be deployed outright with minimal effort and comes with a number of standard dynamic playbooks. As no two organizations are the same, however, process remodeling and security tool integration need to be undertaken to fully realize the automation and orchestration capabilities of Resilient. The interviewed organization started integration and process augmentation for the mission-critical tools within its stack of more than 50 tools.

  • Initial planning and scripting of the various integrations required the efforts of five security FTEs over two weeks, committing a real total of 400 hours in this time. With this effort, the organization had integrated Resilient with its mission-critical and most commonly used tools. Automation savings almost immediately accrued, but the efforts for process engineering and tool integration didn’t stop there.
  • Over the next three years, the organization continued to integrate tools to continually improve the efficiency of incident response, cutting manual processes where it could. The effort spent by a Python developer equated to approximately half an FTE on an ongoing basis.
  • The result was a continued improvement in the organization’s ability to contain incidents in shorter periods of time.

Some organizations may lack the developer resources for advanced Python development in the security space; as such, there exists the risk that additional effort might need to be allocated to the tool integration process. Additionally, different organizations have varying complexities in their security architecture that may require additional effort. As such, Forrester has overlaid this cost category with what we identify as implementation risk.

To account for these risks, Forrester adjusted this cost upward by 10%, yielding a three-year risk-adjusted total PV of $266,745.

IBM Resilient: Overview

The following information is provided by IBM. Forrester has not validated any claims and does not endorse IBM or its offerings.

The Resilient Incident Response Platform (IRP) is the leading platform for orchestrating and automating incident response processes. With Resilient, security organizations can significantly drive down their mean time to find, respond to, and remediate incidents. It quickly and easily integrates with organizations’ existing security and IT investments, creating a single hub to drive fast and intelligent action. The platform’s advanced orchestration capabilities enable adaptive response to complex cyber threats.

The latest orchestration innovations to the Resilient IRP include:

  • Dynamic Playbooks: Provides the agility, intelligence, and sophistication needed to contend with complex attacks. Dynamic Playbooks automatically adapts to real-time incident conditions and ensures repetitive, initial triage steps are complete before an analyst even opens the incident.
  • Visual Workflows: Enables analysts to orchestrate incident response with visually built, complex workflows based on tasks and technical integrations.
  • Incident Visualization: Graphically displays the relationships between incident artifacts or indicators of compromise (IOCs) and incidents in an organization’s environment.

Appendix A: Total Economic Impact

Total Economic Impact is a methodology developed by Forrester Research that enhances a company’s technology decision-making processes and assists vendors in communicating the value proposition of their products and services to clients. The TEI methodology helps companies demonstrate, justify, and realize the tangible value of IT initiatives to both senior management and other key business stakeholders.

Total Economic Impact Approach

  • Benefits represent the value delivered to the business by the product. The TEI methodology places equal weight on the measure of benefits and the measure of costs, allowing for a full examination of the effect of the technology on the entire organization.
  • Costs consider all expenses necessary to deliver the proposed value, or benefits, of the product. The cost category within TEI captures incremental costs over the existing environment for ongoing costs associated with the solution.
  • Flexibility represents the strategic value that can be obtained for some future additional investment building on top of the initial investment already made. Having the ability to capture that benefit has a PV that can be estimated.
  • Risks measure the uncertainty of benefit and cost estimates given: 1) the likelihood that estimates will meet original projections and 2) the likelihood that estimates will be tracked over time. TEI risk factors are based on “triangular distribution.”

The initial investment column contains costs incurred at “time 0” or at the beginning of Year 1 that are not discounted. All other cash flows are discounted using the discount rate at the end of the year. PV calculations are calculated for each total cost and benefit estimate. NPV calculations in the summary tables are the sum of the initial investment and the discounted cash flows in each year. Sums and present value calculations of the Total Benefits, Total Costs, and Cash Flow tables may not exactly add up, as some rounding may occur.