CISM Domain 4: Incident Management Complete Guide

📌 Domain 4 At a Glance

Incident Management accounts for 30% of the CISM exam (approximately 45 questions), making it the second-largest domain. This domain covers establishing and managing incident response capabilities, business continuity planning, and disaster recovery to minimize business impact and ensure organizational resilience.

What is Incident Management?

Incident management is the process of detecting, analyzing, responding to, and recovering from information security incidents. It encompasses not just technical response, but also communication, coordination, legal considerations, and business continuity.

The goal is to minimize business impact by quickly containing incidents, eradicating threats, restoring normal operations, and learning from events to prevent recurrence.

Key Tasks in Domain 4

ISACA defines these critical tasks for incident management:

Establish and maintain an incident management program that aligns with business priorities
Develop and implement processes to ensure timely identification and reporting of incidents
Establish and maintain plans and procedures for incident response
Establish and maintain processes to investigate and document incidents
Establish and maintain communication plans for effective incident response
Conduct post-incident reviews to identify and implement lessons learned
Establish and maintain a business continuity plan (BCP) aligned with organizational strategy
Establish and maintain a disaster recovery plan (DRP)
Coordinate, conduct, and report results of BCP and DRP testing
Facilitate integration of information security requirements into BCP and DRP
Establish and maintain incident classification and categorization processes

Critical Concepts You Must Know

1. Incident Response Lifecycle

Incident response follows a structured process:

Phase 1: Preparation

Establish incident response team and roles
Develop response procedures and playbooks
Deploy monitoring and detection tools
Conduct training and simulations
Establish communication channels

Phase 2: Detection and Analysis

Identify potential incidents from alerts
Analyze to confirm actual incident
Determine scope and severity
Classify and prioritize incident
Assemble appropriate response team

Phase 3: Containment

Short-term containment (immediate threat mitigation)
Long-term containment (sustained control)
Preserve evidence for investigation
Prevent spread to other systems
Maintain business operations where possible

Phase 4: Eradication

Remove threat from environment
Identify and close attack vectors
Patch vulnerabilities exploited
Strengthen affected controls
Verify complete removal

Phase 5: Recovery

Restore systems to normal operation
Verify system integrity
Implement enhanced monitoring
Return to business as usual
Document recovery actions

Phase 6: Post-Incident Review

Conduct lessons learned session
Document timeline and actions taken
Identify improvement opportunities
Update procedures and controls
Communicate findings to stakeholders
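
Because exam questions test sequence, it can help to see the six phases as an ordered type. The following is a minimal sketch, not an official ISACA model: the names `Phase`, `next_phase`, and `comes_before` are hypothetical, and looping from post-incident review back to preparation is an assumption based on the "update procedures and controls" step.

```python
from enum import IntEnum


class Phase(IntEnum):
    """The six phases, numbered in the order this guide lists them."""
    PREPARATION = 1
    DETECTION_AND_ANALYSIS = 2
    CONTAINMENT = 3
    ERADICATION = 4
    RECOVERY = 5
    POST_INCIDENT_REVIEW = 6


def next_phase(current: Phase) -> Phase:
    """Return the phase that follows `current`.

    Assumption: lessons learned feed back into preparation, so the
    lifecycle is treated as a loop rather than a straight line.
    """
    if current is Phase.POST_INCIDENT_REVIEW:
        return Phase.PREPARATION
    return Phase(current + 1)


def comes_before(a: Phase, b: Phase) -> bool:
    """True if phase `a` occurs earlier in the lifecycle than phase `b`."""
    return a < b
```

For example, `comes_before(Phase.CONTAINMENT, Phase.ERADICATION)` is true, which mirrors the rule that a threat is contained before it is removed.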

Exam Tip: Questions often ask what should be done FIRST. In most scenarios where an incident has already been confirmed, the answer is containment: stop the bleeding before investigating root cause. However, if the question emphasizes evidence preservation (e.g., potential legal action), evidence collection may take priority.

2. Incident Classification and Prioritization

Incidents must be classified to determine appropriate response:

By Severity:

Critical: Severe business impact, immediate response required
High: Significant impact, response within hours
Medium: Moderate impact, response within 1-2 days
Low: Minor impact, scheduled response

By Type:

Unauthorized access or intrusion
Denial of service
Malware infection
Data breach or exposure
Insider threat
Physical security breach

Classification drives response priorities, escalation paths, notification requirements, and resource allocation.
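
One way to picture how classification drives prioritization is a small triage queue. This is an illustrative sketch only: the `Incident` fields and the `INC-…` style identifiers used in the usage note are hypothetical, and the response targets are copied from the severity list above rather than from any standard.

```python
from dataclasses import dataclass
from enum import IntEnum


class Severity(IntEnum):
    # Lower number = more urgent, so an ascending sort puts critical first.
    CRITICAL = 1
    HIGH = 2
    MEDIUM = 3
    LOW = 4


# Response targets taken from the severity list in this guide.
RESPONSE_TARGET = {
    Severity.CRITICAL: "immediate",
    Severity.HIGH: "within hours",
    Severity.MEDIUM: "within 1-2 days",
    Severity.LOW: "scheduled",
}


@dataclass
class Incident:
    ident: str          # hypothetical ticket identifier
    incident_type: str  # e.g. "Malware infection"
    severity: Severity


def triage(incidents: list[Incident]) -> list[Incident]:
    """Order incidents so the most severe are handled first."""
    return sorted(incidents, key=lambda incident: incident.severity)
```

Calling `triage` on a medium malware infection (INC-3), a critical data breach (INC-1), and a high denial of service (INC-2) returns them in the order INC-1, INC-2, INC-3.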

3. Incident Response Team Structure

A typical incident response team includes:

Incident Response Manager: Coordinates overall response
Security Analysts: Technical investigation and analysis
IT Operations: System access and technical support
Legal Counsel: Legal implications and evidence handling
Public Relations: External communications
Human Resources: Employee-related incidents
Business Stakeholders: Business impact decisions
External Parties: Law enforcement, vendors (as needed)

Clear roles and responsibilities prevent confusion during high-stress incidents.

4. Evidence Collection and Forensics

Proper evidence handling is critical for legal proceedings:

Chain of Custody:

Document who collected, accessed, or transferred evidence
Record date, time, and purpose of each access
Maintain continuous accountability
Store evidence securely
Use tamper-evident containers

Order of Volatility:

Collect the most volatile evidence first:

CPU registers, cache
RAM contents
Network connections, running processes
Temporary file systems
Disk storage
Remote logging and monitoring data
Physical configuration, network topology
Archival media
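
As a memory aid, the ordering above can be expressed as a lookup table and a sort. This is a sketch under the assumption that the list positions in this guide are used directly as ranks; `VOLATILITY_RANK` and `collection_order` are hypothetical names.

```python
# Ranks follow the list in this guide: 1 is most volatile and is collected first.
VOLATILITY_RANK = {
    "cpu registers, cache": 1,
    "ram contents": 2,
    "network connections, running processes": 3,
    "temporary file systems": 4,
    "disk storage": 5,
    "remote logging and monitoring data": 6,
    "physical configuration, network topology": 7,
    "archival media": 8,
}


def collection_order(sources: list[str]) -> list[str]:
    """Sort evidence sources so the most volatile comes first.

    Raises KeyError for a source that is not in the table above.
    """
    return sorted(sources, key=lambda name: VOLATILITY_RANK[name.lower()])
```

For instance, given disk storage, RAM contents, and archival media, the function returns RAM contents first and archival media last.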

⚠️ Critical Rule: Never work on original evidence. Always create forensic copies and work from those. Any analysis on original media can compromise legal admissibility.
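
A common way to show that a working copy is faithful to the original is to compare cryptographic hashes. The sketch below uses Python's standard `hashlib`; it assumes the original's SHA-256 hash was recorded at acquisition time, so later checks only need to read the copy. The function names are hypothetical.

```python
import hashlib


def sha256_of(path: str, chunk_size: int = 1024 * 1024) -> str:
    """Hash a file in chunks so large disk images do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_matches_original(original_hash: str, copy_path: str) -> bool:
    """Compare a working copy against the hash recorded when evidence was acquired."""
    return sha256_of(copy_path) == original_hash
```

If the hashes differ, the copy should not be used for analysis, and the mismatch itself should be documented in the chain of custody.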

5. Communication During Incidents

Effective communication is essential throughout incident response:

Internal Communications:

Incident response team coordination
Management notifications and updates
Affected business unit communications
Employee awareness and instructions

External Communications:

Law enforcement (when appropriate)
Regulatory bodies (as required)
Customers and partners (breach notifications)
Media and public (through designated spokesperson)
Vendors and service providers

Communication Best Practices:

Designate single point of contact for media
Prepare pre-approved messaging templates
Be transparent but avoid speculation
Meet legal notification deadlines
Document all communications

6. Business Continuity Planning (BCP)

BCP ensures organizational resilience during disruptions:

Business Impact Analysis (BIA):

Identify critical business functions
Determine Recovery Time Objectives (RTO)
Determine Recovery Point Objectives (RPO)
Assess financial and operational impact over time
Identify dependencies and single points of failure

Key Metrics:

RTO (Recovery Time Objective): Maximum acceptable downtime
RPO (Recovery Point Objective): Maximum acceptable data loss
MTD (Maximum Tolerable Downtime): Time before organization suffers irreparable harm
WRT (Work Recovery Time): Time to verify systems and resume operations
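
These metrics are commonly related by the rule that recovery time plus work recovery time must fit inside the maximum tolerable downtime (often written MTD = RTO + WRT). The sketch below checks that relationship; the function names are hypothetical, and the 8-hour and 2-hour figures in the usage note are made-up illustration values.

```python
from datetime import timedelta


def fits_within_mtd(rto: timedelta, wrt: timedelta, mtd: timedelta) -> bool:
    """True if restoring systems (RTO) and verifying them (WRT) fits inside the MTD."""
    return rto + wrt <= mtd


def max_rto(mtd: timedelta, wrt: timedelta) -> timedelta:
    """Largest RTO that still leaves room for the WRT inside the MTD."""
    if wrt > mtd:
        raise ValueError("WRT alone already exceeds the MTD")
    return mtd - wrt
```

With an MTD of 8 hours and a WRT of 2 hours, `max_rto` returns 6 hours: systems must be restored within 6 hours to leave time for verification.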

BCP Components:

Emergency response procedures
Crisis management team structure
Business recovery strategies
Alternative processing sites
Supply chain continuity
Communication plans

7. Disaster Recovery Planning (DRP)

DRP focuses specifically on restoring IT systems and operations:

Recovery Strategies:

Hot Site: Fully operational backup facility, immediate failover (highest cost, lowest RTO)
Warm Site: Partially equipped facility, requires some setup (moderate cost and RTO)
Cold Site: Empty facility with power and connectivity, requires full setup (lowest cost, highest RTO)
Mobile Site: Portable recovery facility
Cloud-Based: Virtual recovery in cloud infrastructure

Backup Strategies:

Full Backup: Complete copy of all data (slow, comprehensive)
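Incremental: Only data changed since the last backup of any type (fast backup, slow restore because every increment since the last full backup must be applied)
Differential: Data changed since the last full backup (moderate speed both ways; restore needs only the full backup plus the latest differential)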
Mirror: Real-time replication
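
The restore-speed difference between incremental and differential backups comes from how many backup sets are needed. The sketch below walks backward from the newest backup to find that set, assuming the definitions in this guide (incremental = changes since the previous backup of any type, differential = changes since the last full). Real backup products may define these differently, and `restore_chain` is a hypothetical name.

```python
def restore_chain(backups: list[str]) -> list[int]:
    """Return the indexes of the backups needed to restore the newest state.

    `backups` is in chronological order; each entry is "full",
    "incremental", or "differential". The result is oldest first.
    """
    if not backups:
        raise ValueError("no backups available")
    needed = []
    i = len(backups) - 1
    while i >= 0:
        kind = backups[i]
        needed.append(i)
        if kind == "full":
            return needed[::-1]
        if kind == "differential":
            # A differential depends only on the most recent full backup.
            i -= 1
            while i >= 0 and backups[i] != "full":
                i -= 1
        elif kind == "incremental":
            # An incremental depends on the backup taken just before it.
            i -= 1
        else:
            raise ValueError(f"unknown backup type: {kind}")
    raise ValueError("no full backup found to anchor the restore")
```

A full backup followed by three incrementals needs all four sets to restore, while a full backup followed by three differentials needs only the full and the last differential.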

Testing Types:

Tabletop Exercise: Discussion-based walkthrough (least disruptive)
Walkthrough: Step-by-step review of procedures
Simulation: Practice with simulated disaster
Parallel Test: Run backup systems alongside production
Full Interruption: Actual failover to backup (most realistic, most disruptive)

Remember: BCP and DRP must be tested regularly (at least annually) and updated whenever significant business or technical changes occur. Untested plans provide false confidence and often fail when actually needed.

Common Exam Scenarios

Scenario 1: First Response

"A ransomware attack is detected. What should the incident response team do FIRST?"

Answer: Contain the incident by isolating affected systems. Prevent spread before investigating cause or planning recovery. However, if the question emphasizes potential legal action or forensics, evidence preservation might take priority.

Scenario 2: RTO vs RPO

"An organization can tolerate 4 hours of downtime but no more than 1 hour of data loss. Which is correct?"

Answer: RTO = 4 hours, RPO = 1 hour. RTO is about downtime; RPO is about data loss.
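
The scenario can be turned into a quick check of a recovery design. This sketch assumes the worst-case data loss equals the interval between backups (a failure just before the next backup runs); the function name and the estimated figures in the usage note are hypothetical.

```python
from datetime import timedelta

# Objectives from the scenario: 4 hours of downtime, 1 hour of data loss.
RTO = timedelta(hours=4)
RPO = timedelta(hours=1)


def meets_objectives(estimated_recovery: timedelta,
                     backup_interval: timedelta,
                     rto: timedelta = RTO,
                     rpo: timedelta = RPO) -> dict:
    """Check an estimated recovery time against the RTO and a backup interval against the RPO."""
    return {
        "rto_met": estimated_recovery <= rto,
        "rpo_met": backup_interval <= rpo,
    }
```

A design that recovers in 3 hours but backs up only every 6 hours meets the RTO and misses the RPO, because up to 6 hours of data could be lost.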

Scenario 3: BIA Priority

"What is the PRIMARY purpose of a Business Impact Analysis?"

Answer: Identify critical business functions and determine recovery priorities. The BIA drives all other BCP/DRP decisions by establishing what's most important to the organization.

Study Tips for Domain 4

Know the Response Phases: Understand the incident response lifecycle and what happens in each phase. Questions test sequence and priorities.
Master RTO and RPO: These concepts appear frequently. Remember: RTO = time to recover, RPO = data loss tolerance.
Understand Priorities: In most scenarios, containment comes before root cause analysis. Business continuity trumps forensics unless specifically asked about legal proceedings.
Know Testing Methods: Understand different DR testing approaches and when each is appropriate.
Focus on Management Perspective: Questions emphasize coordination, communication, and decision-making rather than technical forensics details.

Quick Reference: Key Terms

Incident: Unplanned event that threatens security or disrupts operations
RTO: Maximum acceptable downtime
RPO: Maximum acceptable data loss
BCP: Overall organizational resilience plan
DRP: IT-specific recovery plan
Hot Site: Fully operational backup facility
Cold Site: Empty facility requiring full setup
Chain of Custody: Evidence accountability documentation
BIA: Analysis identifying critical functions and recovery priorities
