Comprehensive Research Report: Critical Infrastructure Tabletop Exercises and Idaho National Lab’s ICS Methodologies
Executive briefing: Government research meets operational reality
Idaho National Laboratory has pioneered the Consequence-Driven Cyber-Informed Engineering (CCE) methodology that fundamentally transforms how critical infrastructure protects against cyber threats—starting from the assumption that determined adversaries WILL penetrate networks. This research compilation provides actionable intelligence on ICS/OT security for manufacturing, utilities, and converged operational technology environments, synthesizing government methodologies, real-world attack data, and proven tabletop exercise frameworks.
The stakes are quantifiable: manufacturing cyberattacks surged 105% in 2024, with average industrial breach costs reaching $5.56 million. Meanwhile, 70% of water utilities fail basic security standards, and sophisticated nation-state actors continue pre-positioning in U.S. critical infrastructure. Organizations deploying INL’s methodologies and structured tabletop exercises demonstrate measurably faster incident response, with some containing breaches in under 30 days—saving over $1 million compared to slower responders.
Idaho National Laboratory: The epicenter of ICS cybersecurity innovation
Idaho National Laboratory represents America’s premier institution for industrial control systems security, wielding over 20 years of expertise protecting the nation’s most critical infrastructure. The laboratory’s unique combination of technical prowess, world-class testing facilities, and collaborative government-industry partnerships positions it as the definitive authority on OT/ICS protection.
Consequence-Driven Cyber-Informed Engineering changes the paradigm
INL’s flagship CCE methodology represents a fundamental shift in critical infrastructure protection philosophy. Unlike traditional cybersecurity approaches that attempt to prevent all intrusions, CCE operates from a stark premise: if skilled adversaries target your critical infrastructure, they WILL penetrate your network. This assumption drives organizations to identify what truly matters—the critical functions whose compromise would cause catastrophic consequences—and engineer solutions that make those functions resilient even after network penetration.
The four-phase CCE process moves from consequences to mitigations. In Consequence Prioritization, organizations identify critical functions and the high-consequence events that would threaten them. System-of-Systems Analysis and Consequence-based Targeting then examine those scenarios from an adversary’s perspective. Finally, Mitigations and Protections develops engineered solutions that eliminate digital pathways to catastrophic outcomes and validates them through rigorous testing. Since its $300,000 seed funding in 2016, CCE has attracted over $40 million in federal investment and supported 35+ comprehensive engagements with top-tier utilities and defense installations.
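The first step described above, deciding which critical functions and high-consequence scenarios deserve attention, can be pictured as a weighted ranking. The sketch below is illustrative only: the criteria names, weights, and ratings are hypothetical and are not taken from INL's published CCE materials.

```python
# Hypothetical criteria and weights; a real engagement defines its own.
WEIGHTS = {
    "safety": 3.0,
    "outage_duration": 2.0,
    "area_affected": 1.5,
    "recovery_cost": 1.0,
}


def consequence_score(ratings):
    """Weighted sum of 0-5 severity ratings, one rating per criterion."""
    return sum(WEIGHTS[name] * ratings[name] for name in WEIGHTS)


def rank_events(events):
    """Return event names ordered from highest to lowest consequence score."""
    return sorted(events, key=lambda name: consequence_score(events[name]), reverse=True)


# Illustrative events with made-up ratings.
events = {
    "billing system outage": {
        "safety": 0, "outage_duration": 1, "area_affected": 1, "recovery_cost": 2,
    },
    "loss of breaker control": {
        "safety": 4, "outage_duration": 4, "area_affected": 5, "recovery_cost": 3,
    },
}
ranked = rank_events(events)
```

Ranking by consequence rather than by likelihood reflects the premise that a capable adversary will eventually get in, so the engineering effort goes to the events that would hurt most.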
Florida Power and Light’s pioneering CCE engagement demonstrated the methodology’s power, identifying specific engineering solutions that could stop digital attacks on essential power generation functions. A recent collaboration with a major natural gas company examined how disruptions ripple through electricity production, revealing critical dependencies that traditional security assessments miss.
INL’s arsenal of tools and training transforms industry capabilities
Malcolm, developed in partnership with CISA, exemplifies INL’s commitment to accessible, powerful security tools. This open-source network traffic analysis platform integrates 12+ specialized tools including Zeek, Arkime, and Suricata into a browser-based interface that analyzes high-volume network traffic to detect malicious activity. The German Federal Ministry of Interior is incorporating Malcolm into national cybersecurity initiatives, while the tool’s creator Seth Grover received the “Best Contribution to the Zeek Community” award.
The Cyber Security Evaluation Tool (CSET) enables organizations to conduct systematic evaluations of control system and IT security practices against industry standards. This free, desktop or web-based software guides asset owners through step-by-step assessments, documenting facility-specific hardware and software, comparing configurations to relevant security standards, and providing actionable recommendations. Organizations across all 16 critical infrastructure sectors use CSET for baseline assessments.
INL’s training programs have reached all 50 states and 110+ countries. The flagship 301 Course delivers four days of hands-on instruction at INL’s Idaho Falls facility, culminating in an intensive 7-hour Red Team versus Blue Team exercise where participants attack or defend real chemical batch mixing plants and electrical distribution SCADA systems. Over 200 sessions have trained thousands of cybersecurity professionals since 2007—all completely free to participants.
The innovative CyberStrike program brings Ukraine’s harsh lessons to American energy sector owners and operators. The LIGHTS OUT workshop simulates the 2015 and 2016 Ukraine power grid attacks, providing hands-on demonstrations of ICS-impacting cyber incidents. SHADOW VALVE adapts these scenarios for oil and natural gas sector operations. The advanced NEMESIS workshop examines tactics and techniques used by the most sophisticated adversary groups targeting ICS.
World-class facilities enable realistic testing and validation
The Cybercore Integration Center spans 80,000 square feet of state-of-the-art research space, housing 21 flexible electronics laboratories and configurable demonstration spaces. This joint investment by INL and the State of Idaho opened in fall 2019, enabling collaboration across federal agencies, private industry, and university partners. The facility’s Controls Environment Laboratory Resource (CELR) program hosts multiple concurrent simulations of real-world kinetic cyber-physical attacks across wastewater treatment, chemical processing, electrical distribution and transmission, and natural gas pipeline operations.
Organizations can remotely or on-site experience how threat actor tactics impact actual ICS/SCADA systems with functional kinetic outputs—watching cyber commands trigger physical consequences in real-time. This capability proves invaluable for red team testing, blue team training, vulnerability validation, and impact assessment without risking production systems.
INL maintains over 150,000 square feet dedicated to OT cybersecurity across its 890-square-mile desert site, including scalable test ranges and a specialized training facility with ICS cybersecurity-focused escape rooms. These immersive environments challenge teams to solve OT and IT problems collaboratively, testing teamwork, communication, and technical skills under pressure. The mobile “Insider Threat” escape room has been deployed at high-profile events including Critical Effect in Washington DC and DEF CON in Las Vegas.
Government partnerships amplify national security impact
INL’s partnership with CISA’s Industrial Control Systems Cyber Emergency Response Team (ICS-CERT) provides operational support for responding to and analyzing control systems cyber incidents across all 16 U.S. critical infrastructure sectors. Updated in July 2025, this collaboration ensures ICS-CERT security operations centers receive staffing and expertise from the nation’s premier ICS security institution.
The Department of Energy’s Cyber Testing for Resilient Industrial Control Systems (CyTRICS) program unifies the nation’s energy supply chain analysis through standardized cyber testing and evaluation. Six DOE National Laboratories collaborate with industry giants including Rockwell Automation, Westinghouse Electric Company, GE Gas Power, Hitachi ABB Power Grids, and Schneider Electric to identify high-priority OT components, perform expert testing, and share vulnerability information across the digital supply chain.
INL serves as Chief R&D Officer of the Cybersecurity Manufacturing Innovation Institute (CyManII), a role that reflects its commitment to making cybersecurity economically viable and pervasive in automation and supply chains. This Manufacturing USA initiative, led by the University of Texas at San Antonio, addresses the critical intersection of manufacturing competitiveness and cyber resilience.
The threat landscape: Sophisticated adversaries target operational technology
Manufacturing cyberattacks exploded 105% in 2024, with the sector accounting for 41% of all critical infrastructure attacks in the first half alone. Nearly 5,500 successful ransomware attacks struck manufacturing organizations globally, driven by a 71% surge in active threat actors. For the third consecutive year, manufacturing remains the most targeted critical infrastructure sector, with 22% of all attributed cyberattacks focusing on industrial production.
Colonial Pipeline reveals the cascading consequences of OT vulnerabilities
The May 2021 Colonial Pipeline ransomware attack demonstrated how IT compromises trigger operational shutdowns with national security implications. The DarkSide ransomware group gained access through a compromised VPN account password—a legacy, inactive account lacking multi-factor authentication, with credentials likely harvested from the dark web. Within hours of becoming aware of the May 7 breach, Colonial paid 75 bitcoin ($4.4 million) and proactively shut down its entire 5,500-mile pipeline system transporting nearly half of all fuel consumed on the East Coast.
The six-day shutdown created fuel shortages across 17 states and Washington D.C., with gas prices reaching the highest levels in over six years at $3.04 per gallon. Airport operations suffered disruptions at Charlotte Douglas and Hartsfield-Jackson Atlanta. The FBI eventually recovered about 63.7 of the 75 bitcoins (approximately $2.3 million at the time of seizure), but the slow decryption tool meant paying the ransom didn’t guarantee quick recovery.
Critical lessons emerged. Multi-factor authentication is non-negotiable for all remote access points. Network segmentation between IT and OT systems is essential: although the ransomware hit only IT networks, Colonial chose to shut down operations out of an abundance of caution. Incident response planning must include operational continuity procedures, and regular security audits of VPN accounts and access credentials help prevent credential-based compromises.
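The VPN account audit recommended here can be sketched in a few lines. This is a minimal illustration, assuming account records can be exported with an MFA flag and a last-login date; the `VpnAccount` fields, the 90-day idle threshold, and the sample accounts are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class VpnAccount:
    username: str
    mfa_enabled: bool
    last_login: date  # date of the last successful login


def audit_vpn_accounts(accounts, today, max_idle_days=90):
    """Return (username, reasons) pairs for accounts that need review."""
    findings = []
    for acct in accounts:
        reasons = []
        if not acct.mfa_enabled:
            reasons.append("no MFA")
        if today - acct.last_login > timedelta(days=max_idle_days):
            reasons.append("inactive")
        if reasons:
            findings.append((acct.username, reasons))
    return findings


# Made-up accounts for illustration.
accounts = [
    VpnAccount("ops_engineer", True, date(2021, 4, 30)),
    VpnAccount("legacy_contractor", False, date(2020, 9, 1)),
]
findings = audit_vpn_accounts(accounts, today=date(2021, 5, 1))
```

An account that trips both checks, inactive and without MFA, matches the profile of the credential abused in the Colonial Pipeline intrusion and is the first candidate for disabling.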
The policy response was swift and comprehensive. TSA issued Security Directives for pipeline operators, President Biden signed Executive Order 14028 on Improving the Nation’s Cybersecurity, and Congress passed the Cyber Incident Reporting for Critical Infrastructure Act (CIRCIA) in 2022. CISA expanded capabilities including the CyberSentry program for enhanced monitoring.
Ukraine attacks prove nation-states can cause kinetic damage through cyber means
Russia’s attacks on Ukraine’s power grid in 2015 and 2016 represent watershed moments in ICS cybersecurity—the first confirmed successful cyberattacks causing widespread power outages. The December 23, 2015 attack by Russia’s Sandworm Team demonstrated sophisticated multi-stage operations with eight months of reconnaissance using BlackEnergy3 malware, spear-phishing emails with malicious Microsoft Office attachments, and legitimate credential theft through VPN access.
Within a synchronized 30-minute window, attackers manually controlled breakers at three regional electric power distribution companies, switching off 30 substations at Prykarpattyaoblenergo alone. Approximately 225,000-230,000 customers lost power for 1-6 hours. The attackers deployed KillDisk malware to erase files and corrupt master boot records, overwriting Windows HMIs embedded in remote terminal units, corrupting serial-to-Ethernet device firmware at substations, and scheduling UPS systems for disconnect—all actions designed to interfere with restoration efforts.
The December 17, 2016 attack introduced Industroyer/Crashoverride, malware significantly more sophisticated than BlackEnergy3. Purpose-built for power grid disruption, Industroyer possessed native knowledge of industrial communication protocols including IEC 60870-5-101, IEC 60870-5-104, IEC 61850, and OPC DA. Its modular architecture enabled direct control of industrial equipment without relying on operator software, with greater stealth capabilities than any ICS malware since Stuxnet.
Russia attempted a third attack in April 2022 using an updated Industroyer variant targeting nine substations at a regional Ukrainian utility. Ukrainian authorities quickly detected and neutralized this assault, demonstrating that prepared defenders using threat intelligence can counter even sophisticated nation-state attacks—a lesson INL’s LIGHTS OUT workshop brings to American energy sector operators.
Manufacturing sector under siege from ransomware ecosystems
The manufacturing sector experienced brutal attacks throughout 2024-2025, with financial impacts reaching crisis levels. Average industrial sector data breach costs hit $5.56 million, representing an 18% increase from 2023. Average ransom payments in manufacturing reached $1,403,876—significantly higher than the $910,335 average in healthcare. Organizations require 199 days to identify breaches and 73 additional days to contain them, with total incident lifecycles of 272 days creating prolonged vulnerability windows.
RansomHub emerged as the most active ransomware group in 2024 after launching in February, claiming 78 manufacturing victims. Akira ransomware demonstrated a 665% volume increase through May 2024. LockBit 3.0 continued operations despite law enforcement actions. The Ransomware-as-a-Service (RaaS) ecosystem enables affiliates to conduct attacks using professionally developed malware, with Initial Access Brokers selling network access and double/triple extortion tactics (encrypt, steal data, threaten to leak) becoming standard.
High-profile 2024-2025 attacks illustrate the scope:
- Unimicron (January 2025): Global PCB production leader disrupted by ransomware affecting manufacturing sites
- Halliburton (August 2024): RansomHub attack shut down critical systems, disrupting operations
- Schneider Electric (January 2024): Cactus ransomware group stole 1.5TB of data
- Masimo (2025): Medical device manufacturer’s manufacturing, shipping, and support services disrupted with encrypted files
- National Presto Industries (March 2025): InterLock ransomware targeted defense contractor subsidiary
Nation-state Advanced Persistent Threat groups intensified OT targeting. Volt Typhoon (China) infiltrated U.S. energy, water, transportation, and communications systems, demonstrating “Living off the Land” techniques using legitimate tools while maintaining persistence for 5+ years undetected in some cases. APT28 (Russia) continues targeting OT/ICS environments with proven capability to cause physical damage. Cyber Av3ngers (Iran IRGC-affiliated) compromised 29 U.S. facilities using Unitronics PLCs in November 2023, targeting water/wastewater systems and defacing HMI screens.
Utilities face escalating threats to public safety
Utility sector attacks increased 70% in 2024, with 1,162 documented cyberattacks on power systems alone. The EPA’s 2024 assessment revealed shocking vulnerabilities: 97 water systems serving 26.6 million people have critical or high-risk vulnerabilities, with over 70% of systems failing basic security standards including unchanged default passwords and former employees retaining system access. A major attack on water infrastructure could cause $132 million per day in lost revenue beyond the incalculable public health impacts.
American Water (October 2024), the largest U.S. water utility serving 14 million people, suffered a cyberattack that forced its customer portal offline for one week with billing systems disrupted—though water operations continued safely. Texas water plants experienced attacks in January 2024 attributed to Russia-linked actors. Arkansas City, Kansas switched to manual operations in September 2024 without service disruption.
The power grid presents a massive attack surface with 200,000+ miles of transmission lines, thousands of generation plants, millions of digital controls, and 1,800+ entities owning or operating portions. 60 new vulnerability points are added daily as modernization efforts expand connectivity. SCADA systems with legacy protocols lacking authentication, remote access via cellular modems, default unchanged passwords, and internet-accessible ICONICS servers create numerous entry points for adversaries.
OT/IT convergence: The security implications of digital transformation
The digital transformation of industrial environments eliminates the historical “air gap” that once provided natural security through physical isolation. OT/IT convergence integrates information technology (managing data and business operations) with operational technology (controlling physical devices and industrial processes), driven by compelling business imperatives: real-time data analytics, remote operations capabilities, Industrial IoT adoption, operational efficiency gains, and cost optimization.
The global IT/OT convergence market is forecasted to exceed $1 trillion in 2027 and approach $1.3 trillion by 2030. Connected IoT devices are expected to reach 41.6 billion globally by end of 2025, with industrial applications representing significant growth. However, this integration creates profound security challenges that organizations must address systematically.
The fundamental security paradigm clash
OT and IT security operate from fundamentally different priorities, creating tension when traditional IT security approaches are applied to industrial environments. IT security prioritizes the CIA triad as Confidentiality → Integrity → Availability, focusing primarily on protecting sensitive data from unauthorized access, with data breaches and intellectual property theft as chief concerns. Scheduled downtime for maintenance and updates is generally acceptable in IT environments.
OT security inverts these priorities to Availability → Integrity → Confidentiality. Continuous operation is paramount—unplanned downtime causes significant financial losses, production delays, safety risks, or threats to human life. System availability and process integrity directly impact physical safety and operational continuity. Scheduled maintenance windows are difficult to obtain and may require extensive planning.
These differing priorities extend across the entire security lifecycle:
System Lifecycles: IT systems refresh every 3-5 years with regular patching standard practice, running modern operating systems with current security patches, designed for connectivity and frequent changes, with computational resources sufficient for modern security controls. OT systems span 15-30+ years with equipment designed before cybersecurity considerations, running outdated operating systems (Windows XP, Windows 7) no longer receiving security updates, using proprietary protocols with limited security features, and possessing limited computational resources that challenge implementing modern security controls.
Patching Approaches: IT environments enable automated, frequent patching (weekly or monthly) tested in staging environments before production deployment, with brief downtime acceptable for critical security updates and remote patching standard practice. OT environments require production shutdowns to patch PLCs, SCADA servers, or control systems, making patches operationally and financially costly due to downtime requirements. Extensive testing is required to ensure patches don’t disrupt physical processes or introduce cascading failures. Many vulnerabilities remain unpatched indefinitely—893 ICS vulnerabilities were disclosed in 2020 alone, representing a 24.72% increase from 2019.
Risk Tolerance: IT security risks lead to reputational damage, financial losses, and regulatory fines, with recovery through backups and disaster recovery systems and impact primarily on data and business operations. OT security risks cause immediate physical damage, equipment destruction, and environmental disasters, with safety implications including injuries or loss of life (47% of OT security incidents threaten human safety), production halts costing millions per hour of downtime, cascading failures across critical infrastructure, and difficult, slow recovery processes.
Network segmentation challenges in converged environments
The traditional Purdue Model for ICS security defines hierarchical levels from enterprise IT (Levels 4-5) through site operations (Level 3) to supervisory control, basic control, and the physical process (Levels 2, 1, and 0). However, IT/OT convergence complicates this clear separation as business requirements drive greater integration, making strict segmentation difficult.
Many organizations lack proper separation between IT and OT networks, with dual-homed devices connected to both networks creating pathways for attackers. Flat OT network architectures without internal segmentation allow unrestricted lateral movement once attackers gain initial access. Misconfigurations in VLANs, firewalls, or DMZs create security gaps. According to SANS Institute, lateral movement from compromised enterprise IT networks represents the most common attack vector into ICS environments.
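Dual-homed devices can often be found from an asset inventory alone. The sketch below assumes an inventory that maps each hostname to its interface addresses; the subnets and hostnames are hypothetical placeholders.

```python
import ipaddress

# Hypothetical address plan; substitute the site's real IT and OT ranges.
IT_NETS = [ipaddress.ip_network("10.10.0.0/16")]
OT_NETS = [ipaddress.ip_network("192.168.50.0/24")]


def in_any(addr, nets):
    """True if the address string falls inside any of the given networks."""
    ip = ipaddress.ip_address(addr)
    return any(ip in net for net in nets)


def find_dual_homed(inventory):
    """Return sorted hostnames that have interfaces on both IT and OT networks."""
    flagged = []
    for host, addrs in inventory.items():
        on_it = any(in_any(a, IT_NETS) for a in addrs)
        on_ot = any(in_any(a, OT_NETS) for a in addrs)
        if on_it and on_ot:
            flagged.append(host)
    return sorted(flagged)


# Made-up inventory for illustration.
inventory = {
    "eng-workstation-3": ["10.10.4.21", "192.168.50.14"],
    "hmi-line-1": ["192.168.50.30"],
    "mail-gw": ["10.10.0.5"],
}
dual_homed = find_dual_homed(inventory)
```

Each flagged host is a potential bridge around the firewall or DMZ and merits either removal of one interface or explicit documentation as a managed exception.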
Best practices for segmentation include implementing zones (groupings of assets with similar security requirements) and conduits (controlled communication pathways between zones) as defined in IEC 62443. Organizations should deploy Demilitarized Zones (DMZ) between IT and OT networks for controlled data exchange, implement Next-Generation Firewalls with deep packet inspection capabilities, apply microsegmentation to limit east-west traffic within OT networks, and use unidirectional gateways (data diodes) for high-security applications that prevent network-based attacks with 99.9% effectiveness.
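The zone and conduit concept can be expressed as a simple default-deny policy check. The following is a sketch under stated assumptions, not an IEC 62443 reference implementation: the zone names, asset names, and permitted conduits are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Conduit:
    src_zone: str
    dst_zone: str
    port: int


# Hypothetical asset-to-zone assignment.
ZONE_OF = {
    "erp01": "enterprise",
    "hist-dmz": "dmz",
    "scada01": "control",
    "plc07": "control",
}

# Only these inter-zone paths are permitted; everything else is denied.
CONDUITS = {
    Conduit("enterprise", "dmz", 443),
    Conduit("control", "dmz", 443),
}


def is_allowed(src, dst, port):
    """Allow intra-zone traffic and traffic that matches a defined conduit."""
    src_zone, dst_zone = ZONE_OF[src], ZONE_OF[dst]
    if src_zone == dst_zone:
        return True  # microsegmentation would tighten this case further
    return Conduit(src_zone, dst_zone, port) in CONDUITS


enterprise_to_dmz = is_allowed("erp01", "hist-dmz", 443)
enterprise_to_plc = is_allowed("erp01", "plc07", 502)
```

In this arrangement the enterprise and control zones each reach the DMZ, but no conduit connects them directly, so a request from `erp01` to `plc07` is rejected.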
The era of Big Data and operational efficiency requirements has rendered pure air-gapping impractical despite its historical effectiveness. Remote monitoring and maintenance requirements, real-time data analytics needs, cloud-based applications and services, mobile workforce requirements, and third-party vendor access needs have systematically eliminated physical isolation.
Remote access emerges as critical vulnerability vector
Remote access has become one of the most common attack vectors in converged environments, as demonstrated by the Oldsmar Water Treatment Facility attack in 2021. Attackers gained unauthorized access through TeamViewer remote access software and attempted to increase sodium hydroxide levels from 100 ppm to 11,100 ppm—a potentially lethal concentration that an alert operator noticed and reversed, preventing public harm.
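A change like the one attempted at Oldsmar is the kind of request that an engineered setpoint check can reject regardless of who issued it. The sketch below is illustrative; the limit values are hypothetical and are not treatment-plant engineering guidance.

```python
from dataclasses import dataclass


@dataclass
class SetpointLimits:
    low: float       # lowest acceptable setpoint
    high: float      # highest acceptable setpoint
    max_step: float  # largest change allowed in a single request


def validate_setpoint(current, requested, limits):
    """Return a list of reasons to reject the request; empty means acceptable."""
    problems = []
    if not (limits.low <= requested <= limits.high):
        problems.append("outside absolute limits")
    if abs(requested - current) > limits.max_step:
        problems.append("step change too large")
    return problems


# Hypothetical limits for a sodium hydroxide dosing setpoint, in ppm.
NAOH_LIMITS = SetpointLimits(low=0.0, high=200.0, max_step=25.0)

attack_request = validate_setpoint(100.0, 11100.0, NAOH_LIMITS)
routine_request = validate_setpoint(100.0, 110.0, NAOH_LIMITS)
```

Enforcing such limits in the controller or in a hard-wired interlock, rather than only in the HMI, keeps the protection in place even when the remote session itself is compromised.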
Common remote access issues include weak or default credentials (exemplified by the Unitronics incident), lack of multi-factor authentication, unsecured remote access software, shared passwords across multiple employees, inadequate monitoring and logging of remote sessions, direct access to control functions without intermediate security layers, and third-party vendor access without proper oversight.
Secure remote access best practices mandate enforcing multi-factor authentication for all remote access, using industrial-grade secure remote access solutions designed specifically for OT environments, implementing jump servers or secure access workstations, applying principle of least privilege, monitoring and logging all remote sessions comprehensively, implementing time-limited access with automatic session termination, conducting regular audits of remote access privileges, and separating vendor/contractor access from internal employee access.
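Several of these practices lend themselves to an automated check on each remote session. The sketch below assumes session records carrying the fields shown; the field names and the four-hour limit are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class RemoteSession:
    user: str
    mfa_verified: bool
    via_jump_server: bool
    started: datetime
    approved_assets: frozenset  # assets this user is approved to reach
    target_asset: str


MAX_SESSION = timedelta(hours=4)  # hypothetical time limit


def session_violations(session, now):
    """Return the list of policy violations for one remote session."""
    violations = []
    if not session.mfa_verified:
        violations.append("MFA missing")
    if not session.via_jump_server:
        violations.append("bypasses jump server")
    if session.target_asset not in session.approved_assets:
        violations.append("asset outside approved scope")
    if now - session.started > MAX_SESSION:
        violations.append("session exceeded time limit")
    return violations


# Made-up vendor session for illustration.
vendor = RemoteSession(
    user="vendor_tech",
    mfa_verified=True,
    via_jump_server=False,
    started=datetime(2024, 6, 1, 8, 0),
    approved_assets=frozenset({"plc07"}),
    target_asset="scada01",
)
vendor_violations = session_violations(vendor, now=datetime(2024, 6, 1, 13, 30))
```

A non-empty result could trigger automatic session termination, which implements the time-limited access and least-privilege items from the list above.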
OT environments face unique operational constraints
Real-time OT systems control physical processes operating with minimal tolerance for latency. Traditional IT security solutions generate traffic that OT systems cannot tolerate, with firewalls, intrusion detection systems, and endpoint security tools potentially introducing unacceptable latency. Active vulnerability scanning can disrupt or crash OT devices. Many OT protocols lack encryption because computational overhead would impact real-time performance.
Solutions include using passive network monitoring rather than active scanning, deploying OT-specific security solutions designed for real-time constraints, implementing out-of-band monitoring where possible, scheduling any active testing during planned maintenance windows, and using behavioral analytics and anomaly detection optimized for OT traffic patterns.
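Because OT traffic is highly repetitive, even a simple passive baseline is informative. The sketch below assumes flow records have already been extracted from a span port or network tap as (source, destination, protocol) tuples; the host names are hypothetical, and no packets are sent to the monitored devices.

```python
def learn_baseline(flows):
    """Record every (source, destination, protocol) tuple seen in a known-good period."""
    return frozenset(flows)


def unexpected_flows(baseline, flows):
    """Return flows absent from the baseline, in first-seen order without duplicates."""
    seen = set()
    result = []
    for flow in flows:
        if flow not in baseline and flow not in seen:
            seen.add(flow)
            result.append(flow)
    return result


# Made-up flow records for illustration.
baseline = learn_baseline([
    ("scada01", "plc07", "modbus"),
    ("scada01", "rtu12", "dnp3"),
    ("hist01", "scada01", "opc"),
])
today = [
    ("scada01", "plc07", "modbus"),
    ("eng-laptop", "plc07", "modbus"),
    ("eng-laptop", "plc07", "modbus"),
]
alerts = unexpected_flows(baseline, today)
```

Here the engineering laptop talking directly to a PLC is reported once, while the normal polling traffic produces no alert.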
Safety implications distinguish OT security from IT security. Control system manipulation can cause equipment malfunctions leading to explosions, fires, or toxic releases. Incorrect sensor readings can trigger emergency shutdowns or unsafe operating conditions. Compromised safety instrumented systems (SIS) may fail to prevent hazardous events. Water treatment manipulation could poison public water supplies. Power grid disruptions affect hospitals, emergency services, and vulnerable populations.
OT environments often operate 24/7/365 with minimal tolerance for downtime. Manufacturing lines cannot be easily stopped for security maintenance. Power plants must maintain continuous generation. Water treatment facilities require uninterrupted operation. The average OT security breach causes 23 days of operational disruption with successful breaches causing average damages of $5.9 million per incident.
Tabletop exercises: Stress-testing incident response before crisis strikes
Tabletop exercises represent the most cost-effective method for validating incident response capabilities, building cross-functional collaboration, and identifying critical gaps before real attacks occur. Organizations containing security breaches within 30 days save over $1 million compared to those taking longer—making the modest investment in tabletop exercises deliver exceptional ROI.
SANS Institute’s top five ICS incident response scenarios
The SANS Institute, through analysis of all known ICS cyberattacks, developed five critical scenarios every industrial organization should rehearse:
Scenario 1: Living off the Land - Native ICS Protocol Abuse tests how organizations detect and respond when adversaries abuse legitimate ICS protocols (OPC, IEC104, Modbus/TCP, DNP3, ICCP) rather than introducing obvious malware. Engineering teams observe unusual communication patterns with abnormal scanning rates from SCADA servers to outstations, mimicking the CRASHOVERRIDE malware framework. Discussion focuses on ICS protocol baselining and Network Security Monitoring visibility at Purdue Levels 0-3, requiring deep packet inspection and protocol dissectors that differ fundamentally from IT monitoring tools.
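The abnormal scanning rates in this scenario can be detected with a basic statistical baseline. The sketch below assumes per-minute counts of protocol requests from a SCADA server to its outstations are already available; the sample numbers and the three-standard-deviation threshold are hypothetical choices.

```python
from statistics import mean, stdev


def rate_threshold(baseline_counts, k=3.0):
    """Upper bound on normal request rate: mean plus k sample standard deviations."""
    return mean(baseline_counts) + k * stdev(baseline_counts)


def flag_intervals(counts, threshold):
    """Return the indexes of intervals whose request count exceeds the threshold."""
    return [i for i, count in enumerate(counts) if count > threshold]


# Made-up requests-per-minute figures from a known-good period.
baseline_counts = [58, 60, 61, 59, 60, 62, 60, 59]
threshold = rate_threshold(baseline_counts)

# Made-up observations during the exercise inject.
observed = [60, 61, 240, 59]
suspicious = flag_intervals(observed, threshold)
```

Rate alone does not prove malicious intent, so an alert like this is best treated as a prompt to inspect which function codes and outstation addresses were involved.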
Scenario 2: Human Machine Interface Hijack simulates HMI operators noticing unauthorized on-screen mouse movement and clicking of control buttons inconsistent with normal operations. This tests remote access account management, multi-factor authentication implementation, and network architecture adherence to the Purdue Model. Response requires disabling remote access, running ICS from plant floor embedded HMIs, and enabling islanding from Internet/IT networks—procedures that must be documented and rehearsed.
Scenario 3: Physical Access to Cyber Access Event integrates physical and cybersecurity response when physical security notices perimeter breaches at remote facilities and attackers pivot to cyber attacks by introducing malware into control networks. This scenario requires coordination between physical security, engineering, cybersecurity, and safety teams—groups that rarely train together. Response includes deploying trucks to remote sites, law enforcement involvement, and addressing threats to both workers and adversaries in dangerous industrial environments.
Scenario 4: Ransomware on IT or ICS/OT Networks remains the most likely scenario organizations will face. ICS operator workstations become infected with ransomware, rendering systems inoperable for viewing or controlling industrial processes. Discussion centers on dependencies between ICS and IT, network segmentation capabilities, and whether organizations can run ICS in manual mode from embedded HMIs. Protection includes email security, application whitelisting on ICS endpoints, and proper IT-ICS network segmentation with containment through network isolation.
Scenario 5: IT or ICS Network Pivot through Trusted Connections has adversaries use Data Historians as pivot points from IT to ICS networks, relying on compromised IT Active Directory credentials for access. SANS identifies this as the most common real-world attack pathway. Protection requires network segmentation, separate Active Directory domains for IT and ICS with no trust relationship between them, access control technologies, and regular log monitoring of Data Historians. Response includes limiting connectivity, investigating exfiltration patterns, and examining Command & Control communications.
BONUS Scenario 6: Contaminated Transient Device addresses infected USB devices or laptops plugged into Safety Instrumented System controllers during routine maintenance. Protection includes Network Access Control, ICS plant floor kiosks/scanners, isolated interrogation segments, and device scanning prior to connection—procedures that must balance security with operational needs.
CISA Tabletop Exercise Packages provide comprehensive frameworks
CISA offers over 100 customizable CISA Tabletop Exercise Packages (CTEPs) covering industrial control systems compromise, ransomware, insider threats, and phishing, with sector-specific scenarios for elections infrastructure, local governments, maritime ports, water systems, and healthcare facilities. Each package includes template exercise objectives, scenario descriptions with discussion questions, exercise planning team organization guidance, comprehensive Situation Manuals (SITMANs), facilitator and evaluator handbooks, exercise brief slide deck templates, participant feedback forms, after-action report templates, and invitation letter templates.
Organizations can request these free resources by contacting cisa.exercises@mail.cisa.dhs.gov. The packages follow the Homeland Security Exercise and Evaluation Program (HSEEP) methodology, providing standardized exercise design, development, evaluation, and improvement planning that serves as the national standard for preparedness exercises.
Best practices for conducting effective ICS tabletop exercises
Planning requires 2-5 days minimum, extending to 30 days for complex exercises. Key activities include selecting realistic scenarios based on threat intelligence and sector events, identifying appropriate teams and participants, and including as many team players and observers as practical to maximize organizational benefit.
Goal setting should specify whether exercises aim to test newly deployed technology, train new team members, validate or update ICS Incident Response Plans, meet compliance requirements (such as NERC CIP-008-6, which requires testing the incident response plan at least once every 15 calendar months), or adhere to safety requirements. Best practice recommends annual exercises aligned with budget cycles, with additional exercises after major system changes.
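A compliance-driven schedule reduces to date arithmetic. The sketch below computes the latest date for the next exercise under a 15-month interval; counting in calendar months and clamping to the last day of shorter months are assumptions of this example, and the function names are hypothetical.

```python
import calendar
from datetime import date


def add_months(d, months):
    """Add calendar months to a date, clamping the day to the target month's length."""
    total = d.year * 12 + (d.month - 1) + months
    year, month_index = divmod(total, 12)
    month = month_index + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def next_exercise_due(last_exercise, interval_months=15):
    """Latest date by which the next exercise should be held."""
    return add_months(last_exercise, interval_months)


due_a = next_exercise_due(date(2024, 3, 15))
due_b = next_exercise_due(date(2024, 11, 30))
```

An organization following the annual cadence would pass `interval_months=12` and treat the 15-month date as the outer compliance limit rather than the target.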
Essential participants include:
- CSIRT Team Manager with overall responsibility and authority
- Process/Control System Engineers providing subject matter expertise on architecture, operations, and impacts
- Network Administrators managing access control, patching, intrusion detection, and log analysis
- System Administrators for control system and IT administration
- Plant Manager with authority for operational interruption decisions, risk assessment, and funding
- IT Director/CIO/Chief Engineer coordinating resources and allocation
- Security Experts bringing cybersecurity expertise and forensics capability
- Legal Experts addressing compliance, evidence collection, and liability
- Public Relations managing communications and media
- Human Resources handling internal incident investigations
- Vendor Support Engineers providing technical expertise for specific equipment
- Safety Personnel integrating emergency response and safety systems
- Physical Security coordinating facility security integration
Typical exercises run 2-3 hours to 1-2 days, with extended hybrid or live exercises spanning several days. Structure includes introduction and threat landscape briefings, scenario presentation with progressive injects, discussion of response actions, evaluation against existing plans, and documentation of gaps and action items.
Critical elements include a no-fault environment that encourages honest participation, non-attributional reporting, a focus on team learning rather than individual assessment, and realistic scenarios tailored to the actual environment, using actual equipment names, systems, and personnel roles.
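The inject-driven structure described above can be sketched as a small data model. This is a minimal illustration, not part of any CISA or INL package: the `Inject` class, the `due_injects` helper, and all scenario text are hypothetical names and content chosen for the example.

```python
from dataclasses import dataclass, field
from datetime import timedelta


@dataclass(order=True)
class Inject:
    """One scenario development released to participants at a set offset."""
    offset: timedelta                                 # time since exercise start
    description: str = field(compare=False)
    discussion_question: str = field(compare=False)


def due_injects(injects, elapsed):
    """Return injects whose release time has passed, in release order."""
    return [i for i in sorted(injects) if i.offset <= elapsed]


# Hypothetical scenario content, for illustration only.
scenario = [
    Inject(timedelta(minutes=45), "Operators report HMI screens freezing.",
           "Who has authority to switch to manual operation?"),
    Inject(timedelta(minutes=0), "Help desk receives phishing reports.",
           "How is IT activity communicated to the OT team?"),
    Inject(timedelta(minutes=90), "Ransom note found on engineering workstation.",
           "What triggers notification of legal and public relations?"),
]

for inject in due_injects(scenario, timedelta(minutes=60)):
    minutes = int(inject.offset.total_seconds() // 60)
    print(f"T+{minutes} min: {inject.description}")
    print(f"  Discuss: {inject.discussion_question}")
```

Keeping injects as ordered data lets a facilitator rehearse pacing before the session and record which discussion question exposed each gap.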
Measuring tabletop exercise effectiveness and ROI
Quantitative metrics demonstrate clear value. Organizations containing security breaches within 30 days save over $1 million compared to those taking longer. The 2017 NotPetya attack cost Maersk $300 million in lost revenue, demonstrating catastrophic costs of unpreparedness. Colonial Pipeline’s 6-day shutdown in 2021 caused fuel shortages and economic disruption across the Eastern U.S. Organizations with rehearsed plans during the 2023 MOVEit vulnerability acted quickly while unprepared organizations were “left scrambling.”
Incident response time measured in minutes, hours, or days correlates directly with impact severity. Resource utilization efficiency shows how participants allocate staff, technologies, and tools during exercises, identifying overuse or misallocation that improves strategic resource management during real incidents. Compliance achievement tracks adherence to regulatory requirements and process compliance during exercises.
Qualitative evidence provides compelling support for tabletop exercises. Team-building and trust development among response personnel is “almost impossible to measure, but invaluable because it is the relationships built between responders in an emergency.” Cross-departmental relationships strengthen, and an exercise is sometimes the first time that responders who would coordinate during an actual incident meet one another.
Knowledge and capability improvements include clearer roles and responsibilities, asset awareness among responding parties, understanding of dependencies, and enhanced process familiarity. Healthcare studies showed statistically significant improvements in confidence levels two weeks post-exercise, with focus groups revealing three key categories: role clarity, adult learning, and organizational support. The approach proved cost-effective, context-specific, low in resource requirements, and transferable to other organizations.
A public health study of 179 officials validated a 37-item performance measurement tool reliably measuring five functional capabilities: leadership and management, mass casualty care, communication, disease control and prevention, and surveillance and epidemiology—demonstrating internal consistency and criterion-related validity.
DOE/INL programs demonstrate advanced exercise capabilities
Idaho National Laboratory’s 301 Course represents the gold standard for hands-on ICS cybersecurity training, conducting 200+ sessions since 2007 with participants from all 50 states and 110+ countries. The four-day program culminates in a 7-hour Red Team versus Blue Team exercise where participants attack or defend real chemical batch mixing plants and electrical distribution substation SCADA systems in a live environment.
The CyberStrike LIGHTS OUT workshop simulates the 2015 and 2016 Ukraine cyber incidents with hands-on demonstrations of cyberattacks impacting operational technology, tailored for electric sector owners and operators. SHADOW VALVE provides the oil and natural gas sector version with separation process control scenarios. NEMESIS offers advanced training examining tactics, techniques, and procedures used by the most sophisticated adversary groups targeting ICS.
The Liberty Eclipse Program provides a cyber-physical testbed with real electrons moving through components at scale, offering hands-on training for OT and cybersecurity experts using actual power systems with components commonly found at electric substations. The program has proven “tremendously successful in reducing risk to energy sector” and includes G7 international exercises for cross-border incident preparedness.
Regulatory frameworks and government programs strengthen critical infrastructure
The regulatory landscape for critical infrastructure combines sector-specific requirements with cross-sector frameworks, creating comprehensive but complex compliance obligations that organizations must navigate while maintaining operational effectiveness.
IEC 62443 provides the definitive OT security standard
IEC 62443 represents the most comprehensive international standard specifically designed for industrial automation and control systems (IACS) security. The multi-part standard addresses general concepts (Part 1), policies and procedures including Cybersecurity Management Systems and risk assessment (Part 2), system-level technical security requirements with network segmentation and security levels (Part 3), and product development requirements with secure development lifecycle for vendors (Part 4).
Core principles include risk-based approaches prioritizing security investments based on actual risk assessments, defense-in-depth with multiple layers of security controls protecting critical assets, and security levels (SL 1-4) providing graduated requirements based on threat sophistication. SL 1 protects against casual or coincidental violation. SL 2 addresses intentional violation using simple means. SL 3 counters sophisticated attacks with moderate resources. SL 4 defeats advanced attacks with extended resources.
The framework’s zones and conduits approach implements network segmentation using zones (groups of assets with similar security requirements) and conduits (controlled communication paths between zones). Lifecycle integration ensures security considerations throughout design, development, deployment, operation, and decommissioning phases.
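The zones-and-conduits model and the graduated security levels can be illustrated with a short sketch. It assumes a simplified reading of the standard: each zone carries a target and an achieved security level, and traffic between zones is allowed only over a declared conduit. The zone names, levels, and helper functions are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Zone:
    name: str
    target_sl: int    # security level the risk assessment calls for (1-4)
    achieved_sl: int  # security level current controls are judged to meet


# Hypothetical zones; names and levels are illustrative only.
ZONES = {
    "enterprise": Zone("enterprise", target_sl=1, achieved_sl=1),
    "dmz": Zone("dmz", target_sl=2, achieved_sl=2),
    "control": Zone("control", target_sl=3, achieved_sl=2),
}

# Conduits: the only permitted (source zone, destination zone) paths.
CONDUITS = {("enterprise", "dmz"), ("dmz", "control")}


def flow_allowed(src: str, dst: str) -> bool:
    """Traffic inside a zone is allowed; across zones it needs a conduit."""
    return src == dst or (src, dst) in CONDUITS


def sl_gaps(zones):
    """Map zone name to shortfall where achieved SL is below target SL."""
    return {z.name: z.target_sl - z.achieved_sl
            for z in zones.values() if z.achieved_sl < z.target_sl}


print(flow_allowed("enterprise", "control"))  # no direct conduit is declared
print(sl_gaps(ZONES))
```

In this toy layout the enterprise network can reach the control zone only by way of the DMZ, and the gap report points at the zone where investment would close the largest shortfall.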
NERC CIP mandates protection for bulk electric system
North American Electric Reliability Corporation Critical Infrastructure Protection (NERC CIP) standards establish mandatory cybersecurity requirements for the bulk electric system. CIP-008-6 specifically addresses Cyber Security Incident Response Plan Implementation and Testing, requiring documented response plans, testing at least once every 15 calendar months through tabletop exercises or operational exercises, and comprehensive documentation of lessons learned.
The standards span CIP-002 through CIP-014, addressing BES Cyber System security, with incident reporting and response requirements mandatory for entities owning or operating bulk power system assets. Compliance verification occurs through audits, with potential penalties for non-compliance reaching millions of dollars.
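The 15-month testing interval lends itself to a simple due-date calculation. The sketch below assumes that "15 months" means whole calendar months counted from the date of the last test, clamped to the end of shorter months. That clamping rule is an assumption made for illustration and not a compliance interpretation; the function names are hypothetical.

```python
import calendar
from datetime import date


def add_calendar_months(start: date, months: int) -> date:
    """Add whole calendar months, clamping to the last day of the month."""
    index = start.year * 12 + (start.month - 1) + months
    year, month_index = divmod(index, 12)
    month = month_index + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(start.day, last_day))


def next_test_due(last_test: date, interval_months: int = 15) -> date:
    """Latest date for the next plan test under the assumed interval."""
    return add_calendar_months(last_test, interval_months)


def is_overdue(last_test: date, today: date) -> bool:
    return today > next_test_due(last_test)


print(next_test_due(date(2024, 1, 31)))
```

Scheduling the annual exercise recommended earlier leaves roughly three months of slack against this deadline, which absorbs a postponed session without a lapse.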
NIST frameworks provide technical implementation guidance
NIST SP 800-82 Revision 3, Guide to Operational Technology (OT) Security, published in September 2023, provides comprehensive guidance on OT system topologies, typical threats to OT mission and business functions, vulnerabilities in OT environments, recommended security safeguards and countermeasures, risk management approaches, and tailored security control baselines for low, moderate, and high-impact systems. The OT overlay for SP 800-53r5 security controls provides ICS-specific applications, with alignment to the NIST Cybersecurity Framework.
The NIST Cybersecurity Framework, as of version 1.1, organizes security activities into five core functions that adapt well to OT environments: Identify (asset inventory, risk assessment, governance), Protect (access controls, security awareness, protective technology), Detect (continuous monitoring, anomaly detection, threat intelligence), Respond (response planning, communications, analysis, mitigation), and Recover (recovery planning, improvements, communications). Version 2.0, released in February 2024, adds a sixth function, Govern, which takes over the governance activities listed here under Identify.
CISA programs coordinate national critical infrastructure protection
CISA’s Industrial Control Systems Cyber Emergency Response Team (ICS-CERT) serves as the primary federal coordination center for ICS cybersecurity incidents. The team responds to incidents across all 16 critical infrastructure sectors, provides technical assistance to asset owners and operators, conducts vulnerability assessments, and disseminates threat intelligence through advisories and alerts.
The Joint Cyber Defense Collaborative for Industrial Control Systems (JCDC-ICS) brings together government agencies and private sector partners including Bechtel, Claroty, Dragos, GE, Honeywell, Nozomi Networks, Schneider Electric, Schweitzer Engineering Laboratories, Siemens, and Xylem. This collaborative enables proactive cyber defense planning, threat intelligence sharing, and coordinated response to emerging threats.
CISA’s Cybersecurity Performance Goals (CPGs) provide cross-sector voluntary practices considered critical for reducing risks to critical infrastructure. These actionable practices help organizations prioritize investments in cybersecurity controls, with specific guidance for OT environments recognizing unique operational constraints.
Sector-specific requirements create compliance complexity
Transportation Security Administration (TSA) Security Directives mandate cybersecurity requirements for pipeline operators following Colonial Pipeline, with requirements including incident reporting, cybersecurity coordinator designation, vulnerability assessment reviews, and implementation of specific security controls.
Environmental Protection Agency (EPA) guidance for the water sector under the Safe Drinking Water Act includes cybersecurity considerations in vulnerability assessments, though enforcement has been inconsistent. A 2024 EPA assessment, which found 97 water systems with critical vulnerabilities serving 26.6 million people, prompted renewed enforcement focus.
Department of Energy (DOE) through the Office of Cybersecurity, Energy Security, and Emergency Response (CESER) provides leadership for energy sector cybersecurity, including funding programs like the $39 million for distributed energy security, research initiatives through national laboratories, and partnership programs with utilities.
Building cyber resilience through military-grade methodologies
Organizations protecting critical infrastructure face sophisticated adversaries with nation-state capabilities, criminal ransomware ecosystems, and ideologically motivated hacktivists. Military-grade methodologies pioneered by Idaho National Laboratory and refined through government research provide the proven frameworks necessary to defend against these advanced threats while maintaining operational effectiveness.
The consequence-driven approach prioritizes what matters most
Traditional cybersecurity focuses on preventing network intrusions—an increasingly futile goal against determined adversaries with unlimited time and resources. INL’s CCE methodology accepts that network penetration is inevitable and focuses instead on engineering solutions that prevent catastrophic consequences even after adversaries compromise networks.
This paradigm shift delivers measurable value. Organizations implementing CCE identify essential critical functions whose compromise would cause unacceptable consequences—whether loss of life, environmental disasters, prolonged service disruptions, or cascading infrastructure failures. They then analyze these functions from an adversary’s perspective, identifying every digital pathway that could enable high-consequence events. Finally, they develop and validate engineering solutions that eliminate these pathways or make them resilient to cyber manipulation.
Florida Power and Light’s pioneering CCE engagement demonstrated this approach’s power, revealing specific engineering changes that could prevent digital attacks from disrupting essential power generation functions regardless of network compromise levels. The methodology has attracted over $40 million in federal investment since 2018 seed funding, conducting 35+ comprehensive engagements with top-tier utilities and defense installations.
Defense-in-depth requires layered security controls
No single security control provides sufficient protection for critical infrastructure. Organizations must implement multiple layers of defense so that if one layer fails, others continue providing protection. This defense-in-depth strategy combines technical controls, operational procedures, and organizational governance.
Technical controls include network segmentation with zones and conduits following IEC 62443, firewalls and intrusion detection systems optimized for OT protocols, multi-factor authentication for all remote access, endpoint protection with application whitelisting on static systems, encryption for data in transit and at rest where operationally feasible, unidirectional gateways (data diodes) for high-security applications providing 99.9% prevention rates, and continuous network monitoring with OT-specific capabilities.
Operational procedures encompass vulnerability management with risk-based prioritization rather than relying solely on CVSS scores, patch management balancing security with operational constraints, change management with formal processes for all OT system modifications, access management following principle of least privilege, incident response plans specific to OT environments prioritizing safety and operational continuity, and regular security assessments including penetration testing during maintenance windows.
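Risk-based prioritization, as opposed to ranking by CVSS score alone, can be sketched as follows. The scoring formula, the 1-5 consequence scale, and the exposure weights are assumptions invented for this example and do not come from any standard; a real program would calibrate them against its own consequence analysis.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    asset: str
    cvss: float        # base score, 0.0-10.0
    consequence: int   # 1 (minor) to 5 (safety impact or prolonged outage)
    reachable: bool    # exposed to a less-trusted network path


def priority(f: Finding) -> float:
    """Illustrative score: severity scaled by consequence and exposure."""
    exposure = 1.0 if f.reachable else 0.5
    return round(f.cvss * f.consequence * exposure, 2)


# Hypothetical findings, for illustration only.
findings = [
    Finding("office-printer", cvss=9.8, consequence=1, reachable=True),
    Finding("safety-plc", cvss=6.5, consequence=5, reachable=False),
    Finding("historian", cvss=7.5, consequence=3, reachable=True),
]

ranked = sorted(findings, key=priority, reverse=True)
for f in ranked:
    print(f"{f.asset}: CVSS {f.cvss}, priority {priority(f)}")
```

The finding with the highest CVSS score ranks last here because its compromise carries little operational consequence, which is the behavior a CVSS-only ranking cannot express.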
Organizational governance establishes cybersecurity policies and standards tailored to OT environments, risk management frameworks aligning with business objectives, security awareness training for both IT and OT personnel, vendor management programs with security requirements in contracts, executive leadership engagement with board-level oversight, and metrics and key performance indicators measuring security program effectiveness.
Zero Trust principles adapt to operational technology constraints
Zero Trust architecture based on “never trust, always verify” provides robust security for modern threat environments, though full implementation in OT requires careful consideration of operational constraints. Core Zero Trust principles adaptable to OT include comprehensive asset visibility across IT and OT with complete inventories of all devices, micro-segmentation limiting lateral movement between systems and zones, least privilege access controls granting only necessary permissions, continuous verification and monitoring of all activities and communications, and automated response capabilities where operationally safe.
Implementation must respect real-time operational requirements that may not tolerate authentication latency, legacy systems that cannot support modern authentication mechanisms, operational continuity needs that may require emergency access procedures, and safety systems that must function even during cybersecurity incidents. Organizations should pursue phased implementation starting with highest-risk areas, balancing security with operational usability throughout.
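A default-deny access decision with an emergency override shows how these principles and constraints might fit together. This is a sketch under stated assumptions: the grant table, the `Request` fields, and the rule that a break-glass request skips multi-factor verification but stays limited to already granted actions are all hypothetical design choices, not requirements from any framework.

```python
from dataclasses import dataclass

# Hypothetical grants: (role, asset) -> set of permitted actions.
GRANTS = {
    ("operator", "hmi-01"): {"view", "acknowledge_alarm"},
    ("engineer", "plc-07"): {"view", "download_logic"},
}


@dataclass
class Request:
    role: str
    asset: str
    action: str
    mfa_verified: bool
    break_glass: bool = False  # emergency override declared by the requester


def decide(req: Request, audit_log: list) -> bool:
    """Default deny; every decision is appended to the audit log."""
    permitted = req.action in GRANTS.get((req.role, req.asset), set())
    if permitted and req.break_glass:
        allowed, reason = True, "break-glass: MFA skipped, review required"
    elif permitted and req.mfa_verified:
        allowed, reason = True, "granted"
    else:
        allowed, reason = False, "denied"
    audit_log.append((req.role, req.asset, req.action, allowed, reason))
    return allowed


log = []
decide(Request("engineer", "plc-07", "download_logic", mfa_verified=False,
               break_glass=True), log)
print(log[-1])
```

Logging the override instead of blocking it keeps operators able to act during a safety event while preserving the record needed for review afterward.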
Cross-functional collaboration breaks down silos
One of the most effective security strategies is developing teams combining IT and OT expertise. Recommended team composition includes IT security professionals, OT engineers and operators, physical security personnel, risk management specialists, and business operations representatives. Benefits include bridging knowledge gaps between IT and OT domains, security strategies respecting both IT security requirements and OT operational constraints, improved communication and collaboration, and better incident response coordination.
Regular tabletop exercises bring these teams together, building trust and relationships before crises occur. The SANS Institute notes that team-building and trust development among response personnel is “almost impossible to measure, but invaluable because it is the relationships built between responders in an emergency.”
Continuous improvement requires measurement and adaptation
Organizations should establish metrics and key performance indicators demonstrating security program effectiveness and enabling continuous improvement. Recommended metrics include:
- Time to detect incidents (from breach to discovery)
- Time to contain and remediate incidents (from discovery to resolution)
- Number and severity of identified vulnerabilities (tracked over time)
- Patch management metrics (percentage of critical systems patched within target timeframes)
- Security control effectiveness (percentage of attacks detected/blocked)
- Compliance posture (adherence to regulatory requirements and internal policies)
- Security awareness (employee training completion rates and phishing test results)
- Incident response exercise frequency and identified improvements
Regular assessment against frameworks like IEC 62443 security levels enables organizations to measure maturity progression and prioritize investments for maximum risk reduction. Organizations should conduct annual reviews of security strategies, updating them based on threat intelligence, incident lessons learned, technology changes, and business evolution.
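Several of the metrics recommended in this section reduce to simple arithmetic over incident and patch records. The sketch below assumes each incident is logged with breach, discovery, and resolution timestamps; the record layout, function names, and sample values are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import mean


@dataclass
class Incident:
    breached: datetime
    discovered: datetime
    resolved: datetime


def mean_time_to_detect(incidents) -> timedelta:
    """Average interval from breach to discovery."""
    return timedelta(seconds=mean(
        (i.discovered - i.breached).total_seconds() for i in incidents))


def mean_time_to_contain(incidents) -> timedelta:
    """Average interval from discovery to resolution."""
    return timedelta(seconds=mean(
        (i.resolved - i.discovered).total_seconds() for i in incidents))


def percent_patched_on_time(days_to_patch, target_days: int) -> float:
    """Share of systems patched within the target window, as a percentage."""
    on_time = sum(1 for d in days_to_patch if d <= target_days)
    return 100.0 * on_time / len(days_to_patch)


# Hypothetical records, for illustration only.
incidents = [
    Incident(datetime(2024, 1, 1), datetime(2024, 1, 3), datetime(2024, 1, 10)),
    Incident(datetime(2024, 3, 1), datetime(2024, 3, 5), datetime(2024, 3, 8)),
]

print(mean_time_to_detect(incidents), mean_time_to_contain(incidents))
print(percent_patched_on_time([5, 12, 40, 9], target_days=30))
```

Tracking these values after each exercise or real incident gives the annual review a trend line instead of a single snapshot.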
Conclusion: Operationalizing research into resilient defense
Critical infrastructure protection requires synthesizing government research, threat intelligence, operational expertise, and proven methodologies into comprehensive defense programs that acknowledge both the sophistication of adversaries and the operational realities of industrial environments.
Idaho National Laboratory’s two decades of ICS cybersecurity leadership provide the foundational methodologies, tools, and training that organizations need. The CCE approach’s assumption that determined adversaries WILL penetrate networks—and focus on preventing catastrophic consequences despite compromise—represents the realistic threat model required for critical infrastructure protection.
Structured tabletop exercises using scenarios from SANS Institute, CISA CTEPs, and INL’s CyberStrike program enable organizations to validate incident response capabilities, identify gaps, and build critical cross-functional relationships before real attacks occur. The quantifiable ROI—over $1 million saved by containing breaches within 30 days versus slower responses—makes exercises among the most cost-effective security investments available.
Regulatory frameworks from NERC CIP, IEC 62443, NIST, and sector-specific requirements provide necessary structure while acknowledging OT environments’ unique operational constraints. Organizations must navigate these requirements while maintaining focus on actual risk reduction rather than mere compliance checkbox exercises.
The threat landscape continues intensifying. Manufacturing attacks surged 105% in 2024. Nation-state actors pre-position in critical infrastructure. Ransomware groups increasingly target OT for maximum impact. Water utilities overwhelmingly fail basic security standards. Yet organizations implementing military-grade methodologies, conducting regular exercises, deploying defense-in-depth controls, and fostering IT/OT collaboration demonstrate measurably superior cyber resilience.
The research is clear, the methodologies are proven, and the government resources are available—often at no cost to critical infrastructure operators. The question is no longer whether organizations have access to the knowledge and tools necessary for protection. The question is whether they will operationalize this research into action before the next Colonial Pipeline, Ukraine power grid attack, or worse.