Using Deep Reinforcement Learning for Assessing the Consequences of Cyber Mitigation Techniques on Industrial Control Systems
Keywords:Deep reinforcement leaerning, APT, ICS, cyber mitigation, AI based automated attacks, ICS cyber-attacks, energy security, cybersecurity
This paper discusses an in-progress study involving the use of deep reinforcement learning (DRL) to mitigate the effects of an advanced cyber-attack against industrial control systems (ICS). The research is a qualitative, exploratory study which emerged as a gap during the execution of two rapid prototyping studies. During these studies, cyber defensive procedures, known as “Mitigation, were characterized as actions taken to minimize the impact of ongoing advanced cyber-attacks against an ICS while enabling primary operations to continue. To execute Mitigation procedures, affected ICS components required rapid isolation and quarantining from “healthy” system segments. However today, with most attacks leveraging automation, mitigation also requires rapid decision-making capabilities operating at the speed of automation yet with human-like refinement. The authors settled on the choice of DRL as a viable solution to this problem due to the algorithm’s designs which involves “intelligent” decisions based upon continuous learning achieved through a rewards system. The primary theory of this study posits that processes informed by data sources relative to the execution path of an advanced cyber-attack as well as the consequences of deploying a particular Mitigation procedure evolve the system into an ever-improving defensive capability. This study seeks to produce a defensive DLR based software agent trained by a DRL based offensive software agent that generates policy refinements based upon extrapolations from a corrupted network state as reported by an IDS and baseline data. Results include an estimation rule that would quantify impacts of various mitigation actions while protecting the operational critical path and isolating an in-progress attack. This study is in a conceptual phase and development has not started.
This research questions for this study are:
RQ1: Can this software agent categorize correctly an in-progress cyber-attack and extrapolate the potential ICS assets affected?
RQ2: Can this software agent categorize novel cyber-attacks and extrapolate a probable attack vector while enumerating affected assets?
RQ3: Can this software agent characterize how operations are affected by quarantine actions?
RQ4: Can this software agent generate a set of ranked recommended courses of action by effectiveness, and least negative effects on the operational critical path?
Copyright (c) 2023 Terry Merz, Romarie Morales Rosado
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.