Paper Detail

Learning Red Agent Policy from Observations for Neurosymbolic Autonomous Cyber Agents

Ankita Samaddar, Sandeep Neema, Daniel Balasubramanian, Xenofon Koutsoukos

arxiv Score 11.3

Published 2026-06-16 · First seen 2026-06-17

General AI

Abstract

With sophisticated cyber-attacks becoming increasingly prevalent, modern networks require intelligent autonomous cyber-defense agents trained via Reinforcement Learning (RL). These agents employ neurosymbolic approaches such as behavior trees with learning-enabled components (LECs) to learn, reason, adapt, and implement security rules while maintaining critical operations. However, these autonomous networks are partially observable systems, i.e., the cyber-attacker's (red agent's) actions are not observable, making it difficult for the defender to predict red actions, learn red policies, or assess the attacker's intrusion levels. To address this, we propose a Policy Learning Technique using imitation learning to learn policies for partially observable RL agents with discrete states and discrete actions. We apply this technique in an autonomous cyber environment to predict the red agent's actions from network observations and defender actions. Integrated with a neurosymbolic cyber-defense agent, our method effectively handles different red policies and achieves high prediction accuracy across diverse simulated scenarios.
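
The abstract does not spell out the Policy Learning Technique itself, so the following is only a minimal sketch of the general idea it names: imitation learning over discrete observations and discrete actions, used to predict a red action from a network observation and a defender action. It uses plain tabular behavior cloning (a majority vote per observed pair), which is an assumption for illustration and not the authors' method. The function names and the labels `scan_seen`, `quiet`, `alert`, `monitor`, `restore`, `scan`, and `exploit` are hypothetical.

```python
from collections import Counter, defaultdict


def fit_red_policy(demos):
    """Fit a tabular policy from (observation, blue_action, red_action) tuples.

    All three items are assumed to be hashable discrete labels.
    """
    counts = defaultdict(Counter)  # (observation, blue_action) -> red action counts
    overall = Counter()            # red action counts over all demonstrations
    for obs, blue, red in demos:
        counts[(obs, blue)][red] += 1
        overall[red] += 1
    if not overall:
        raise ValueError("need at least one demonstration")
    # Fallback for unseen (observation, blue_action) pairs: the most common red action.
    fallback = overall.most_common(1)[0][0]
    table = {key: c.most_common(1)[0][0] for key, c in counts.items()}
    return table, fallback


def predict_red_action(policy, obs, blue):
    """Return the predicted red action for one (observation, blue_action) pair."""
    table, fallback = policy
    return table.get((obs, blue), fallback)


def accuracy(policy, demos):
    """Fraction of demonstrations whose red action is predicted correctly."""
    demos = list(demos)
    hits = sum(predict_red_action(policy, o, b) == r for o, b, r in demos)
    return hits / len(demos)


# Hypothetical demonstrations, invented purely for illustration.
demos = [
    ("scan_seen", "monitor", "exploit"),
    ("scan_seen", "monitor", "exploit"),
    ("scan_seen", "monitor", "scan"),
    ("quiet", "restore", "scan"),
    ("alert", "monitor", "exploit"),
]
policy = fit_red_policy(demos)
print(predict_red_action(policy, "scan_seen", "monitor"))
print(accuracy(policy, demos))
```

A lookup table like this only works when the observation and action spaces are small and discrete, which matches the setting the abstract describes. Handling history or observations that do not reveal the red action directly would need a richer model than this sketch.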

Workflow Status

Review status: pending
Role: unreviewed
Read priority: now
Vote: Not set.
Saved: no
Collections: Not filed yet.
Next action: Not filled yet.

BibTeX

@article{samaddar2026learning,
  title = {Learning Red Agent Policy from Observations for Neurosymbolic Autonomous Cyber Agents},
  author = {Ankita Samaddar and Sandeep Neema and Daniel Balasubramanian and Xenofon Koutsoukos},
  year = {2026},
  url = {https://arxiv.org/abs/2606.18223},
  keywords = {cs.CR, cs.AI, cs.LG, eess.SY},
  eprint = {2606.18223},
  archiveprefix = {arXiv},
}
