ADVERSARIAL ROBUSTNESS OF REINFORCEMENT LEARNING

Overview

Since the emergence of Deep Reinforcement Learning (DRL) algorithms, there has been a surge of interest from both research and industry in the potential of this paradigm. Current and envisioned applications of deep RL range from autonomous navigation and robotics to control of critical infrastructure, air traffic control, defense technologies, and cybersecurity. Despite the extensive opportunities and benefits of deep RL algorithms, the associated security risks and challenges remain largely unexplored. Recent studies have shown that DRL agents are vulnerable to small perturbations of their state observations, which adversaries can exploit to manipulate the agents' behavior and performance. This project aims to advance the state of the art in three distinct but interconnected areas:

  1. Developing techniques and metrics for evaluating the resilience and robustness of DRL agents against adversarial perturbations of state, reward, and actuation (a minimal sketch of such an evaluation follows this list).
  2. Developing tools and techniques for efficient and guaranteed mitigation of adversarial attacks against RL agents.
  3. Addressing the challenges of policy extraction and inversion to enable the protection of models and intellectual property rights.
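
To make the first of these concrete, below is a minimal sketch of one way such an evaluation can be run: measuring how an agent's average reward degrades as the budget of a bounded perturbation on its observations grows. Everything here is a toy stand-in (the linear policy, dynamics, and function names are all hypothetical assumptions), not code from any tool released by this project.

    import numpy as np

    rng = np.random.default_rng(0)

    # Toy stand-in for a trained agent: a fixed linear policy over a
    # 4-dimensional observation with two discrete actions.
    W = rng.normal(size=(2, 4))

    def policy(obs):
        return int(np.argmax(W @ obs))

    def step(state, action):
        # Toy dynamics: reward 1 when the chosen action matches the action
        # the policy would have taken on the uncorrupted state.
        reward = 1.0 if action == policy(state) else 0.0
        return state + 0.05 * rng.normal(size=4), reward

    def avg_reward_under_noise(eps, horizon=500):
        # Corrupt every observation with sign noise of L-infinity
        # magnitude eps before the agent sees it.
        state, total = rng.normal(size=4), 0.0
        for _ in range(horizon):
            perturbed = state + eps * np.sign(rng.normal(size=4))
            state, reward = step(state, policy(perturbed))
            total += reward
        return total / horizon

    # A crude resilience curve: average reward vs. perturbation budget.
    for eps in (0.0, 0.1, 0.5, 1.0):
        print(f"eps={eps:.1f}  avg reward={avg_reward_under_noise(eps):.2f}")

A resilient policy yields a curve that stays flat as eps grows; publication 3 below develops a principled, RL-based version of this kind of benchmark.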
 
 
Current Team Members:

Venu Korada
PI: Vahid Behzadan

Affiliate Research Groups:

AI Safety & Security

Tools and Datasets:

RLAttack: A framework for experimental analysis of adversarial example attacks on policy learning in Deep RL. A minimal illustration of the kind of attack it is built to study follows.
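
The sketch below crafts an FGSM-style perturbation of a single observation. It uses a toy linear Q-function so that the gradient is available in closed form; for a deep Q-network, the gradient would come from automatic differentiation. The names and setup are illustrative assumptions, not RLAttack's actual API.

    import numpy as np

    rng = np.random.default_rng(1)

    # Hypothetical linear Q-function: Q(s, a) = (W @ s)[a], with 3 actions
    # over a 4-dimensional observation.
    W = rng.normal(size=(3, 4))

    def fgsm_observation(obs, eps):
        # Nudge the observation in the direction that decreases the Q-value
        # of the agent's currently preferred action, bounded in L-infinity.
        a_star = int(np.argmax(W @ obs))
        grad = W[a_star]                   # dQ(s, a*)/ds for a linear Q
        return obs - eps * np.sign(grad)   # one FGSM step against Q(s, a*)

    obs = rng.normal(size=4)
    adv = fgsm_observation(obs, eps=0.5)
    print("greedy action on clean obs:    ", int(np.argmax(W @ obs)))
    print("greedy action on perturbed obs:", int(np.argmax(W @ adv)))

Because the perturbation is bounded in the L-infinity norm by eps, it can remain small while still flipping the greedy action, which is the failure mode exploited by the policy induction attacks studied in publication 10 below.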

Publications:

  1. Behzadan, V. & Hsu, W. (2019). Analysis and Improvement of Adversarial Training in DQN Agents With Adversarially-Guided Exploration (AGE). Proceedings of the 2nd International Workshop on Artificial Intelligence Safety Engineering (WAISE 2019), Turku, Finland, September 10, 2019.
  2. Behzadan, V. & Hsu, W. (2019). Adversarial Exploitation of Policy Imitation. Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI) Workshop on Artificial Intelligence Safety (AISafety 2019), Macau, China, August 11, 2019. 
  3. Behzadan, V., & Hsu, W. (2019). RL-Based Method for Benchmarking the Adversarial Resilience and Robustness of Deep Reinforcement Learning Policies. arXiv preprint arXiv:1906.01110.
  4. Behzadan, V., & Hsu, W. (2019). Sequential Triggers for Watermarking of Deep Reinforcement Learning Policies. Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI) Workshop on Artificial Intelligence Safety (AISafety 2019), Macau, China, August 11, 2019. 
  5. Behzadan, V. (2019). Security of Deep Reinforcement Learning (Doctoral dissertation, Kansas State University).
  6. Behzadan, V., & Munir, A. (2018). The Faults in Our Pi Stars: Security Issues and Open Challenges in Deep Reinforcement Learning. arXiv preprint arXiv:1810.10369.
  7. Behzadan, V., & Munir, A. (2018, September). Mitigation of Policy Manipulation Attacks on Deep Q-Networks with Parameter-Space Noise. In International Conference on Computer Safety, Reliability, and Security (pp. 406-417). Springer, Cham.
  8. Papernot, N., Faghri, F., Carlini, N., Goodfellow, I., Feinman, R., Kurakin, A., Xie, C., Sharma, Y., Brown, T., Roy, A., Matyasko, A., Behzadan, V., Hambardzumyan, K., Zhang, Z., Juang, Y.-L., Li, Z., Sheatsley, R., Garg, A., Uesato, J., Gierke, W., Dong, Y., Berthelot, D., Hendricks, P., Rauber, J., Long, R., & McDaniel, P. (2018). CleverHans v2.1.0: An Adversarial Machine Learning Library. arXiv preprint arXiv:1610.00768.
  9. Behzadan, V., & Munir, A. (2017). Whatever Does Not Kill Deep Reinforcement Learning, Makes It Stronger. arXiv preprint arXiv:1712.09344.
  10. Behzadan, V., & Munir, A. (2017, July). Vulnerability of Deep Reinforcement Learning to Policy Induction Attacks. In International Conference on Machine Learning and Data Mining in Pattern Recognition (pp. 262-275). Springer, Cham.
