Adaptive Discount Factor in Reinforcement Learning

Overview

This research project studies the current formulation and shortcomings of future discounting in reinforcement learning. It aims to develop methods for adapting the discount factor dynamically, with the goal of state-of-the-art learning of sequential decisions. The research will also investigate the use of machine learning to model dynamic, state-dependent discounting of future rewards.

In reinforcement learning, the discount factor γ is commonly assigned a constant value between 0 and 1 at the start of training, and a reward received t steps in the future is weighted by γ^t throughout the training process. Making the discount factor adaptive may allow an agent to make more rational decisions and collect more cumulative reward. This research could affect every area that relies on sequential decision-making, such as robotics, computer systems, natural language processing, financial analysis, and healthcare. Because artificial intelligence aims to mimic human intelligence in machines, this research may also lead to a better understanding of human cognitive processes and offer insight into neuroscience.
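
To make the contrast concrete, the following minimal Python sketch compares the usual constant-γ return with a return whose discount depends on the state. It is an illustration only, not the project's SADE method: the function names, the `gamma_fn` callback, and the convention that the discount applied after step t depends on the state at step t are all assumptions introduced here.

```python
from typing import Callable, Hashable, Sequence


def constant_discounted_return(rewards: Sequence[float], gamma: float) -> float:
    """Return sum_t gamma**t * rewards[t] using one fixed discount factor."""
    total = 0.0
    # Work backwards so each step applies G_t = r_t + gamma * G_{t+1}.
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


def state_dependent_discounted_return(
    rewards: Sequence[float],
    states: Sequence[Hashable],
    gamma_fn: Callable[[Hashable], float],
) -> float:
    """Return G_0 where G_t = r_t + gamma_fn(s_t) * G_{t+1}.

    Assumed convention (hypothetical): states[t] is the state in which
    rewards[t] is received, and gamma_fn maps a state to a value in [0, 1].
    """
    if len(rewards) != len(states):
        raise ValueError("rewards and states must have the same length")
    total = 0.0
    for reward, state in zip(reversed(rewards), reversed(states)):
        total = reward + gamma_fn(state) * total
    return total


if __name__ == "__main__":
    rewards = [1.0, 1.0, 1.0]
    states = [0, 1, 2]
    # Hypothetical rule: be more far-sighted in state 0 than elsewhere.
    far_sighted_in_zero = lambda s: 0.9 if s == 0 else 0.5
    print(constant_discounted_return(rewards, 0.5))
    print(state_dependent_discounted_return(rewards, states, far_sighted_in_zero))
```

When `gamma_fn` returns the same value for every state, the second function reduces to the first, so the constant discount factor is the special case of a state-dependent one.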

Current Team Members:

Milan V Zinzuvadiya
PI: Vahid Behzadan

Tools and Datasets:

N/A (to be added in the future)

Publications:

Zinzuvadiya, M., Behzadan, V. (2021). State-Wise Adaptive Discounting from Experience (SADE): A Novel Discounting Scheme for Reinforcement Learning. Proceedings of the AAAI Conference on Artificial Intelligence, 2021.
