Hyperbolic Discounting in Hierarchical Reinforcement Learning

Published in Finding the Frame Workshop - RL Conference, 2025

Decisions often require balancing immediate gratification against long-term benefits. In Reinforcement Learning (RL), this balancing act is governed by temporal discounting, which quantifies how future rewards are devalued. Prior research indicates that human decision-making aligns more closely with hyperbolic discounting than with the conventional exponential discounting used in RL. As artificial agents become more advanced and pervasive, particularly in multi-agent settings alongside humans, the need for appropriate discounting models becomes critical. Although hyperbolic discounting has been proposed for single-agent learning and for multi-agent reinforcement learning (MARL), it remains underexplored in more advanced settings such as hierarchical reinforcement learning (HRL). We introduce and formulate hyperbolic discounting in HRL, establishing theoretical and practical foundations across multiple frameworks, including the Option-Critic and Feudal Networks methods. We evaluate hyperbolic discounting on diverse tasks, comparing it to an exponential discounting baseline. Our results show that hyperbolic discounting achieves higher returns in 50% of scenarios and performs on par with exponential discounting in 95% of tasks, with significant improvements in sparse-reward and coordination-intensive environments. This work opens new avenues for robust decision-making in the development of advanced RL systems.
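For reference, the two discounting schemes contrasted in the abstract take the following standard forms from the discounting literature (a general sketch, not necessarily the exact formulation used in the paper). A reward received $t$ steps in the future is weighted by

$$d_{\mathrm{exp}}(t) = \gamma^{t}, \quad 0 < \gamma < 1 \qquad \text{(exponential)}$$

$$d_{\mathrm{hyp}}(t) = \frac{1}{1 + k t}, \quad k > 0 \qquad \text{(hyperbolic)}$$

where $\gamma$ is the usual discount factor and $k$ controls the steepness of hyperbolic decay. Compared with any single exponential rate, the hyperbolic weight falls off quickly for short delays but much more slowly for long delays, which is the property typically cited as matching human intertemporal choice.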

Recommended citation: Shabbar, Aya. (2025). "Hyperbolic Discounting in Hierarchical Reinforcement Learning." Finding the Frame Workshop, RL Conference.