Do LLM Agents Have Regret? A Case Study in Online Learning and Games
Abstract
Large language models (LLMs) have been increasingly employed for (interactive) decision-making, via the development of LLM-based autonomous agents. Despite their emerging successes, the performance of LLM agents in decision-making has not been fully investigated through quantitative metrics, especially in the multi-agent setting when they interact with each other, a typical scenario in real-world LLM-agent applications. To better understand the limits of LLM agents in these interactive environments, we propose to study their interactions in benchmark decision-making settings in online learning and game theory, through the performance metric of regret. We first empirically study the no-regret behaviors of LLMs in canonical (non-stationary) online learning problems, as well as the emergence of equilibria when LLM agents interact through playing repeated games. We then provide some theoretical insights into the no-regret behaviors of LLM agents, under certain assumptions on the supervised pre-training and the rationality model of the human decision-makers who generate the data. Notably, we also identify (simple) cases where advanced LLMs such as GPT-4 fail to be no-regret. To promote the no-regret behaviors, we propose a novel unsupervised training loss of regret-loss, which, in contrast to the supervised pre-training loss, does not require the labels of (optimal) actions. We then establish the statistical guarantee of generalization bound for regret-loss minimization, followed by the optimization guarantee that minimizing such a loss may automatically lead to known no-regret learning algorithms. Our further experiments demonstrate the effectiveness of our regret-loss, especially in addressing the above "regrettable" cases.
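The abstract's central metric, external regret, compares an online learner's cumulative loss against the best fixed action in hindsight; an algorithm is "no-regret" when this gap grows sublinearly in the horizon T. As a minimal illustration (not code from the paper), the sketch below computes the regret of the classical Hedge (multiplicative-weights) algorithm, a standard no-regret baseline; the function name and tuning are illustrative assumptions.

```python
import numpy as np

def hedge_regret(losses, eta=None):
    """Run Hedge (multiplicative weights) on a loss sequence and return its
    external regret: the algorithm's expected cumulative loss minus the loss
    of the best fixed action in hindsight.

    `losses` has shape (T, n_actions) with entries in [0, 1].
    Illustrative sketch of the regret metric, not the paper's method.
    """
    T, n = losses.shape
    if eta is None:
        eta = np.sqrt(2.0 * np.log(n) / T)  # standard learning-rate tuning
    weights = np.ones(n)
    alg_loss = 0.0
    for t in range(T):
        p = weights / weights.sum()          # mixed strategy played at round t
        alg_loss += p @ losses[t]            # expected loss this round
        weights *= np.exp(-eta * losses[t])  # multiplicative update
    best_fixed = losses.sum(axis=0).min()    # best single action in hindsight
    return alg_loss - best_fixed
```

For Hedge with the tuning above, the regret is bounded by roughly sqrt(2 T log n), i.e., sublinear in T, which is the property the paper tests LLM agents for empirically.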