A reinforcement learning process in extensive form games

Abstract: The CPR ("cumulative proportional reinforcement") learning rule stipulates that an agent chooses a move with a probability proportional to the cumulative payoff she obtained in the past with that move. Previously considered for strategies in normal form games (Laslier, Topol and Walliser, Games and Economic Behavior, 2001), the CPR rule is here adapted for actions in perfect information extensive form games. The paper shows that the action-based CPR process converges with probability one to the (unique) subgame perfect equilibrium.
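The rule described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the game tree `TREE`, the initial cumulative payoff of 1.0 per action, and the function names `choice_probabilities` and `play_once` are assumptions made for illustration, and payoffs are assumed strictly positive so that the proportions are well defined. At each decision node the mover draws an action with probability proportional to its cumulative payoff; after the play ends, every action taken along the path is reinforced by the payoff its own mover received.

```python
import random

# Hypothetical perfect-information game tree (an assumption, not from the paper).
# Internal nodes name the mover and the available moves; leaves are tuples of
# strictly positive payoffs (player 0, player 1).
TREE = {
    "player": 0,
    "moves": {
        "out": (2.0, 2.0),
        "in": {
            "player": 1,
            "moves": {"share": (3.0, 3.0), "grab": (1.0, 1.5)},
        },
    },
}


def choice_probabilities(cumulative):
    """Probability of each action, proportional to its cumulative payoff."""
    total = sum(cumulative.values())
    return {action: value / total for action, value in cumulative.items()}


def play_once(tree, cumulative, rng):
    """Play the game once under the action-based rule and reinforce the path.

    `cumulative` maps a node (identified by the tuple of actions leading to it)
    to a dict of cumulative payoffs per action. New nodes start at 1.0 per
    action, which is an assumed initial value.
    """
    node, path, taken = tree, (), []
    while isinstance(node, dict):
        scores = cumulative.setdefault(path, {a: 1.0 for a in node["moves"]})
        actions = list(scores)
        # Draw an action with weights equal to the cumulative payoffs.
        action = rng.choices(actions, weights=[scores[a] for a in actions])[0]
        taken.append((path, node["player"], action))
        node = node["moves"][action]
        path = path + (action,)
    payoffs = node
    # Each action on the path is reinforced by the payoff of the player who chose it.
    for where, player, action in taken:
        cumulative[where][action] += payoffs[player]
    return payoffs


if __name__ == "__main__":
    rng = random.Random(0)
    cumulative = {}
    for _ in range(10_000):
        play_once(TREE, cumulative, rng)
    for where, scores in cumulative.items():
        print(where, choice_probabilities(scores))
```

In this hypothetical tree the subgame perfect equilibrium is "in" followed by "share", so the printed choice probabilities can be compared against that path. A finite simulation only suggests the trend; the convergence claim itself is the paper's theoretical result.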
Document type:
Journal article
International Journal of Game Theory, Springer Verlag, 2005, 33 (2), pp.219-227. 〈10.1007/s001820400194〉

https://hal-pjse.archives-ouvertes.fr/halshs-00754083
Contributor: Caroline Bauer
Submitted on: Tuesday, November 20, 2012 - 08:42:35
Last modified on: Thursday, May 10, 2018 - 02:07:21

Citation

Jean-François Laslier, Bernard Walliser. A reinforcement learning process in extensive form games. International Journal of Game Theory, Springer Verlag, 2005, 33 (2), pp.219-227. 〈10.1007/s001820400194〉. 〈halshs-00754083〉
