*Result*: Adaptive algorithms for shaping behavior.
*Further Information*
*Dogs and laboratory mice are commonly trained to perform complex tasks by guiding them through a curriculum of simpler tasks ('shaping'). What are the principles behind effective shaping strategies? Here, we propose a teacher-student framework for shaping behavior, where an autonomous teacher agent decides its student's task based on the student's transcript of successes and failures on previously assigned tasks. Using algorithms for Monte Carlo planning under uncertainty, we show that near-optimal shaping algorithms achieve a careful balance between reinforcement and extinction. Near-optimal algorithms track learning rate to adaptively alternate between simpler and harder tasks. Based on this intuition, we derive an adaptive shaping heuristic with minimal parameters, which we show is near-optimal on a sequence learning task and robustly trains deep reinforcement learning agents on navigation tasks that involve sparse, delayed rewards. Extensions to continuous curricula are explored. Our work provides a starting point towards a general computational framework for shaping behavior that applies to both animals and artificial agents.
(Copyright: © 2025 Tong et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.)*
*The authors have declared that no competing interests exist.*
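
The abstract describes a teacher that chooses the student's next task from the transcript of successes and failures, moving between simpler and harder tasks as learning progresses. The sketch below illustrates that loop with a simple threshold rule on a running success estimate. It is not the heuristic derived in the paper: the class name `ThresholdTeacher`, the helper `run_transcript`, and the values `advance_at`, `retreat_at`, and `smoothing` are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class ThresholdTeacher:
    """Hypothetical teacher that picks a task level from the student's transcript."""

    n_levels: int              # number of tasks in the curriculum, easiest first
    advance_at: float = 0.8    # assumed threshold for moving to a harder task
    retreat_at: float = 0.3    # assumed threshold for falling back to an easier task
    smoothing: float = 0.2     # weight given to the newest outcome
    level: int = 0             # index of the task currently assigned
    success_rate: float = 0.5  # running estimate of success on the current task

    def update(self, success: bool) -> int:
        """Record one outcome and return the level to assign next."""
        # Exponentially weighted running estimate of the success rate.
        self.success_rate += self.smoothing * (float(success) - self.success_rate)

        if self.success_rate >= self.advance_at and self.level < self.n_levels - 1:
            # Student is doing well: assign a harder task.
            self.level += 1
            self.success_rate = 0.5  # restart the estimate for the new task
        elif self.success_rate <= self.retreat_at and self.level > 0:
            # Student is failing repeatedly: fall back before the behavior extinguishes.
            self.level -= 1
            self.success_rate = 0.5
        return self.level


def run_transcript(teacher: ThresholdTeacher, outcomes: Iterable[bool]) -> List[int]:
    """Feed a sequence of outcomes to the teacher and collect the assigned levels."""
    return [teacher.update(outcome) for outcome in outcomes]
```

Calling `run_transcript(ThresholdTeacher(n_levels=3), outcomes)` with a list of booleans returns the level assigned after each trial. A teacher closer to the one described in the abstract would track how quickly the success rate is changing rather than only its current value.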