Result: Adaptive algorithms for shaping behavior.

Title:

Adaptive algorithms for shaping behavior.

Authors:

Tong WL; School of Engineering and Applied Sciences, Harvard University, Cambridge, Massachusetts, United States of America; The Kempner Institute for the Study of Natural and Artificial Intelligence, Allston, Massachusetts, United States of America.
Murthy VN; The Kempner Institute for the Study of Natural and Artificial Intelligence, Allston, Massachusetts, United States of America; Department of Molecular and Cellular Biology, Harvard University, Cambridge, Massachusetts, United States of America; Center for Brain Science, Harvard University, Cambridge, Massachusetts, United States of America.
Reddy G; Center for Brain Science, Harvard University, Cambridge, Massachusetts, United States of America; Joseph Henry Laboratories of Physics, Princeton University, Princeton, New Jersey, United States of America; Physics and Informatics Laboratories, NTT Research, Inc., Sunnyvale, California, United States of America.

Source:

PLoS computational biology [PLoS Comput Biol] 2025 Sep 12; Vol. 21 (9), pp. e1013454. Date of Electronic Publication: 2025 Sep 12 (Print Publication: 2025).

Publication Type:

Journal Article

Language:

English

Journal Info:

Publisher: Public Library of Science
Country of Publication: United States
NLM ID: 101238922
Publication Model: eCollection
Cited Medium: Internet
ISSN: 1553-7358 (Electronic)
Linking ISSN: 1553734X
NLM ISO Abbreviation: PLoS Comput Biol
Subsets: MEDLINE

Imprint Name(s):

Original Publication: San Francisco, CA : Public Library of Science, [2005]-

MeSH Terms:

Behavior, Animal*/physiology; Learning*/physiology; Algorithms*; Animals; Mice; Computational Biology; Reinforcement, Psychology; Dogs; Monte Carlo Method; Adaptive Algorithms

Comments:

Update of: bioRxiv. 2023 Dec 05:2023.12.03.569774. doi: 10.1101/2023.12.03.569774. (PMID: 38106232)

Grant Information:

R01 DC017311 United States DC NIDCD NIH HHS; RF1 NS128865 United States NS NINDS NIH HHS

Entry Date(s):

Date Created: 20250912
Date Completed: 20250919
Latest Revision: 20250922

Update Code:

20260130

PubMed Central ID:

PMC12448964

DOI:

10.1371/journal.pcbi.1013454

PMID:

40939015

Database:

MEDLINE

Further Information

*Dogs and laboratory mice are commonly trained to perform complex tasks by guiding them through a curriculum of simpler tasks ('shaping'). What are the principles behind effective shaping strategies? Here, we propose a teacher-student framework for shaping behavior, where an autonomous teacher agent decides its student's task based on the student's transcript of successes and failures on previously assigned tasks. Using algorithms for Monte Carlo planning under uncertainty, we show that near-optimal shaping algorithms achieve a careful balance between reinforcement and extinction. Near-optimal algorithms track learning rate to adaptively alternate between simpler and harder tasks. Based on this intuition, we derive an adaptive shaping heuristic with minimal parameters, which we show is near-optimal on a sequence learning task and robustly trains deep reinforcement learning agents on navigation tasks that involve sparse, delayed rewards. Extensions to continuous curricula are explored. Our work provides a starting point towards a general computational framework for shaping behavior that applies to both animals and artificial agents.
(Copyright: © 2025 Tong et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.)*

*The authors have declared that no competing interests exist.*
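
The teacher-student loop described in the abstract can be illustrated with a small sketch. It is not the authors' algorithm: the record gives no implementation details, so every name, parameter, and update rule below (`ToyStudent`, `AdaptiveTeacher`, `run_session`, the two running averages, the `margin` and `mastery` thresholds) is a hypothetical choice made only to show the general idea. The teacher sees nothing but the transcript of successes and failures, uses the gap between a fast and a slow running success rate as a crude proxy for learning rate, and moves to a harder task while performance is improving and to a simpler one when it is deteriorating.

```python
import random


class ToyStudent:
    """Toy learner (an assumption): level k succeeds only if steps 0..k all succeed."""

    def __init__(self, n_levels, rng, gain=0.2, loss=0.05):
        self.q = [0.3] * n_levels  # per-step success probabilities
        self.rng = rng
        self.gain = gain  # reinforcement: pull toward 1 after a success
        self.loss = loss  # extinction: decay toward 0 after a failure

    def attempt(self, level):
        p = 1.0
        for q in self.q[: level + 1]:
            p *= q
        success = self.rng.random() < p
        for i in range(level + 1):
            if success:
                self.q[i] += self.gain * (1.0 - self.q[i])
            else:
                self.q[i] -= self.loss * self.q[i]
        return success


class AdaptiveTeacher:
    """Hypothetical teacher that only observes the success/failure transcript."""

    def __init__(self, n_levels, fast=0.5, slow=0.1, margin=0.05, mastery=0.9):
        self.n_levels = n_levels
        self.fast, self.slow = fast, slow  # smoothing weights of the two averages
        self.margin = margin               # dead zone around "no change"
        self.mastery = mastery             # advance anyway once success is reliable
        self.fast_rate = 0.0
        self.slow_rate = 0.0
        self.level = 0                     # begin with the simplest task

    def update(self, success):
        """Record one outcome and return the level to assign next."""
        x = 1.0 if success else 0.0
        self.fast_rate += self.fast * (x - self.fast_rate)
        self.slow_rate += self.slow * (x - self.slow_rate)
        trend = self.fast_rate - self.slow_rate  # crude learning-rate proxy
        if trend > self.margin or self.fast_rate > self.mastery:
            self.level = min(self.level + 1, self.n_levels - 1)  # harder task
        elif trend < -self.margin:
            self.level = max(self.level - 1, 0)                  # simpler task
        return self.level


def run_session(n_levels=4, n_trials=500, seed=0):
    """Run one teaching session and return the (level, success) transcript."""
    rng = random.Random(seed)
    student = ToyStudent(n_levels, rng)
    teacher = AdaptiveTeacher(n_levels)
    level, transcript = 0, []
    for _ in range(n_trials):
        success = student.attempt(level)
        transcript.append((level, success))
        level = teacher.update(success)
    return transcript
```

Calling `run_session()` returns the full transcript, which can be inspected to see how often the teacher revisits simpler levels. Note that this sketch pools outcomes across levels into a single pair of running averages; a more careful version would likely track each task separately.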
