dc.contributor.author | Earle, AC |
dc.contributor.author | Saxe, AM |
dc.contributor.author | Rosman, Benjamin S |
dc.date.accessioned | 2017-10-02T09:53:27Z |
dc.date.available | 2017-10-02T09:53:27Z |
dc.date.issued | 2017-08 |
dc.identifier.citation | Earle, A.C., Saxe, A.M. and Rosman, B.S. 2017. Hierarchical subtask discovery with non-negative matrix factorization. Lifelong Learning: A Reinforcement Learning Approach Workshop, August 2017, ICML, Sydney, Australia | en_US
dc.identifier.uri | https://arxiv.org/abs/1708.00463 |
dc.identifier.uri | https://arxiv.org/pdf/1708.00463v1.pdf |
dc.identifier.uri | https://www.researchgate.net/publication/318868276_Hierarchical_Subtask_Discovery_With_Non-Negative_Matrix_Factorization |
dc.identifier.uri | http://hdl.handle.net/10204/9623 |
dc.description | Lifelong Learning: A Reinforcement Learning Approach Workshop, August 2017, ICML, Sydney, Australia | en_US
dc.description.abstract | Hierarchical reinforcement learning methods offer a powerful means of planning flexible behavior in complicated domains. However, learning an appropriate hierarchical decomposition of a domain into subtasks remains a substantial challenge. We present a novel algorithm for subtask discovery, based on the recently introduced multitask linearly-solvable Markov decision process (MLMDP) framework. The MLMDP can perform never-before-seen tasks by representing them as a linear combination of a previously learned basis set of tasks. In this setting, the subtask discovery problem can naturally be posed as finding an optimal low-rank approximation of the set of tasks the agent will face in a domain. We use non-negative matrix factorization to discover this minimal basis set of tasks, and show that the technique learns intuitive decompositions in a variety of domains. Our method has several qualitatively desirable features: it is not limited to learning subtasks with single goal states, instead learning distributed patterns of preferred states; it learns qualitatively different hierarchical decompositions in the same domain depending on the ensemble of tasks the agent will face; and it may be straightforwardly iterated to obtain deeper hierarchical decompositions. | en_US
dc.language.iso | en | en_US
dc.relation.ispartofseries | Worklist;19465 |
dc.subject | Subtask discovery | en_US
dc.subject | Reinforcement learning | en_US
dc.subject | Hierarchies | en_US
dc.title | Hierarchical subtask discovery with non-negative matrix factorization | en_US
dc.type | Conference Presentation | en_US
dc.identifier.apacitation | Earle, A., Saxe, A., & Rosman, B. S. (2017). Hierarchical subtask discovery with non-negative matrix factorization. http://hdl.handle.net/10204/9623 | en_ZA
dc.identifier.chicagocitation | Earle, AC, AM Saxe, and Benjamin S Rosman. "Hierarchical subtask discovery with non-negative matrix factorization." (2017): http://hdl.handle.net/10204/9623 | en_ZA
dc.identifier.vancouvercitation | Earle A, Saxe A, Rosman BS, Hierarchical subtask discovery with non-negative matrix factorization; 2017. http://hdl.handle.net/10204/9623. | en_ZA
dc.identifier.ris |
TY - Conference Presentation
AU - Earle, AC
AU - Saxe, AM
AU - Rosman, Benjamin S
AB - Hierarchical reinforcement learning methods offer a powerful means of planning flexible behavior in complicated domains. However, learning an appropriate hierarchical decomposition of a domain into subtasks remains a substantial challenge. We present a novel algorithm for subtask discovery, based on the recently introduced multitask linearly-solvable Markov decision process (MLMDP) framework. The MLMDP can perform never-before-seen tasks by representing them as a linear combination of a previously learned basis set of tasks. In this setting, the subtask discovery problem can naturally be posed as finding an optimal low-rank approximation of the set of tasks the agent will face in a domain. We use non-negative matrix factorization to discover this minimal basis set of tasks, and show that the technique learns intuitive decompositions in a variety of domains. Our method has several qualitatively desirable features: it is not limited to learning subtasks with single goal states, instead learning distributed patterns of preferred states; it learns qualitatively different hierarchical decompositions in the same domain depending on the ensemble of tasks the agent will face; and it may be straightforwardly iterated to obtain deeper hierarchical decompositions.
DA - 2017-08
DB - ResearchSpace
DP - CSIR
KW - Subtask discovery
KW - Reinforcement learning
KW - Hierarchies
LK - https://researchspace.csir.co.za
PY - 2017
T1 - Hierarchical subtask discovery with non-negative matrix factorization
TI - Hierarchical subtask discovery with non-negative matrix factorization
UR - http://hdl.handle.net/10204/9623
ER -
| en_ZA
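
The abstract above frames subtask discovery as a low-rank, non-negative factorization of the agent's task ensemble. Below is a minimal illustrative sketch of that idea in Python, using scikit-learn's NMF on a hypothetical non-negative task matrix Z (states x tasks); the matrix, its dimensions, and the rank k are assumptions chosen for illustration, not the authors' implementation.

import numpy as np
from sklearn.decomposition import NMF

# Hypothetical task matrix: Z[s, t] holds the non-negative desirability of
# state s under task t (illustrative random data, 50 states x 20 tasks).
rng = np.random.default_rng(0)
Z = rng.random((50, 20))

# Factor Z ~ D @ W with k components: columns of D are discovered "subtasks"
# (distributed patterns of preferred states), and W mixes those subtasks to
# reconstruct each original task as a non-negative combination.
k = 5
model = NMF(n_components=k, init="nndsvda", max_iter=500, random_state=0)
D = model.fit_transform(Z)   # (states x k) subtask basis
W = model.components_        # (k x tasks) mixing weights

print("reconstruction error:", np.linalg.norm(Z - D @ W))

Deeper hierarchies, as mentioned in the abstract, would correspond to factorizing the discovered basis D again in the same way.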