dc.contributor.author | Earle, AC |
dc.contributor.author | Saxe, AM |
dc.contributor.author | Rosman, Benjamin S |
dc.date.accessioned | 2017-10-02T09:53:27Z |
dc.date.available | 2017-10-02T09:53:27Z |
dc.date.issued | 2017-08 |
dc.identifier.citation | Earle, A.C., Saxe, A.M. and Rosman, B.S. 2017. Hierarchical subtask discovery with non-negative matrix factorization. Lifelong Learning: A Reinforcement Learning Approach Workshop, August 2017, ICML, Sydney, Australia | en_US
dc.identifier.uri | https://arxiv.org/abs/1708.00463 |
dc.identifier.uri | https://arxiv.org/pdf/1708.00463v1.pdf |
dc.identifier.uri | https://www.researchgate.net/publication/318868276_Hierarchical_Subtask_Discovery_With_Non-Negative_Matrix_Factorization |
dc.identifier.uri | http://hdl.handle.net/10204/9623 |
dc.description | Lifelong Learning: A Reinforcement Learning Approach Workshop, August 2017, ICML, Sydney, Australia | en_US
dc.description.abstract | Hierarchical reinforcement learning methods offer a powerful means of planning flexible behavior in complicated domains. However, learning an appropriate hierarchical decomposition of a domain into subtasks remains a substantial challenge. We present a novel algorithm for subtask discovery, based on the recently introduced multitask linearly-solvable Markov decision process (MLMDP) framework. The MLMDP can perform never-before-seen tasks by representing them as a linear combination of a previously learned basis set of tasks. In this setting, the subtask discovery problem can naturally be posed as finding an optimal low-rank approximation of the set of tasks the agent will face in a domain. We use non-negative matrix factorization to discover this minimal basis set of tasks, and show that the technique learns intuitive decompositions in a variety of domains. Our method has several qualitatively desirable features: it is not limited to learning subtasks with single goal states, instead learning distributed patterns of preferred states; it learns qualitatively different hierarchical decompositions in the same domain depending on the ensemble of tasks the agent will face; and it may be straightforwardly iterated to obtain deeper hierarchical decompositions. | en_US
dc.language.iso | en | en_US
dc.relation.ispartofseries | Worklist;19465 |
dc.subject | Subtask discovery | en_US
dc.subject | Reinforcement learning | en_US
dc.subject | Hierarchies | en_US
dc.title | Hierarchical subtask discovery with non-negative matrix factorization | en_US
dc.type | Conference Presentation | en_US
dc.identifier.apacitation | Earle, A., Saxe, A., & Rosman, B. S. (2017). Hierarchical subtask discovery with non-negative matrix factorization. http://hdl.handle.net/10204/9623 | en_ZA
dc.identifier.chicagocitation | Earle, AC, AM Saxe, and Benjamin S Rosman. "Hierarchical subtask discovery with non-negative matrix factorization." (2017): http://hdl.handle.net/10204/9623 | en_ZA
dc.identifier.vancouvercitation | Earle A, Saxe A, Rosman BS, Hierarchical subtask discovery with non-negative matrix factorization; 2017. http://hdl.handle.net/10204/9623. | en_ZA
dc.identifier.ris |
TY - Conference Presentation
AU - Earle, AC
AU - Saxe, AM
AU - Rosman, Benjamin S
AB - Hierarchical reinforcement learning methods offer a powerful means of planning flexible behavior in complicated domains. However, learning an appropriate hierarchical decomposition of a domain into subtasks remains a substantial challenge. We present a novel algorithm for subtask discovery, based on the recently introduced multitask linearly-solvable Markov decision process (MLMDP) framework. The MLMDP can perform never-before-seen tasks by representing them as a linear combination of a previously learned basis set of tasks. In this setting, the subtask discovery problem can naturally be posed as finding an optimal low-rank approximation of the set of tasks the agent will face in a domain. We use non-negative matrix factorization to discover this minimal basis set of tasks, and show that the technique learns intuitive decompositions in a variety of domains. Our method has several qualitatively desirable features: it is not limited to learning subtasks with single goal states, instead learning distributed patterns of preferred states; it learns qualitatively different hierarchical decompositions in the same domain depending on the ensemble of tasks the agent will face; and it may be straightforwardly iterated to obtain deeper hierarchical decompositions.
DA - 2017-08
DB - ResearchSpace
DP - CSIR
KW - Subtask discovery
KW - Reinforcement learning
KW - Hierarchies
LK - https://researchspace.csir.co.za
PY - 2017
T1 - Hierarchical subtask discovery with non-negative matrix factorization
TI - Hierarchical subtask discovery with non-negative matrix factorization
UR - http://hdl.handle.net/10204/9623
ER -
| en_ZA
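
The abstract above frames subtask discovery as a low-rank, non-negative factorization of the agent's task ensemble. Below is a minimal illustrative sketch of that idea in Python, using scikit-learn's NMF on a hypothetical non-negative task matrix Z (states x tasks); the matrix, its dimensions, and the rank k are assumptions chosen for illustration, not the authors' implementation.

import numpy as np
from sklearn.decomposition import NMF

# Hypothetical task matrix: Z[s, t] holds the non-negative desirability of
# state s under task t (illustrative random data, 50 states x 20 tasks).
rng = np.random.default_rng(0)
Z = rng.random((50, 20))

# Factor Z ~ D @ W with k components: columns of D are discovered "subtasks"
# (distributed patterns of preferred states), and W mixes those subtasks to
# reconstruct each original task as a non-negative combination.
k = 5
model = NMF(n_components=k, init="nndsvda", max_iter=500, random_state=0)
D = model.fit_transform(Z)   # (states x k) subtask basis
W = model.components_        # (k x tasks) mixing weights

print("reconstruction error:", np.linalg.norm(Z - D @ W))

Deeper hierarchies, as mentioned in the abstract, would correspond to factorizing the discovered basis D again in the same way.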