Constraint-aware learning of policies by demonstration

RiuNet: Institutional repository of the Polytechnic University of Valencia

dc.contributor.author Armesto, Leopoldo es_ES
dc.contributor.author Moura, Joao es_ES
dc.contributor.author Ivan, Vladimir es_ES
dc.contributor.author Erden, Mustafa Suphi es_ES
dc.contributor.author Sala, Antonio es_ES
dc.contributor.author Vijayakumar, Sethu es_ES
dc.date.accessioned 2020-07-21T03:31:09Z
dc.date.available 2020-07-21T03:31:09Z
dc.date.issued 2018-12 es_ES
dc.identifier.issn 0278-3649 es_ES
dc.identifier.uri http://hdl.handle.net/10251/148358
dc.description.abstract [EN] Many practical tasks in robotic systems, such as cleaning windows, writing, or grasping, are inherently constrained. Learning policies subject to constraints is a challenging problem. In this paper, we propose a method of constraint-aware learning that solves the policy learning problem using redundant robots that execute a policy that is acting in the null space of a constraint. In particular, we are interested in generalizing learned null-space policies across constraints that were not known during the training. We split the combined problem of learning constraints and policies into two: first estimating the constraint, and then estimating a null-space policy using the remaining degrees of freedom. For a linear parametrization, we provide a closed-form solution of the problem. We also define a metric for comparing the similarity of estimated constraints, which is useful to pre-process the trajectories recorded in the demonstrations. We have validated our method by learning a wiping task from human demonstration on flat surfaces and reproducing it on an unknown curved surface using a force- or torque-based controller to achieve tool alignment. We show that, despite the differences between the training and validation scenarios, we learn a policy that still provides the desired wiping motion. es_ES
dc.description.sponsorship The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the Spanish Ministry of Economy and the European Union (grant number DPI2016-81002-R (AEI/FEDER, UE)), the European Union Horizon 2020, as part of the project Memory of Motion - MEMMO (project ID 780684), and the Engineering and Physical Sciences Research Council, UK, as part of the Robotics and AI hub in Future AI and Robotics for Space - FAIR-SPACE (grant number EP/R026092/1), and as part of the Centre for Doctoral Training in Robotics and Autonomous Systems at Heriot-Watt University and the University of Edinburgh (grant numbers EP/L016834/1 and EP/J015040/1) es_ES
dc.language English es_ES
dc.publisher SAGE Publications es_ES
dc.relation EPSRC/EP/R026092/1 es_ES
dc.relation University of Edinburgh/EP/L016834/1 es_ES
dc.relation University of Edinburgh/EP/J015040/1 es_ES
dc.relation AEI/DPI2016-81002-R es_ES
dc.relation.ispartof The International Journal of Robotics Research es_ES
dc.rights Attribution - NonCommercial (by-nc) es_ES
dc.subject Direct policy learning es_ES
dc.subject Constrained motion es_ES
dc.subject Null-space policy es_ES
dc.subject Force/torque application es_ES
dc.subject.classification SYSTEMS ENGINEERING AND AUTOMATION es_ES
dc.title Constraint-aware learning of policies by demonstration es_ES
dc.type Article es_ES
dc.identifier.doi 10.1177/0278364918784354 es_ES
dc.relation.projectID info:eu-repo/grantAgreement/EC/H2020/780684/EU es_ES
dc.rights.accessRights Open es_ES
dc.contributor.affiliation Universitat Politècnica de València. Departamento de Ingeniería de Sistemas y Automática - Departament d'Enginyeria de Sistemes i Automàtica es_ES
dc.description.bibliographicCitation Armesto, L.; Moura, J.; Ivan, V.; Erden, MS.; Sala, A.; Vijayakumar, S. (2018). Constraint-aware learning of policies by demonstration. The International Journal of Robotics Research. 37(13-14):1673-1689. https://doi.org/10.1177/0278364918784354 es_ES
dc.description.accrualMethod S es_ES
dc.relation.publisherversion https://doi.org/10.1177/0278364918784354 es_ES
dc.description.upvformatpinicio 1673 es_ES
dc.description.upvformatpfin 1689 es_ES
dc.type.version info:eu-repo/semantics/publishedVersion es_ES
dc.description.volume 37 es_ES
dc.description.issue 13-14 es_ES
dc.relation.pasarela S\379547 es_ES
dc.contributor.funder University of Edinburgh es_ES
dc.contributor.funder AGENCIA ESTATAL DE INVESTIGACION es_ES
dc.contributor.funder Engineering and Physical Sciences Research Council, United Kingdom es_ES