dc.contributor.author |
Armesto, Leopoldo
|
es_ES |
dc.contributor.author |
Moura, Joao
|
es_ES |
dc.contributor.author |
Ivan, Vladimir
|
es_ES |
dc.contributor.author |
Erden, Mustafa Suphi
|
es_ES |
dc.contributor.author |
Sala, Antonio
|
es_ES |
dc.contributor.author |
Vijayakumar, Sethu
|
es_ES |
dc.date.accessioned |
2020-07-21T03:31:09Z |
|
dc.date.available |
2020-07-21T03:31:09Z |
|
dc.date.issued |
2018-12 |
es_ES |
dc.identifier.issn |
0278-3649 |
es_ES |
dc.identifier.uri |
http://hdl.handle.net/10251/148358 |
|
dc.description.abstract |
[EN] Many practical tasks in robotic systems, such as cleaning windows, writing, or grasping, are inherently constrained. Learning policies subject to constraints is a challenging problem. In this paper, we propose a method of constraint-aware learning that solves the policy learning problem using redundant robots that execute a policy that is acting in the null space of a constraint. In particular, we are interested in generalizing learned null-space policies across constraints that were not known during the training. We split the combined problem of learning constraints and policies into two: first estimating the constraint, and then estimating a null-space policy using the remaining degrees of freedom. For a linear parametrization, we provide a closed-form solution of the problem. We also define a metric for comparing the similarity of estimated constraints, which is useful to pre-process the trajectories recorded in the demonstrations. We have validated our method by learning a wiping task from human demonstration on flat surfaces and reproducing it on an unknown curved surface using a force- or torque-based controller to achieve tool alignment. We show that, despite the differences between the training and validation scenarios, we learn a policy that still provides the desired wiping motion. |
es_ES |
dc.description.sponsorship |
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the Spanish Ministry of Economy and the European Union (grant number DPI2016-81002-R (AEI/FEDER, UE)), the European Union Horizon 2020, as part of the project Memory of Motion - MEMMO (project ID 780684), and the Engineering and Physical Sciences Research Council, UK, as part of the Robotics and AI hub in Future AI and Robotics for Space - FAIR-SPACE (grant number EP/R026092/1), and as part of the Centre for Doctoral Training in Robotics and Autonomous Systems at Heriot-Watt University and the University of Edinburgh (grant numbers EP/L016834/1 and EP/J015040/1). |
es_ES |
dc.language |
English |
es_ES |
dc.publisher |
SAGE Publications |
es_ES |
dc.relation |
EPSRC/EP/R026092/1 |
es_ES |
dc.relation |
University of Edinburgh/EP/L016834/1 |
es_ES |
dc.relation |
University of Edinburgh/EP/J015040/1 |
es_ES |
dc.relation |
AEI/DPI2016-81002-R |
es_ES |
dc.relation.ispartof |
The International Journal of Robotics Research |
es_ES |
dc.rights |
Attribution - NonCommercial (by-nc) |
es_ES |
dc.subject |
Direct policy learning |
es_ES |
dc.subject |
Constrained motion |
es_ES |
dc.subject |
Null-space policy |
es_ES |
dc.subject |
Force/torque application |
es_ES |
dc.subject.classification |
SYSTEMS ENGINEERING AND AUTOMATION |
es_ES |
dc.title |
Constraint-aware learning of policies by demonstration |
es_ES |
dc.type |
Article |
es_ES |
dc.identifier.doi |
10.1177/0278364918784354 |
es_ES |
dc.relation.projectID |
info:eu-repo/grantAgreement/EC/H2020/780684/EU |
es_ES |
dc.rights.accessRights |
Open |
es_ES |
dc.contributor.affiliation |
Universitat Politècnica de València. Departamento de Ingeniería de Sistemas y Automática - Departament d'Enginyeria de Sistemes i Automàtica |
es_ES |
dc.description.bibliographicCitation |
Armesto, L.; Moura, J.; Ivan, V.; Erden, MS.; Sala, A.; Vijayakumar, S. (2018). Constraint-aware learning of policies by demonstration. The International Journal of Robotics Research. 37(13-14):1673-1689. https://doi.org/10.1177/0278364918784354 |
es_ES |
dc.description.accrualMethod |
S |
es_ES |
dc.relation.publisherversion |
https://doi.org/10.1177/0278364918784354 |
es_ES |
dc.description.upvformatpinicio |
1673 |
es_ES |
dc.description.upvformatpfin |
1689 |
es_ES |
dc.type.version |
info:eu-repo/semantics/publishedVersion |
es_ES |
dc.description.volume |
37 |
es_ES |
dc.description.issue |
13-14 |
es_ES |
dc.relation.pasarela |
S\379547 |
es_ES |
dc.contributor.funder |
University of Edinburgh |
es_ES |
dc.contributor.funder |
AGENCIA ESTATAL DE INVESTIGACION |
es_ES |
dc.contributor.funder |
Engineering and Physical Sciences Research Council, United Kingdom |
es_ES |