Constraint-aware learning of policies by demonstration

Armesto, Leopoldo; Moura, Joao; Ivan, Vladimir; Erden, Mustafa Suphi; Sala, Antonio; Vijayakumar, Sethu

doi:10.1177/0278364918784354

Files in this item

Name: Armesto;Moura;Ivan ...

Size: 1.018Mb

Format: PDF

Description: Publisher's version


dc.contributor.author	Armesto, Leopoldo	es_ES
dc.contributor.author	Moura, Joao	es_ES
dc.contributor.author	Ivan, Vladimir	es_ES
dc.contributor.author	Erden, Mustafa Suphi	es_ES
dc.contributor.author	Sala, Antonio	es_ES
dc.contributor.author	Vijayakumar, Sethu	es_ES
dc.date.accessioned	2020-07-21T03:31:09Z
dc.date.available	2020-07-21T03:31:09Z
dc.date.issued	2018-12	es_ES
dc.identifier.issn	0278-3649	es_ES
dc.identifier.uri	http://hdl.handle.net/10251/148358
dc.description.abstract	[EN] Many practical tasks in robotic systems, such as cleaning windows, writing, or grasping, are inherently constrained. Learning policies subject to constraints is a challenging problem. In this paper, we propose a method of constraint-aware learning that solves the policy learning problem using redundant robots that execute a policy that is acting in the null space of a constraint. In particular, we are interested in generalizing learned null-space policies across constraints that were not known during the training. We split the combined problem of learning constraints and policies into two: first estimating the constraint, and then estimating a null-space policy using the remaining degrees of freedom. For a linear parametrization, we provide a closed-form solution of the problem. We also define a metric for comparing the similarity of estimated constraints, which is useful to pre-process the trajectories recorded in the demonstrations. We have validated our method by learning a wiping task from human demonstration on flat surfaces and reproducing it on an unknown curved surface using a force- or torque-based controller to achieve tool alignment. We show that, despite the differences between the training and validation scenarios, we learn a policy that still provides the desired wiping motion.	es_ES
dc.description.sponsorship	The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the Spanish Ministry of Economy and the European Union (grant number DPI2016-81002-R (AEI/FEDER, UE)), the European Union Horizon 2020, as part of the project Memory of Motion - MEMMO (project ID 780684), and the Engineering and Physical Sciences Research Council, UK, as part of the Robotics and AI hub in Future AI and Robotics for Space - FAIR-SPACE (grant number EP/R026092/1), and as part of the Centre for Doctoral Training in Robotics and Autonomous Systems at Heriot-Watt University and the University of Edinburgh (grant numbers EP/L016834/1 and EP/J015040/1)	es_ES
dc.language	English	es_ES
dc.publisher	SAGE Publications	es_ES
dc.relation.ispartof	The International Journal of Robotics Research	es_ES
dc.rights	Attribution - NonCommercial (by-nc)	es_ES
dc.subject	Direct policy learning	es_ES
dc.subject	Constrained motion	es_ES
dc.subject	Null-space policy	es_ES
dc.subject	Force/torque application	es_ES
dc.subject.classification	SYSTEMS ENGINEERING AND AUTOMATION	es_ES
dc.title	Constraint-aware learning of policies by demonstration	es_ES
dc.type	Article	es_ES
dc.identifier.doi	10.1177/0278364918784354	es_ES
dc.relation.projectID	info:eu-repo/grantAgreement/EC/H2020/780684/EU/Memory of Motion/	es_ES
dc.relation.projectID	info:eu-repo/grantAgreement/UKRI//EP%2FL016834%2F1/GB/EPSRC Centre for Doctoral Training in Robotics and Autonomous Systems (RAS) in Edinburgh/	es_ES
dc.relation.projectID	info:eu-repo/grantAgreement/UKRI//EP%2FR026092%2F1/GB/Future AI and Robotics Hub for Space (FAIR-SPACE)/	es_ES
dc.relation.projectID	info:eu-repo/grantAgreement/UKRI//EP%2FJ015040%2F1/GB/Heriot-Watt - Equipment Account/	es_ES
dc.relation.projectID	info:eu-repo/grantAgreement/MINECO//DPI2016-81002-R/ES/CONTROL AVANZADO Y APRENDIZAJE DE ROBOTS EN OPERACIONES DE TRANSPORTE/	es_ES
dc.rights.accessRights	Open	es_ES
dc.contributor.affiliation	Universitat Politècnica de València. Departamento de Ingeniería de Sistemas y Automática - Departament d'Enginyeria de Sistemes i Automàtica	es_ES
dc.description.bibliographicCitation	Armesto, L.; Moura, J.; Ivan, V.; Erden, MS.; Sala, A.; Vijayakumar, S. (2018). Constraint-aware learning of policies by demonstration. The International Journal of Robotics Research. 37(13-14):1673-1689. https://doi.org/10.1177/0278364918784354	es_ES
dc.description.accrualMethod	S	es_ES
dc.relation.publisherversion	https://doi.org/10.1177/0278364918784354	es_ES
dc.description.upvformatpinicio	1673	es_ES
dc.description.upvformatpfin	1689	es_ES
dc.type.version	info:eu-repo/semantics/publishedVersion	es_ES
dc.description.volume	37	es_ES
dc.description.issue	13-14	es_ES
dc.relation.pasarela	S\379547	es_ES
dc.contributor.funder	University of Edinburgh	es_ES
dc.contributor.funder	UK Research and Innovation	es_ES
dc.contributor.funder	Engineering and Physical Sciences Research Council, United Kingdom	es_ES
dc.contributor.funder	Ministerio de Economía y Competitividad	es_ES

RiuNet: Institutional Repository of the Universidad Politécnica de Valencia
