- -

Constraint-aware learning of policies by demonstration

RiuNet: Repositorio Institucional de la Universidad Politécnica de Valencia

Compartir/Enviar a

Citas

Estadísticas

  • Estadisticas de Uso

Constraint-aware learning of policies by demonstration

Mostrar el registro sencillo del ítem

Ficheros en el ítem

dc.contributor.author Armesto, Leopoldo es_ES
dc.contributor.author Moura, Joao es_ES
dc.contributor.author Ivan, Vladimir es_ES
dc.contributor.author Erden, Mustafa Suphi es_ES
dc.contributor.author Sala, Antonio es_ES
dc.contributor.author Vijayakumar, Sethu es_ES
dc.date.accessioned 2020-07-21T03:31:09Z
dc.date.available 2020-07-21T03:31:09Z
dc.date.issued 2018-12 es_ES
dc.identifier.issn 0278-3649 es_ES
dc.identifier.uri http://hdl.handle.net/10251/148358
dc.description.abstract [EN] Many practical tasks in robotic systems, such as cleaning windows, writing, or grasping, are inherently constrained. Learning policies subject to constraints is a challenging problem. In this paper, we propose a method of constraint-aware learning that solves the policy learning problem using redundant robots that execute a policy that is acting in the null space of a constraint. In particular, we are interested in generalizing learned null-space policies across constraints that were not known during the training. We split the combined problem of learning constraints and policies into two: first estimating the constraint, and then estimating a null-space policy using the remaining degrees of freedom. For a linear parametrization, we provide a closed-form solution of the problem. We also define a metric for comparing the similarity of estimated constraints, which is useful to pre-process the trajectories recorded in the demonstrations. We have validated our method by learning a wiping task from human demonstration on flat surfaces and reproducing it on an unknown curved surface using a force- or torque-based controller to achieve tool alignment. We show that, despite the differences between the training and validation scenarios, we learn a policy that still provides the desired wiping motion. es_ES
dc.description.sponsorship The author(s) disclosed receipt of the following financial support for the research, auth/orship, and/or publication of this article: This work was supported by the Spanish Ministry of Economy and the European Union (grant number DPI2016-81002-R (AEI/FEDER, UE)), the European Union Horizon 2020, as part of the project Memory of Motion - MEMMO (project ID 780684), and the Engineering and Physical Sciences Research Council, UK, as part of the Robotics and AI hub in Future AI and Robotics for Space - FAIR-SPACE (grant number EP/R026092/1), and as part of the Centre for Doctoral Training in Robotics and Autonomous Systems at Heriot-Watt University and the University of Edinburgh (grant numbers EP/L016834/1 and EP/J015040/1) es_ES
dc.language Inglés es_ES
dc.publisher SAGE Publications es_ES
dc.relation.ispartof The International Journal of Robotics Research es_ES
dc.rights Reconocimiento - No comercial (by-nc) es_ES
dc.subject Direct policy learning es_ES
dc.subject Constrained motion es_ES
dc.subject Null-space policy es_ES
dc.subject Force/torque application es_ES
dc.subject.classification INGENIERIA DE SISTEMAS Y AUTOMATICA es_ES
dc.title Constraint-aware learning of policies by demonstration es_ES
dc.type Artículo es_ES
dc.identifier.doi 10.1177/0278364918784354 es_ES
dc.relation.projectID info:eu-repo/grantAgreement/EC/H2020/780684/EU/Memory of Motion/ es_ES
dc.relation.projectID info:eu-repo/grantAgreement/UKRI//EP%2FL016834%2F1/GB/EPSRC Centre for Doctoral Training in Robotics and Autonomous Systems (RAS) in Edinburgh/ es_ES
dc.relation.projectID info:eu-repo/grantAgreement/UKRI//EP%2FR026092%2F1/GB/Future AI and Robotics Hub for Space (FAIR-SPACE)/ es_ES
dc.relation.projectID info:eu-repo/grantAgreement/UKRI//EP%2FJ015040%2F1/GB/Heriot-Watt - Equipment Account/ es_ES
dc.relation.projectID info:eu-repo/grantAgreement/MINECO//DPI2016-81002-R/ES/CONTROL AVANZADO Y APRENDIZAJE DE ROBOTS EN OPERACIONES DE TRANSPORTE/ es_ES
dc.rights.accessRights Abierto es_ES
dc.contributor.affiliation Universitat Politècnica de València. Departamento de Ingeniería de Sistemas y Automática - Departament d'Enginyeria de Sistemes i Automàtica es_ES
dc.description.bibliographicCitation Armesto, L.; Moura, J.; Ivan, V.; Erden, MS.; Sala, A.; Vijayakumar, S. (2018). Constraint-aware learning of policies by demonstration. The International Journal of Robotics Research. 37(13-14):1673-1689. https://doi.org/10.1177/0278364918784354 es_ES
dc.description.accrualMethod S es_ES
dc.relation.publisherversion https://doi.org/10.1177/0278364918784354 es_ES
dc.description.upvformatpinicio 1673 es_ES
dc.description.upvformatpfin 1689 es_ES
dc.type.version info:eu-repo/semantics/publishedVersion es_ES
dc.description.volume 37 es_ES
dc.description.issue 13-14 es_ES
dc.relation.pasarela S\379547 es_ES
dc.contributor.funder University of Edinburgh es_ES
dc.contributor.funder UK Research and Innovation es_ES
dc.contributor.funder Engineering and Physical Sciences Research Council, Reino Unido es_ES
dc.contributor.funder Ministerio de Economía y Competitividad es_ES
dc.description.references Alissandrakis, A., Nehaniv, C. L., & Dautenhahn, K. (2007). Correspondence Mapping Induced State and Action Metrics for Robotic Imitation. IEEE Transactions on Systems, Man and Cybernetics, Part B (Cybernetics), 37(2), 299-307. doi:10.1109/tsmcb.2006.886947 es_ES
dc.description.references Argall, B. D., Chernova, S., Veloso, M., & Browning, B. (2009). A survey of robot learning from demonstration. Robotics and Autonomous Systems, 57(5), 469-483. doi:10.1016/j.robot.2008.10.024 es_ES
dc.description.references Armesto, L., Bosga, J., Ivan, V., & Vijayakumar, S. (2017). Efficient learning of constraints and generic null space policies. 2017 IEEE International Conference on Robotics and Automation (ICRA). doi:10.1109/icra.2017.7989181 es_ES
dc.description.references Armesto, L., Ivan, V., Moura, J., Sala, A., & Vijayakumar, S. (2017). Learning Constrained Generalizable Policies by Demonstration. Robotics: Science and Systems XIII. doi:10.15607/rss.2017.xiii.036 es_ES
dc.description.references Atkeson, C. G., Moore, A. W., & Schaal, S. (1997). Artificial Intelligence Review, 11(1/5), 75-113. doi:10.1023/a:1006511328852 es_ES
dc.description.references Baerlocher, P., & Boulic, R. (2004). An inverse kinematics architecture enforcing an arbitrary number of strict priority levels. The Visual Computer, 20(6), 402-417. doi:10.1007/s00371-004-0244-4 es_ES
dc.description.references Calinon, S. (2015). A tutorial on task-parameterized movement learning and retrieval. Intelligent Service Robotics, 9(1), 1-29. doi:10.1007/s11370-015-0187-9 es_ES
dc.description.references Calinon, S., & Billard, A. (2007). Incremental learning of gestures by imitation in a humanoid robot. Proceeding of the ACM/IEEE international conference on Human-robot interaction - HRI ’07. doi:10.1145/1228716.1228751 es_ES
dc.description.references Cruse, H., & Brüwer, M. (1987). The human arm as a redundant manipulator: The control of path and joint angles. Biological Cybernetics, 57(1-2), 137-144. doi:10.1007/bf00318723 es_ES
dc.description.references D’Souza, A., Vijayakumar, S., & Schaal, S. (s. f.). Learning inverse kinematics. Proceedings 2001 IEEE/RSJ International Conference on Intelligent Robots and Systems. Expanding the Societal Role of Robotics in the the Next Millennium (Cat. No.01CH37180). doi:10.1109/iros.2001.973374 es_ES
dc.description.references Escande, A., Mansard, N., & Wieber, P.-B. (2014). Hierarchical quadratic programming: Fast online humanoid-robot motion generation. The International Journal of Robotics Research, 33(7), 1006-1028. doi:10.1177/0278364914521306 es_ES
dc.description.references Gams, A., Nemec, B., Ijspeert, A. J., & Ude, A. (2014). Coupling Movement Primitives: Interaction With the Environment and Bimanual Tasks. IEEE Transactions on Robotics, 30(4), 816-830. doi:10.1109/tro.2014.2304775 es_ES
dc.description.references Gienger, M., Janssen, H., & Goerick, C. (s. f.). Task-oriented whole body motion for humanoid robots. 5th IEEE-RAS International Conference on Humanoid Robots, 2005. doi:10.1109/ichr.2005.1573574 es_ES
dc.description.references Herzog, A., Rotella, N., Mason, S., Grimminger, F., Schaal, S., & Righetti, L. (2015). Momentum control with hierarchical inverse dynamics on a torque-controlled humanoid. Autonomous Robots, 40(3), 473-491. doi:10.1007/s10514-015-9476-6 es_ES
dc.description.references Hornik, K., Stinchcombe, M., & White, H. (1989). Multilayer feedforward networks are universal approximators. Neural Networks, 2(5), 359-366. doi:10.1016/0893-6080(89)90020-8 es_ES
dc.description.references Howard, M., Klanke, S., Gienger, M., Goerick, C., & Vijayakumar, S. (2009). A novel method for learning policies from variable constraint data. Autonomous Robots, 27(2), 105-121. doi:10.1007/s10514-009-9129-8 es_ES
dc.description.references Hussein, M., Mohammed, Y., & Ali, S. A. (2015). Learning from Demonstration Using Variational Bayesian Inference. Lecture Notes in Computer Science, 371-381. doi:10.1007/978-3-319-19066-2_36 es_ES
dc.description.references Khatib, O., Sentis, L., & Park, J.-H. (s. f.). A Unified Framework for Whole-Body Humanoid Robot Control with Multiple Constraints and Contacts. European Robotics Symposium 2008, 303-312. doi:10.1007/978-3-540-78317-6_31 es_ES
dc.description.references Lin, H.-C., Howard, M., & Vijayakumar, S. (2015). Learning null space projections. 2015 IEEE International Conference on Robotics and Automation (ICRA). doi:10.1109/icra.2015.7139551 es_ES
dc.description.references Lin, H.-C., Ray, P., & Howard, M. (2017). Learning task constraints in operational space formulation. 2017 IEEE International Conference on Robotics and Automation (ICRA). doi:10.1109/icra.2017.7989039 es_ES
dc.description.references Mansard, N., & Chaumette, F. (2007). Task Sequencing for High-Level Sensor-Based Control. IEEE Transactions on Robotics, 23(1), 60-72. doi:10.1109/tro.2006.889487 es_ES
dc.description.references Moura, J., & Erden, M. S. (2017). Formulation of a Control and Path Planning Approach for a Cab front Cleaning Robot. Procedia CIRP, 59, 67-71. doi:10.1016/j.procir.2016.09.024 es_ES
dc.description.references Paraschos, A., Lioutikov, R., Peters, J., & Neumann, G. (2017). Probabilistic Prioritization of Movement Primitives. IEEE Robotics and Automation Letters, 2(4), 2294-2301. doi:10.1109/lra.2017.2725440 es_ES
dc.description.references Pastor, P., Righetti, L., Kalakrishnan, M., & Schaal, S. (2011). Online movement adaptation based on previous sensor experiences. 2011 IEEE/RSJ International Conference on Intelligent Robots and Systems. doi:10.1109/iros.2011.6095059 es_ES
dc.description.references Schaal, S., & Atkeson, C. G. (1998). Constructive Incremental Learning from Only Local Information. Neural Computation, 10(8), 2047-2084. doi:10.1162/089976698300016963 es_ES
dc.description.references Schaal, S., Ijspeert, A., & Billard, A. (2003). Computational approaches to motor learning by imitation. Philosophical Transactions of the Royal Society of London. Series B: Biological Sciences, 358(1431), 537-547. doi:10.1098/rstb.2002.1258 es_ES
dc.description.references Shiller, Z. (2015). Off-Line and On-Line Trajectory Planning. Mechanisms and Machine Science, 29-62. doi:10.1007/978-3-319-14705-5_2 es_ES
dc.description.references Siciliano B, Sciavicco L, Villani L, et al. (2009) Differential Kinematics and Statics. London: Springer, pp. 105–160. es_ES
dc.description.references Sugiura, H., Gienger, M., Janssen, H., & Goerick, C. (2006). Real-Time Self Collision Avoidance for Humanoids by means of Nullspace Criteria and Task Intervals. 2006 6th IEEE-RAS International Conference on Humanoid Robots. doi:10.1109/ichr.2006.321331 es_ES
dc.description.references Towell, C., Howard, M., & Vijayakumar, S. (2010). Learning nullspace policies. 2010 IEEE/RSJ International Conference on Intelligent Robots and Systems. doi:10.1109/iros.2010.5650663 es_ES
dc.description.references Yoshikawa, T. (1985). Manipulability of Robotic Mechanisms. The International Journal of Robotics Research, 4(2), 3-9. doi:10.1177/027836498500400201 es_ES
dc.description.references Zhang, X.-D. (2017). Matrix Analysis and Applications. doi:10.1017/9781108277587 es_ES


Este ítem aparece en la(s) siguiente(s) colección(ones)

Mostrar el registro sencillo del ítem