Show simple item record
dc.contributor.author | De Paula, Mariano | es_ES |
dc.contributor.author | Ávila, Luis O. | es_ES |
dc.contributor.author | Sánchez Reinoso, Carlos | es_ES |
dc.contributor.author | Acosta, Gerardo G. | es_ES |
dc.date.accessioned | 2020-05-19T10:04:47Z | |
dc.date.available | 2020-05-19T10:04:47Z | |
dc.date.issued | 2015-10-15 | |
dc.identifier.issn | 1697-7912 | |
dc.identifier.uri | http://hdl.handle.net/10251/143715 | |
dc.description.abstract | [ES] El control de sistemas complejos puede ser realizado descomponiendo la tarea de control en una secuencia de modos de control, o simplemente modos. Cada modo implementa una ley de retroalimentación hasta que se activa una condición de terminación, en respuesta a la ocurrencia de un evento exógeno/endógeno que indica que la ejecución del modo debe finalizar. En este trabajo se presenta una propuesta novedosa para encontrar una política de conmutación óptima para resolver el problema de control optimizando alguna medida de costo/beneficio. Una política óptima implementa un programa de control multimodal óptimo, el cual consiste en un encadenamiento de modos de control. La propuesta realizada incluye el desarrollo y formulación de un algoritmo basado en la idea de la programación dinámica integrando procesos Gaussianos y aprendizaje Bayesiano activo. Mediante el enfoque propuesto es posible realizar un uso eficiente de los datos para mejorar la exploración de las soluciones sobre espacios de estados continuos. Un caso de estudio representativo es abordado para demostrar el desempeño del algoritmo propuesto. | es_ES |
dc.description.abstract | [EN] The control of complex systems can be performed by decomposing the control task into a sequence of control modes, or modes for short. Each mode implements a parameterized feedback law until a termination condition is activated in response to the occurrence of an exogenous/endogenous event, which indicates that the execution of the mode must end. This paper presents a novel approach for finding an optimal switching policy that solves a control problem by optimizing some measure of cost/benefit. An optimal policy implements an optimal multimodal control program, consisting of a sequence of control modes. The proposal includes the development of an algorithm based on the idea of dynamic programming, integrating Gaussian processes and Bayesian active learning. Through this approach, the data can be used efficiently to improve the exploration of solutions over continuous state spaces. A representative case study is discussed and analyzed to demonstrate the performance of the proposed algorithm. | es_ES |
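The abstract describes selecting among control modes by estimating the cost of each mode with Gaussian process models. The record does not include the paper's algorithm, so the following is only a toy sketch of that idea, not the authors' method: a minimal numpy-only GP regressor fits a cost surrogate per mode over a 1-D state, and a hypothetical `switch_policy` helper picks the mode with the lowest predicted cost. All names, kernel settings, and the quadratic toy costs are assumptions for illustration.

```python
import numpy as np

def rbf_kernel(A, B, ell=1.0, sf=1.0):
    # Squared-exponential kernel between row-vectors of A (n,d) and B (m,d).
    d = A[:, None, :] - B[None, :, :]
    return sf**2 * np.exp(-0.5 * np.sum(d**2, axis=2) / ell**2)

class GP:
    """Bare-bones GP regression (zero mean, fixed hyperparameters)."""
    def __init__(self, X, y, noise=1e-2):
        self.X, self.y = X, y
        K = rbf_kernel(X, X) + noise * np.eye(len(X))
        self.Kinv = np.linalg.inv(K)

    def predict(self, Xs):
        Ks = rbf_kernel(Xs, self.X)
        mean = Ks @ self.Kinv @ self.y
        # Predictive variance: diag(Kss) - diag(Ks Kinv Ks^T).
        var = rbf_kernel(Xs, Xs).diagonal() - np.einsum('ij,jk,ik->i', Ks, self.Kinv, Ks)
        return mean, var

# Toy stand-in for per-mode costs: two modes prefer different regions of the state.
def mode_cost(mode, x):
    return (x - 1.0)**2 if mode == 0 else (x + 1.0)**2

rng = np.random.default_rng(0)
X = rng.uniform(-2, 2, size=(30, 1))                     # sampled 1-D states
gps = [GP(X, np.array([mode_cost(m, x[0]) for x in X]))  # one surrogate per mode
       for m in (0, 1)]

def switch_policy(x):
    # Greedy switching rule: activate the mode with lowest predicted cost at x.
    means = [gp.predict(np.array([[x]]))[0][0] for gp in gps]
    return int(np.argmin(means))
```

In the paper's setting the GP variance would additionally drive Bayesian active learning (querying states where the surrogate is most uncertain) and the greedy one-step choice would be replaced by a dynamic-programming recursion over mode chains; this sketch only shows the surrogate-based mode selection.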
dc.language | Español | es_ES |
dc.publisher | Universitat Politècnica de València | es_ES |
dc.relation.ispartof | Revista Iberoamericana de Automática e Informática industrial | es_ES |
dc.rights | Reconocimiento - No comercial - Sin obra derivada (by-nc-nd) | es_ES |
dc.subject | Multimodal Control | es_ES |
dc.subject | Dynamic Programming | es_ES |
dc.subject | Gaussian Processes | es_ES |
dc.subject | Uncertainty | es_ES |
dc.subject | Policy | es_ES |
dc.subject | Control multimodal | es_ES |
dc.subject | Programación dinámica | es_ES |
dc.subject | Procesos Gaussianos | es_ES |
dc.subject | Incertidumbre | es_ES |
dc.subject | Política | es_ES |
dc.title | Control Multimodal en Entornos Inciertos usando Aprendizaje por Refuerzos y Procesos Gaussianos | es_ES |
dc.title.alternative | Multimodal Control in Uncertain Environments using Reinforcement Learning and Gaussian Processes | es_ES |
dc.type | Artículo | es_ES |
dc.identifier.doi | 10.1016/j.riai.2015.09.004 | |
dc.rights.accessRights | Abierto | es_ES |
dc.description.bibliographicCitation | De Paula, M.; Ávila, LO.; Sánchez Reinoso, C.; Acosta, GG. (2015). Control Multimodal en Entornos Inciertos usando Aprendizaje por Refuerzos y Procesos Gaussianos. Revista Iberoamericana de Automática e Informática industrial. 12(4):385-396. https://doi.org/10.1016/j.riai.2015.09.004 | es_ES |
dc.description.accrualMethod | OJS | es_ES |
dc.relation.publisherversion | https://doi.org/10.1016/j.riai.2015.09.004 | es_ES |
dc.description.upvformatpinicio | 385 | es_ES |
dc.description.upvformatpfin | 396 | es_ES |
dc.type.version | info:eu-repo/semantics/publishedVersion | es_ES |
dc.description.volume | 12 | es_ES |
dc.description.issue | 4 | es_ES |
dc.identifier.eissn | 1697-7920 | |
dc.relation.pasarela | OJS\9340 | es_ES |
dc.description.references | Abate, A., Prandini, M., Lygeros, J., & Sastry, S. (2008). Probabilistic reachability and safety for controlled discrete time stochastic hybrid systems. Automatica, 44(11), 2724-2734. doi:10.1016/j.automatica.2008.03.027 | es_ES |
dc.description.references | Adamek, F., M. Sobotka, O. Stursberg. 2008. Stochastic optimal control for hybrid systems with uncertain discrete dynamics. Proceedings of the IEEE International Conference on Automation Science and Engineering, 23-28. Washington D.C. | es_ES |
dc.description.references | Åström, Karl Johan, Bo Bernhardsson. 2003. Systems with Lebesgue Sampling. Directions in Mathematical Systems Theory and Optimization, LNCIS 268. Springer-Verlag Berlin Heidelberg. | es_ES |
dc.description.references | Axelsson, H., Wardi, Y., Egerstedt, M., & Verriest, E. I. (2007). Gradient Descent Approach to Optimal Mode Scheduling in Hybrid Dynamical Systems. Journal of Optimization Theory and Applications, 136(2), 167-186. doi:10.1007/s10957-007-9305-y | es_ES |
dc.description.references | Barton, P. I., Lee, C. K., & Yunt, M. (2006). Optimization of hybrid systems. Computers & Chemical Engineering, 30(10-12), 1576-1589. doi:10.1016/j.compchemeng.2006.05.024 | es_ES |
dc.description.references | Bemporad, A., & Di Cairano, S. (2011). Model-Predictive Control of Discrete Hybrid Stochastic Automata. IEEE Transactions on Automatic Control, 56(6), 1307-1321. doi:10.1109/tac.2010.2084810 | es_ES |
dc.description.references | Bemporad, A., & Morari, M. (1999). Control of systems integrating logic, dynamics, and constraints. Automatica, 35(3), 407-427. doi:10.1016/s0005-1098(98)00178-2 | es_ES |
dc.description.references | Bensoussan, A., J. L. Menaldi. 2000. Stochastic hybrid control. Journal of Mathematical Analysis and Applications 249. | es_ES |
dc.description.references | Bertsekas, Dimitri P. 2000. Dynamic Programming and Optimal Control, Vol. I. 2nd ed. Athena Scientific. | es_ES |
dc.description.references | Blackmore, L., Ono, M., Bektassov, A., & Williams, B. C. (2010). A Probabilistic Particle-Control Approximation of Chance-Constrained Stochastic Predictive Control. IEEE Transactions on Robotics, 26(3), 502-517. doi:10.1109/tro.2010.2044948 | es_ES |
dc.description.references | Borrelli, F., Baotić, M., Bemporad, A., & Morari, M. (2005). Dynamic programming for constrained optimal control of discrete-time linear hybrid systems. Automatica, 41(10), 1709-1721. doi:10.1016/j.automatica.2005.04.017 | es_ES |
dc.description.references | Bryson, Arthur E., Jr., Yu-Chi Ho. 1975. Applied optimal control: optimization, estimation and control. Revised ed. Taylor & Francis. | es_ES |
dc.description.references | Busoniu, Lucian, Robert Babuska, Bart De Schutter, Damien Ernst. 2010. Reinforcement learning and dynamic programming using function approximators. 1st ed. CRC Press. | es_ES |
dc.description.references | Cassandras, Christos G., John Lygeros. 2007. Stochastic hybrid systems. Boca Raton: Taylor & Francis. | es_ES |
dc.description.references | Deisenroth, Marc Peter. 2010. Efficient Reinforcement Learning Using Gaussian Processes. KIT Scientific Publishing. | es_ES |
dc.description.references | Deisenroth, M. P., Rasmussen, C. E., & Peters, J. (2009). Gaussian process dynamic programming. Neurocomputing, 72(7-9), 1508-1524. doi:10.1016/j.neucom.2008.12.019 | es_ES |
dc.description.references | Di Cairano, S., Bemporad, A., & Júlvez, J. (2009). Event-driven optimization-based control of hybrid systems with integral continuous-time dynamics. Automatica, 45(5), 1243-1251. doi:10.1016/j.automatica.2008.12.011 | es_ES |
dc.description.references | Ding, X.-C., Wardi, Y., & Egerstedt, M. (2009). On-Line Optimization of Switched-Mode Dynamical Systems. IEEE Transactions on Automatic Control, 54(9), 2266-2271. doi:10.1109/tac.2009.2026864 | es_ES |
dc.description.references | Egerstedt, M., Wardi, Y., & Axelsson, H. (2006). Transition-Time Optimization for Switched-Mode Dynamical Systems. IEEE Transactions on Automatic Control, 51(1), 110-115. doi:10.1109/tac.2005.861711 | es_ES |
dc.description.references | Girard, Agathe. 2004. Approximate methods for propagation of uncertainty with Gaussian process models. University of Glasgow. | es_ES |
dc.description.references | Kuss, M. 2006. Gaussian process models for robust regression, classification, and reinforcement learning. Technische Universität Darmstadt. | es_ES |
dc.description.references | Liberzon, Daniel. 2003. Switching in systems and control. Systems & Control: Foundations & Applications. Boston: Birkhäuser Boston Inc. | es_ES |
dc.description.references | Lincoln, B., & Rantzer, A. (2006). Relaxing Dynamic Programming. IEEE Transactions on Automatic Control, 51(8), 1249-1260. doi:10.1109/tac.2006.878720 | es_ES |
dc.description.references | Lunze, J., & Lehmann, D. (2010). A state-feedback approach to event-based control. Automatica, 46(1), 211-215. doi:10.1016/j.automatica.2009.10.035 | es_ES |
dc.description.references | Mehta, Tejas, Magnus Egerstedt. 2005. Learning multi-modal control programs. Hybrid Systems: Computation and Control, 466-479. Lecture Notes in Computer Science. Springer Berlin. | es_ES |
dc.description.references | Mehta, T. R., & Egerstedt, M. (2006). An optimal control approach to mode generation in hybrid systems. Nonlinear Analysis: Theory, Methods & Applications, 65(5), 963-983. doi:10.1016/j.na.2005.07.044 | es_ES |
dc.description.references | Mehta, T. R., & Egerstedt, M. (2008). Multi-modal control using adaptive motion description languages. Automatica, 44(7), 1912-1917. doi:10.1016/j.automatica.2007.11.024 | es_ES |
dc.description.references | Pajares Martín-Sanz, G., and De la Cruz García, J.M. 2010. Aprendizaje automático. Un enfoque práctico, Ch. 12, Aprendizaje por Refuerzos. RA-MA. | es_ES |
dc.description.references | Rantzer, A. (2006). Relaxed dynamic programming in switching systems. IEE Proceedings - Control Theory and Applications, 153(5), 567-574. doi:10.1049/ip-cta:20050094 | es_ES |
dc.description.references | Rasmussen, Carl Edward, Christopher K. I. Williams. 2006. Gaussian processes for machine learning. MIT Press. | es_ES |
dc.description.references | Rosenstein, Michael T., Andrew G. Barto. 2004. Supervised Actor-Critic Reinforcement Learning. Handbook of Learning and Approximate Dynamic Programming, 359-380. John Wiley & Sons, Inc. | es_ES |
dc.description.references | Song, C., & Li, P. (2010). Near optimal control for a class of stochastic hybrid systems. Automatica, 46(9), 1553-1557. doi:10.1016/j.automatica.2010.05.024 | es_ES |
dc.description.references | Sutton, Richard S., Andrew G. Barto. 1998. Reinforcement learning: An introduction. MIT Press. | es_ES |
dc.description.references | Verdinelli, I., & Kadane, J. B. (1992). Bayesian Designs for Maximizing Information and Outcome. Journal of the American Statistical Association, 87(418), 510-515. doi:10.1080/01621459.1992.10475233 | es_ES |
dc.description.references | Xu, Xuping, Panos J. Antsaklis. 2003. Results and perspectives on computational methods for optimal control of switched systems. Proceedings of the 6th international conference on Hybrid systems: computation and control, 540-555. Springer-Verlag. | es_ES |
dc.description.references | Xu, Y.-K., & Cao, X.-R. (2011). Lebesgue-Sampling-Based Optimal Control Problems With Time Aggregation. IEEE Transactions on Automatic Control, 56(5), 1097-1109. doi:10.1109/tac.2010.2073610 | es_ES |