Show simple item record
dc.contributor.author | Esteso, Ana | es_ES |
dc.contributor.author | Peidro Payá, David | es_ES |
dc.contributor.author | Mula, Josefa | es_ES |
dc.contributor.author | Díaz-Madroñero Boluda, Francisco Manuel | es_ES |
dc.date.accessioned | 2023-09-21T18:06:26Z | |
dc.date.available | 2023-09-21T18:06:26Z | |
dc.date.issued | 2023-08-18 | es_ES |
dc.identifier.issn | 0020-7543 | es_ES |
dc.identifier.uri | http://hdl.handle.net/10251/196934 | |
dc.description.abstract | [EN] The objective of this paper is to examine the use and applications of reinforcement learning (RL) techniques in the production planning and control (PPC) field, addressing the following PPC areas: facility resource planning, capacity planning, purchase and supply management, production scheduling and inventory management. The main RL characteristics, such as method, context, states, actions, reward and highlights, were analysed. The considered number of agents, applications and RL software tools, specifically, programming language, platforms, application programming interfaces and RL frameworks, among others, were identified, and 181 articles were reviewed. The results showed that RL was applied mainly to production scheduling problems, followed by purchase and supply management. The most reviewed RL algorithms were model-free and single-agent and were applied to simplified PPC environments. Nevertheless, their results seem to be promising compared to traditional mathematical programming and heuristics/metaheuristics solution methods, and even more so when they incorporate uncertainty or non-linear properties. Finally, value-based RL approaches are the most widely used, specifically Q-learning and its variants and, for deep RL, deep Q-networks. In recent years, however, actor-critic methods such as the advantage actor-critic, proximal policy optimisation, deep deterministic policy gradient and trust region policy optimisation have become the most widely used. | es_ES |
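The abstract notes that value-based approaches, in particular Q-learning and its variants, dominate the reviewed PPC literature. As a purely illustrative aid, and not taken from the article itself, the following is a minimal tabular Q-learning sketch for a hypothetical single-item inventory control problem with uncertain demand; the environment, state/action spaces, costs and demand distribution are all assumptions chosen for demonstration.

```python
# Illustrative only: minimal tabular Q-learning on a toy inventory control
# problem. Environment parameters (CAPACITY, costs, demand range) are
# hypothetical assumptions, not taken from the reviewed article.
import random

CAPACITY = 10          # maximum stock level; states are 0..CAPACITY
ACTIONS = range(6)     # order quantities 0..5
HOLDING_COST = 1.0     # cost per unit held per period (assumed)
STOCKOUT_COST = 5.0    # cost per unit of unmet demand (assumed)
ALPHA, GAMMA, EPSILON = 0.1, 0.95, 0.1

def step(stock, order):
    """Apply an order decision, sample demand, return (next_state, reward)."""
    demand = random.randint(0, 5)              # uncertain demand
    stock = min(stock + order, CAPACITY)       # receive order, cap at capacity
    unmet = max(demand - stock, 0)
    stock = max(stock - demand, 0)
    reward = -(HOLDING_COST * stock + STOCKOUT_COST * unmet)
    return stock, reward

# Q-table: one row per stock level, one column per order quantity
Q = [[0.0 for _ in ACTIONS] for _ in range(CAPACITY + 1)]

for episode in range(5000):
    stock = random.randint(0, CAPACITY)
    for _ in range(50):                        # finite-horizon episode
        # epsilon-greedy action selection
        if random.random() < EPSILON:
            action = random.choice(list(ACTIONS))
        else:
            action = max(ACTIONS, key=lambda a: Q[stock][a])
        next_stock, reward = step(stock, action)
        # model-free, off-policy Q-learning update
        best_next = max(Q[next_stock])
        Q[stock][action] += ALPHA * (reward + GAMMA * best_next - Q[stock][action])
        stock = next_stock

# Greedy policy extracted from the learned Q-table
policy = {s: max(ACTIONS, key=lambda a: Q[s][a]) for s in range(CAPACITY + 1)}
print("Learned ordering policy by stock level:", policy)
```

The same value-based structure scales to the production scheduling settings discussed in the abstract by replacing the stock level with a richer state (e.g. queue lengths, machine status) and, when the state space becomes too large for a table, approximating Q with a neural network as in deep Q-networks.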
dc.description.sponsorship | The funding for the research work that has led to the obtained results came from the following grants: CADS4.0 (Ref. RTI2018-101344-B-I00) and NIOTOME (Ref. RTI2018-102020-B-I00), financed by MCIN/AEI/10.13039/501100011033 and 'ERDF A way of making Europe'; the EU H2020 research and innovation programme with grant numbers 825631 'Zero-Defect Manufacturing Platform (ZDMP)' and 958205 'Industrial Data Services for Quality Control in Smart Manufacturing (i4Q)'; the 'Industrial Production and Logistics Optimization in Industry 4.0' (i4OPT) (Ref. PROMETEO/2021/065) and 'Resilient, Sustainable and People-Oriented Supply Chain 5.0 Optimization Using Hybrid Intelligence' (RESPECT) (Ref. CIGE/2021/159) projects were funded by the Generalitat Valenciana (Valencian Regional Government). | es_ES |
dc.language | English | es_ES |
dc.publisher | Taylor & Francis | es_ES |
dc.relation.ispartof | International Journal of Production Research | es_ES |
dc.rights | Attribution - NonCommercial - NoDerivatives (by-nc-nd) | es_ES |
dc.subject | Artificial intelligence | es_ES |
dc.subject | Machine learning | es_ES |
dc.subject | Reinforcement learning | es_ES |
dc.subject | Deep reinforcement learning | es_ES |
dc.subject | Production planning and control | es_ES |
dc.subject | Industry 4.0 | es_ES |
dc.subject.classification | ORGANIZACION DE EMPRESAS | es_ES |
dc.title | Reinforcement learning applied to production planning and control | es_ES |
dc.type | Article | es_ES |
dc.identifier.doi | 10.1080/00207543.2022.2104180 | es_ES |
dc.relation.projectID | info:eu-repo/grantAgreement/AEI/Plan Estatal de Investigación Científica y Técnica y de Innovación 2017-2020/RTI2018-101344-B-I00/ES/OPTIMIZACION DE TECNOLOGIAS DE PRODUCCION CERO-DEFECTOS HABILITADORAS PARA CADENAS DE SUMINISTRO 4.0/ | es_ES |
dc.relation.projectID | info:eu-repo/grantAgreement/GENERALITAT VALENCIANA//PROMETEO%2F2021%2F065//Industrial Production and Logistics Optimization in Industry 4.0 (i4OPT) / | es_ES |
dc.relation.projectID | info:eu-repo/grantAgreement/AEI/Plan Estatal de Investigación Científica y Técnica y de Innovación 2017-2020/RTI2018-102020-B-I00/ES/INTEGRACION DE LA TOMA DE DECISIONES DE LOS NIVELES TACTICO-OPERATIVO PARA LA MEJORA DE LA EFICIENCIA DEL SISTEMA DE PRODUCTIVO EN ENTORNOS INDUSTRIA 4.0/ | es_ES |
dc.relation.projectID | info:eu-repo/grantAgreement/GENERALITAT VALENCIANA//CIGE%2F2021%2F159//Optimización de cadenas de suministro 5.0 resilientes, sostenibles y orientadas a personas mediante inteligencia híbrida/ | es_ES |
dc.relation.projectID | info:eu-repo/grantAgreement/EC/H2020/825631/EU | es_ES |
dc.relation.projectID | info:eu-repo/grantAgreement/EC/H2020/958205/EU | es_ES |
dc.rights.accessRights | Open access | es_ES |
dc.contributor.affiliation | Universitat Politècnica de València. Escuela Técnica Superior de Ingenieros Industriales - Escola Tècnica Superior d'Enginyers Industrials | es_ES |
dc.contributor.affiliation | Universitat Politècnica de València. Escuela Politécnica Superior de Alcoy - Escola Politècnica Superior d'Alcoi | es_ES |
dc.description.bibliographicCitation | Esteso, A.; Peidro Payá, D.; Mula, J.; Díaz-Madroñero Boluda, FM. (2023). Reinforcement learning applied to production planning and control. International Journal of Production Research. 61(16):5772-5789. https://doi.org/10.1080/00207543.2022.2104180 | es_ES |
dc.description.accrualMethod | S | es_ES |
dc.relation.publisherversion | https://doi.org/10.1080/00207543.2022.2104180 | es_ES |
dc.description.upvformatpinicio | 5772 | es_ES |
dc.description.upvformatpfin | 5789 | es_ES |
dc.type.version | info:eu-repo/semantics/publishedVersion | es_ES |
dc.description.volume | 61 | es_ES |
dc.description.issue | 16 | es_ES |
dc.relation.pasarela | S\472042 | es_ES |
dc.contributor.funder | GENERALITAT VALENCIANA | es_ES |
dc.contributor.funder | AGENCIA ESTATAL DE INVESTIGACION | es_ES |
dc.contributor.funder | European Regional Development Fund | es_ES |
dc.contributor.funder | COMISION DE LAS COMUNIDADES EUROPEA | es_ES |
dc.subject.ods | 09. Build resilient infrastructure, promote inclusive and sustainable industrialization and foster innovation | es_ES |
upv.costeAPC | 3085,5 | es_ES |