New prioritized value iteration for Markov decision processes

García Hernández, Ma de Guadalupe; Ruiz Pinales, José; Onaindia de la Rivaherrera, Eva; Aviña Cervantes, J. Gabriel; Ledesma Orozco, Sergio; Alvarado Mendez, Edgar; Reyes Ballesteros, Alberto

doi:10.1007/s10462-011-9224-z


García Hernández, MDG.; Ruiz Pinales, J.; Onaindia De La Rivaherrera, E.; Aviña Cervantes, JG.; Ledesma Orozco, S.; Alvarado Mendez, E.; Reyes Ballesteros, A. (2012). New prioritized value iteration for Markov decision processes. Artificial Intelligence Review. 37(2):157-167. doi:10.1007/s10462-011-9224-z

Please use this identifier to cite or link to this item: http://hdl.handle.net/10251/34793

Files in this item

Name: AI Rev_Garcia-Her ...

Size: 200.4Kb

Format: PDF

Description: Author's version.

Name: Mª GUADALUPE ...

Size: 409.1Kb

Format: PDF

Description: Publisher's version.

Item metadata

Title:

New prioritized value iteration for Markov decision processes

Authors:

García Hernández, Ma de Guadalupe; Ruiz Pinales, José; Onaindia de la Rivaherrera, Eva; Aviña Cervantes, J. Gabriel; Ledesma Orozco, Sergio; Alvarado Mendez, Edgar; Reyes Ballesteros, Alberto

UPV entity:

Universitat Politècnica de València. Departamento de Sistemas Informáticos y Computación - Departament de Sistemes Informàtics i Computació

Release date:

2012-02

Abstract:

The problem of solving large Markov decision processes accurately and quickly is challenging. Since the computational effort incurred is considerable, current research focuses on finding superior acceleration techniques. ...

Keywords:

Dijkstra's algorithm, Markov decision processes, Prioritized value iteration

Usage rights:

All rights reserved

Source:

Artificial Intelligence Review (ISSN: 0269-2821)

DOI:

10.1007/s10462-011-9224-z

Publisher:

Springer Verlag

Publisher's version:

http://link.springer.com/article/10.1007%2Fs10462-011-9224-z

Type:

Article

