- -

On environment difficulty and discriminating power

RiuNet: Repositorio Institucional de la Universidad Politécnica de Valencia

Compartir/Enviar a

Citas

Estadísticas

  • Estadisticas de Uso

On environment difficulty and discriminating power

Mostrar el registro sencillo del ítem

Ficheros en el ítem

dc.contributor.author José Hernández-Orallo es_ES
dc.date.accessioned 2016-05-26T11:46:55Z
dc.date.available 2016-05-26T11:46:55Z
dc.date.issued 2015-05
dc.identifier.issn 1387-2532
dc.identifier.uri http://hdl.handle.net/10251/64792
dc.description The final publication is available at Springer via http://dx.doi.org/10.1007/s10458-014-9257-1 es_ES
dc.description.abstract This paper presents a way to estimate the difficulty and discriminating power of any task instance. We focus on a very general setting for tasks: interactive (possibly multiagent) environments where an agent acts upon observations and rewards. Instead of analysing the complexity of the environment, the state space or the actions that are performed by the agent, we analyse the performance of a population of agent policies against the task, leading to a distribution that is examined in terms of policy complexity. This distribution is then sliced by the algorithmic complexity of the policy and analysed through several diagrams and indicators. The notion of environment response curve is also introduced, by inverting the performance results into an ability scale. We apply all these concepts, diagrams and indicators to two illustrative problems: a class of agent-populated elementary cellular automata, showing how the difficulty and discriminating power may vary for several environments, and a multiagent system, where agents can become predators or preys, and may need to coordinate. Finally, we discuss how these tools can be applied to characterise (interactive) tasks and (multi-agent) environments. These characterisations can then be used to get more insight about agent performance and to facilitate the development of adaptive tests for the evaluation of agent abilities. es_ES
dc.description.sponsorship I thank the reviewers for their comments, especially those aiming at a clearer connection with the field of multi-agent systems and the suggestion of better approximations for the calculation of the response curves. The implementation of the elementary cellular automata used in the environments is based on the library 'CellularAutomaton' by John Hughes for R [58]. I am grateful to Fernando Soler-Toscano for letting me know about their work [65] on the complexity of 2D objects generated by elementary cellular automata. I would also like to thank David L. Dowe for his comments on a previous version of this paper. This work was supported by the MEC/MINECO projects CONSOLIDER-INGENIO CSD2007-00022 and TIN 2010-21062-C02-02, GVA project PROMETEO/2008/051, the COST - European Cooperation in the field of Scientific and Technical Research IC0801 AT, and the REFRAME project, granted by the European Coordinated Research on Long-term Challenges in Information and Communication Sciences & Technologies ERA-Net (CHIST-ERA), and funded by the Ministerio de Economia y Competitividad in Spain (PCIN-2013-037). en_EN
dc.language Inglés es_ES
dc.publisher Springer Verlag (Germany) es_ES
dc.relation European Coordinated Research on Long-term Challenges in Information and Communication Sciences & Technologies ERA-Net (CHIST-ERA) es_ES
dc.relation.ispartof Autonomous Agents and Multi-Agent Systems es_ES
dc.rights Reserva de todos los derechos es_ES
dc.subject Environment difficulty es_ES
dc.subject Agent evaluation es_ES
dc.subject Discriminating power es_ES
dc.subject Agent policy es_ES
dc.subject Algorithmic information theory es_ES
dc.subject Universal psychometrics es_ES
dc.subject Reinforcement learning es_ES
dc.subject Elementary cellular automata es_ES
dc.subject.classification LENGUAJES Y SISTEMAS INFORMATICOS es_ES
dc.title On environment difficulty and discriminating power es_ES
dc.type Artículo es_ES
dc.identifier.doi 10.1007/s10458-014-9257-1
dc.relation.projectID info:eu-repo/grantAgreement/MEC//CSD2007-00022/ES/Agreement Technologies/ es_ES
dc.relation.projectID info:eu-repo/grantAgreement/COST//IC0801/EU/Agreement Technologies/ es_ES
dc.relation.projectID info:eu-repo/grantAgreement/MICINN//TIN2010-21062-C02-02/ES/SWEETLOGICS-UPV/ es_ES
dc.relation.projectID info:eu-repo/grantAgreement/Generalitat Valenciana//PROMETEO08%2F2008%2F051/ES/Advances on Agreement Technologies for Computational Entities (atforce)/ es_ES
dc.relation.projectID info:eu-repo/grantAgreement/MINECO//PCIN-2013-037/ES/RETHINKING THE ESSENCE, FLEXIBILITY AND REUSABILITY OF ADVANCED MODEL EXPLOITATION/ es_ES
dc.rights.accessRights Abierto es_ES
dc.contributor.affiliation Universitat Politècnica de València. Departamento de Sistemas Informáticos y Computación - Departament de Sistemes Informàtics i Computació es_ES
dc.description.bibliographicCitation José Hernández-Orallo (2015). On environment difficulty and discriminating power. Autonomous Agents and Multi-Agent Systems. 29(3):402-454. https://doi.org/10.1007/s10458-014-9257-1 es_ES
dc.description.accrualMethod S es_ES
dc.relation.publisherversion http://link.springer.com/article/10.1007/s10458-014-9257-1 es_ES
dc.description.upvformatpinicio 402 es_ES
dc.description.upvformatpfin 454 es_ES
dc.type.version info:eu-repo/semantics/publishedVersion es_ES
dc.description.volume 29 es_ES
dc.description.issue 3 es_ES
dc.relation.senia 302234 es_ES
dc.identifier.eissn 1573-7454
dc.contributor.funder Generalitat Valenciana es_ES
dc.contributor.funder European Cooperation in Science and Technology es_ES
dc.contributor.funder Ministerio de Educación y Ciencia es_ES
dc.contributor.funder Ministerio de Ciencia e Innovación es_ES
dc.description.references Anderson, J., Baltes, J., & Cheng, C. T. (2011). Robotics competitions as benchmarks for ai research. The Knowledge Engineering Review, 26(01), 11–17. es_ES
dc.description.references Andre, D., & Russell, S. J. (2002). State abstraction for programmable reinforcement learning agents. In Proceedings of the National Conference on Artificial Intelligence (pp. 119–125). Menlo Park, CA; Cambridge, MA; London; AAAI Press; MIT Press; 1999. es_ES
dc.description.references Antunes, L., Fortnow, L., van Melkebeek, D., & Vinodchandran, N. V. (2006). Computational depth: Concept and applications. Theoretical Computer Science, 354(3), 391–404. Foundations of Computation Theory (FCT 2003), 14th Symposium on Fundamentals of Computation Theory 2003. es_ES
dc.description.references Arai, K., Kaminka, G. A., Frank, I., & Tanaka-Ishii, K. (2003). Performance competitions as research infrastructure: Large scale comparative studies of multi-agent teams. Autonomous Agents and Multi-Agent Systems, 7(1–2), 121–144. es_ES
dc.description.references Ashcraft, M. H., Donley, R. D., Halas, M. A., & Vakali, M. (1992). Chapter 8 working memory, automaticity, and problem difficulty. In Jamie I.D. Campbell (Ed.), The nature and origins of mathematical skills, volume 91 of advances in psychology (pp. 301–329). North-Holland. es_ES
dc.description.references Ay, N., Müller, M., & Szkola, A. (2010). Effective complexity and its relation to logical depth. IEEE Transactions on Information Theory, 56(9), 4593–4607. es_ES
dc.description.references Barch, D. M., Braver, T. S., Nystrom, L. E., Forman, S. D., Noll, D. C., & Cohen, J. D. (1997). Dissociating working memory from task difficulty in human prefrontal cortex. Neuropsychologia, 35(10), 1373–1380. es_ES
dc.description.references Bordini, R. H., Hübner, J. F., & Wooldridge, M. (2007). Programming multi-agent systems in AgentSpeak using Jason. London: Wiley. com. es_ES
dc.description.references Boutilier, C., Reiter, R., Soutchanski, M., Thrun, S. et al. (2000). Decision-theoretic, high-level agent programming in the situation calculus. In Proceedings of the National Conference on Artificial Intelligence (pp. 355–362). Menlo Park, CA; Cambridge, MA; London; AAAI Press; MIT Press; 1999. es_ES
dc.description.references Busoniu, L., Babuska, R., & De Schutter, B. (2008). A comprehensive survey of multiagent reinforcement learning. IEEE Transactions on Systems, Man, and Cybernetics, Part C: Applications and Reviews, 38(2), 156–172. es_ES
dc.description.references Chaitin, G. J. (1977). Algorithmic information theory. IBM Journal of Research and Development, 21, 350–359. es_ES
dc.description.references Chedid, F. B. (2010). Sophistication and logical depth revisited. In 2010 IEEE/ACS International Conference on Computer Systems and Applications (AICCSA) (pp. 1–4). IEEE. es_ES
dc.description.references Cheeseman, P., Kanefsky, B. & Taylor, W. M. (1991). Where the really hard problems are. In Proceedings of IJCAI-1991 (pp. 331–337). es_ES
dc.description.references Dastani, M. (2008). 2APL: A practical agent programming language. Autonomous Agents and Multi-agent Systems, 16(3), 214–248. es_ES
dc.description.references Delahaye, J. P. & Zenil, H. (2011). Numerical evaluation of algorithmic complexity for short strings: A glance into the innermost structure of randomness. Applied Mathematics and Computation, 219(1), 63–77 es_ES
dc.description.references Dowe, D. L. (2008). Foreword re C. S. Wallace. Computer Journal, 51(5), 523–560. Christopher Stewart WALLACE (1933–2004) memorial special issue. es_ES
dc.description.references Dowe, D. L., & Hernández-Orallo, J. (2012). IQ tests are not for machines, yet. Intelligence, 40(2), 77–81. es_ES
dc.description.references Du, D. Z., & Ko, K. I. (2011). Theory of computational complexity (Vol. 58). London: Wiley-Interscience. es_ES
dc.description.references Elo, A. E. (1978). The rating of chessplayers, past and present (Vol. 3). London: Batsford. es_ES
dc.description.references Embretson, S. E., & Reise, S. P. (2000). Item response theory for psychologists. London: Lawrence Erlbaum. es_ES
dc.description.references Fatès, N. & Chevrier, V. (2010). How important are updating schemes in multi-agent systems? an illustration on a multi-turmite model. In Proceedings of the 9th International Conference on Autonomous Agents and Multiagent Systems: volume 1-Volume 1 (pp. 533–540). International Foundation for Autonomous Agents and Multiagent Systems. es_ES
dc.description.references Ferber, J. & Müller, J. P. (1996). Influences and reaction: A model of situated multiagent systems. In Proceedings of Second International Conference on Multi-Agent Systems (ICMAS-96) (pp. 72–79). es_ES
dc.description.references Ferrando, P. J. (2009). Difficulty, discrimination, and information indices in the linear factor analysis model for continuous item responses. Applied Psychological Measurement, 33(1), 9–24. es_ES
dc.description.references Ferrando, P. J. (2012). Assessing the discriminating power of item and test scores in the linear factor-analysis model. Psicológica, 33, 111–139. es_ES
dc.description.references Gent, I. P., & Walsh, T. (1994). Easy problems are sometimes hard. Artificial Intelligence, 70(1), 335–345. es_ES
dc.description.references Gershenson, C. & Fernandez, N. (2012). Complexity and information: Measuring emergence, self-organization, and homeostasis at multiple scales. Complexity, 18(2), 29–44. es_ES
dc.description.references Gruner, S. (2010). Mobile agent systems and cellular automata. Autonomous Agents and Multi-agent Systems, 20(2), 198–233. es_ES
dc.description.references Hardman, D. K., & Payne, S. J. (1995). Problem difficulty and response format in syllogistic reasoning. The Quarterly Journal of Experimental Psychology, 48(4), 945–975. es_ES
dc.description.references He, J., Reeves, C., Witt, C., & Yao, X. (2007). A note on problem difficulty measures in black-box optimization: Classification, realizations and predictability. Evolutionary Computation, 15(4), 435–443. es_ES
dc.description.references Hernández-Orallo, J. (2000). Beyond the turing test. Journal of Logic Language & Information, 9(4), 447–466. es_ES
dc.description.references Hernández-Orallo, J. (2000). On the computational measurement of intelligence factors. In A. Meystel (Ed.), Performance metrics for intelligent systems workshop (pp. 1–8). Gaithersburg, MD: National Institute of Standards and Technology. es_ES
dc.description.references Hernández-Orallo, J. (2000). Thesis: Computational measures of information gain and reinforcement in inference processes. AI Communications, 13(1), 49–50. es_ES
dc.description.references Hernández-Orallo, J. (2010). A (hopefully) non-biased universal environment class for measuring intelligence of biological and artificial systems. In M. Hutter et al. (Ed.), 3rd International Conference on Artificial General Intelligence (pp. 182–183). Atlantis Press Extended report at http://users.dsic.upv.es/proy/anynt/unbiased.pdf . es_ES
dc.description.references Hernández-Orallo, J., & Dowe, D. L. (2010). Measuring universal intelligence: Towards an anytime intelligence test. Artificial Intelligence, 174(18), 1508–1539. es_ES
dc.description.references Hernández-Orallo, J., Dowe, D. L., España-Cubillo, S., Hernández-Lloreda, M. V., & Insa-Cabrera, J. (2011). On more realistic environment distributions for defining, evaluating and developing intelligence. In J. Schmidhuber, K. R. Thórisson, & M. Looks (Eds.), LNAI series on artificial general intelligence 2011 (Vol. 6830, pp. 82–91). Berlin: Springer. es_ES
dc.description.references Hernández-Orallo, J., Dowe, D. L., & Hernández-Lloreda, M. V. (2014). Universal psychometrics: Measuring cognitive abilities in the machine kingdom. Cognitive Systems Research, 27, 50–74. es_ES
dc.description.references Hernández-Orallo, J., Insa, J., Dowe, D. L. & Hibbard, B. (2012). Turing tests with turing machines. In A. Voronkov (Ed.), The Alan Turing Centenary Conference, Turing-100, Manchester, 2012, volume 10 of EPiC Series (pp. 140–156). es_ES
dc.description.references Hernández-Orallo, J. & Minaya-Collado, N. (1998). A formal definition of intelligence based on an intensional variant of Kolmogorov complexity. In Proceedings of International Symposium of Engineering of Intelligent Systems (EIS’98) (pp. 146–163). ICSC Press. es_ES
dc.description.references Hibbard, B. (2009). Bias and no free lunch in formal measures of intelligence. Journal of Artificial General Intelligence, 1(1), 54–61. es_ES
dc.description.references Hoos, H. H. (1999). Sat-encodings, search space structure, and local search performance. In 1999 International Joint Conference on Artificial Intelligence (Vol. 16, pp. 296–303). es_ES
dc.description.references Insa-Cabrera, J., Benacloch-Ayuso, J. L., & Hernández-Orallo, J. (2012). On measuring social intelligence: Experiments on competition and cooperation. In J. Bach, B. Goertzel, & M. Iklé (Eds.), AGI, volume 7716 of lecture notes in computer science (pp. 126–135). Berlin: Springer. es_ES
dc.description.references Insa-Cabrera, J., Dowe, D. L., España-Cubillo, S., Hernández-Lloreda, M. V., & Hernández-Orallo, J. (2011). Comparing humans and AI agents. In J. Schmidhuber, K. R. Thórisson, & M. Looks (Eds.), LNAI series on artificial general intelligence 2011 (Vol. 6830, pp. 122–132). Berlin: Springer. es_ES
dc.description.references Knuth, D. E. (1973). Sorting and searching, volume 3 of the art of computer programming. Reading, MA: Addison-Wesley. es_ES
dc.description.references Kotovsky, K., & Simon, H. A. (1990). What makes some problems really hard: Explorations in the problem space of difficulty. Cognitive Psychology, 22(2), 143–183. es_ES
dc.description.references Legg, S. (2008). Machine super intelligence. PhD thesis, Department of Informatics, University of Lugano, June 2008. es_ES
dc.description.references Legg, S., & Hutter, M. (2007). Universal intelligence: A definition of machine intelligence. Minds and Machines, 17(4), 391–444. es_ES
dc.description.references Leonetti, M. & Iocchi, L. (2010). Improving the performance of complex agent plans through reinforcement learning. In Proceedings of the 2010 International Conference on Autonomous Agents and Multiagent Systems (Vol. 1, pp. 723–730). International Foundation for Autonomous Agents and Multiagent Systems. es_ES
dc.description.references Levin, L. A. (1973). Universal sequential search problems. Problems of Information Transmission, 9(3), 265–266. es_ES
dc.description.references Levin, L. A. (1986). Average case complete problems. SIAM Journal on Computing, 15, 285. es_ES
dc.description.references Li, M., & Vitányi, P. (2008). An introduction to Kolmogorov complexity and its applications (3rd ed.). Berlin: Springer. es_ES
dc.description.references Low, C. K., Chen, T. Y., & Rónnquist, R. (1999). Automated test case generation for bdi agents. Autonomous Agents and Multi-agent Systems, 2(4), 311–332. es_ES
dc.description.references Madden, M. G., & Howley, T. (2004). Transfer of experience between reinforcement learning environments with progressive difficulty. Artificial Intelligence Review, 21(3), 375–398. es_ES
dc.description.references Mellenbergh, G. J. (1994). Generalized linear item response theory. Psychological Bulletin, 115(2), 300. es_ES
dc.description.references Michel, F. (2004). Formalisme, outils et éléments méthodologiques pour la modélisation et la simulation multi-agents. PhD thesis, Université des sciences et techniques du Languedoc, Montpellier. es_ES
dc.description.references Miller, G. A. (1956). The magical number seven, plus or minus two: Some limits on our capacity for processing information. Psychological Review, 63(2), 81. es_ES
dc.description.references Orponen, P., Ko, K. I., Schöning, U., & Watanabe, O. (1994). Instance complexity. Journal of the ACM (JACM), 41(1), 96–121. es_ES
dc.description.references Simon, H. A., & Kotovsky, K. (1963). Human acquisition of concepts for sequential patterns. Psychological Review, 70(6), 534. es_ES
dc.description.references Team, R., et al. (2013). R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing. es_ES
dc.description.references Whiteson, S., Tanner, B., & White, A. (2010). The reinforcement learning competitions. The AI Magazine, 31(2), 81–94. es_ES
dc.description.references Wiering, M., & van Otterlo, M. (Eds.). (2012). Reinforcement learning: State-of-the-art. Berlin: Springer. es_ES
dc.description.references Wolfram, S. (2002). A new kind of science. Champaign, IL: Wolfram Media. es_ES
dc.description.references Zatuchna, Z., & Bagnall, A. (2009). Learning mazes with aliasing states: An LCS algorithm with associative perception. Adaptive Behavior, 17(1), 28–57. es_ES
dc.description.references Zenil, H. (2010). Compression-based investigation of the dynamical properties of cellular automata and other systems. Complex Systems, 19(1), 1–28. es_ES
dc.description.references Zenil, H. (2011). Une approche expérimentale à la théorie algorithmique de la complexité. PhD thesis, Dissertation in fulfilment of the degree of Doctor in Computer Science, Université de Lille. es_ES
dc.description.references Zenil, H., Soler-Toscano, F., Delahaye, J. P. & Gauvrit, N. (2012). Two-dimensional kolmogorov complexity and validation of the coding theorem method by compressibility. arXiv, preprint arXiv:1212.6745 . es_ES


Este ítem aparece en la(s) siguiente(s) colección(ones)

Mostrar el registro sencillo del ítem