Research community dynamics behind popular AI benchmarks

RiuNet: Institutional Repository of the Universidad Politécnica de Valencia

Martínez-Plumed, F.; Barredo, P.; Ó Héigeartaigh, S.; Hernández-Orallo, J. (2021). Research community dynamics behind popular AI benchmarks. Nature Machine Intelligence. 3(7):581-589. https://doi.org/10.1038/s42256-021-00339-6

Please use this identifier to cite or link to this item: http://hdl.handle.net/10251/182155

Item metadata

Title: Research community dynamics behind popular AI benchmarks
Authors: Martínez-Plumed, Fernando; Barredo, Pablo; Ó hÉigeartaigh, Seán; Hernández-Orallo, José
UPV unit: Universitat Politècnica de València. Departamento de Sistemas Informáticos y Computación - Departament de Sistemes Informàtics i Computació
Release date:
Abstract:
[EN] The widespread use of experimental benchmarks in AI research has created competition and collaboration dynamics that are still poorly understood. Here we provide an innovative methodology to explore these dynamics and ...
Usage rights: All rights reserved
Source:
Nature Machine Intelligence (eISSN: 2522-5839)
DOI: 10.1038/s42256-021-00339-6
Publisher:
Nature Publishing Group
Publisher's version: https://doi.org/10.1038/s42256-021-00339-6
Project codes:
info:eu-repo/grantAgreement/EC/H2020/952215/EU
info:eu-repo/grantAgreement/GVA//PROMETEO%2F2019%2F098//DEEPTRUST/
Acknowledgements:
F.M.-P. acknowledges funding from the AI-Watch project by DG CONNECT and DG JRC of the European Commission. J.H.-O. and S.Ó.h. were funded by the Future of Life Institute, FLI, under grant RFP2-152. J.H.-O. was supported ...
Type: Article

References

Fortunato, S. et al. Science of science. Science 359, eaao0185 (2018).

Wu, L., Wang, D. & Evans, J. A. Large teams develop and small teams disrupt science and technology. Nature 566, 378–382 (2019).

Frank, M. R., Wang, D., Cebrian, M. & Rahwan, I. The evolution of citation graphs in artificial intelligence research. Nat. Mach. Intell. 1, 79–85 (2019).

Martínez-Plumed, F. et al. Accounting for the neglected dimensions of AI progress. Preprint at https://arxiv.org/abs/1806.00610 (2018).

Perrault, R. et al. The AI Index 2019 Annual Report (AI Index Steering Committee, Human-Centered AI Institute, Stanford Univ. 2019); https://hai.stanford.edu/ai-index-2019

Clauset, A., Newman, M. E. J. & Moore, C. Finding community structure in very large networks. Phys. Rev. E 70, 066111 (2004).

Van Raan, A. The influence of international collaboration on the impact of research results: some simple mathematical considerations concerning the role of self-citations. Scientometrics 42, 423–428 (1998).

Deng, J. et al. ImageNet: a large-scale hierarchical image database. In 2009 IEEE Conference on Computer Vision and Pattern Recognition 248–255 (IEEE, 2009).

Rajpurkar, P., Zhang, J., Lopyrev, K. & Liang, P. SQuAD: 100,000+ questions for machine comprehension of text. In Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing 2383–2392 (Association for Computational Linguistics, 2016).

Bonferroni, C. Teoria statistica delle classi e calcolo delle probabilità. Pubblicazioni del R Istituto Superiore di Scienze Economiche e Commerciali di Firenze 8, 3–62 (1936).

Kwok, R. Junior AI researchers are in demand by universities and industry. Nature 568, 581–584 (2019).

Rhoades, S. A. The Herfindahl–Hirschman index. Fed. Res. Bull. 79, 188–189 (1993).

Cave, S. & Ó hÉigeartaigh, S. S. An AI race for strategic advantage: rhetoric and risks. In Proc. 2018 AAAI/ACM Conference on AI, Ethics, and Society 36–40 (Association for Computing Machinery, 2018).

Lee, K.-F. AI Superpowers: China, Silicon Valley, and the New World Order (Houghton Mifflin Harcourt, 2018).

Horowitz, M. C., Allen, G. C., Kania, E. B. & Scharre, P. Strategic Competition in an Era of Artificial Intelligence 8 (Center for New American Security, 2018).

Li, W. C., Nirei, M. & Yamana, K. Value of Data: There’s No Such Thing as a Free Lunch in the Digital Economy Working Paper (US Bureau of Economic Analysis, 2019).

Krizhevsky, A. Learning Multiple Layers of Features from Tiny Images. https://www.cs.toronto.edu/~kriz/learning-features-2009-TR.pdf (2009).

Hernández-Orallo, J. et al. A new AI evaluation cosmos: Ready to play the game? AI Magazine 38, 66–69 (2017).

Shoham, Y. Towards the AI index. AI Magazine 38, 71–77 (2017).

Niu, J., Tang, W., Xu, F., Zhou, X. & Song, Y. Global research on AI from 1990–2014: spatially-explicit bibliometric analysis. ISPRS Int. J. Geoinf. 5, 66 (2016).

Juan Mateos-Garcia, K. S., Klinger, J. & Winch, R. A Semantic Analysis of the Recent Evolution of AI Research. https://www.nesta.org.uk/report/semantic-analysis-recent-evolution-ai-research/ (NESTA, 2019).

Gao, F. et al. Bibliometric analysis on tendency and topics of artificial intelligence over last decade. Microsyst. Technol. 1–13 (2019).

Tran, B. X. et al. Global evolution of research in artificial intelligence in health and medicine: a bibliometric study. J. Clin. Med. 8, 360 (2019).

Tang, X., Li, X., Ding, Y., Song, M. & Bu, Y. The pace of artificial intelligence innovations: speed, talent, and trial-and-error. J. Inf. 14, 101094 (2020).

Qian, Y., Liu, Y. & Sheng, Q. Z. Understanding hierarchical structural evolution in a scientific discipline: a case study of artificial intelligence. J. Inf. 14, 101047 (2020).

Serenko, A. The development of an AI journal ranking based on the revealed preference approach. J. Inf. 4, 447–459 (2010).

Campbell, M., Hoane Jr, A. J. & Hsu, F.-h. Deep Blue. Artif. Intell. 134, 57–83 (2002).

Ferrucci, D. A. Introduction to ‘This is Watson’. IBM J. Res. Dev. 56, 235–249 (2012).

Mnih, V. et al. Human-level control through deep reinforcement learning. Nature 518, 529–533 (2015).

Silver, D. et al. Mastering the game of Go with deep neural networks and tree search. Nature 529, 484–489 (2016).

Schlangen, D. Language tasks and language games: on methodology in current natural language processing research. Preprint at https://arxiv.org/abs/1908.10747 (2019).

Zellers, R., Holtzman, A., Bisk, Y., Farhadi, A. & Choi, Y. HellaSwag: can a machine really finish your sentence? In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics 4791–4800 (Association for Computational Linguistics, 2019).

Lei, Y. & Liu, Z. The development of artificial intelligence: a bibliometric analysis, 2007–2016. J. Phys. Conf. Ser. 1168, 022027 (2019).

Martínez-Plumed, F. et al. The facets of artificial intelligence: a framework to track the evolution of AI. In Proc. Twenty-Seventh International Joint Conference on Artificial Intelligence 5180–5187 (International Joint Conferences on Artificial Intelligence Organization, 2018).

Bhattacharya, J. & Packalen, M. Stagnation and Scientific Incentives Technical Report (National Bureau of Economic Research, 2020).

Houghton, B. et al. Guaranteeing reproducibility in deep learning competitions. Preprint at https://arxiv.org/abs/2005.06041 (2020).

Lucic, M., Kurach, K., Michalski, M., Gelly, S. & Bousquet, O. Are GANs created equal? A large-scale study. Adv. Neural Inf. Process. Syst. 700–709 (2018).

Hernandez, D. & Brown, T. B. Measuring the algorithmic efficiency of neural networks. Preprint at https://arxiv.org/abs/2005.04305 (2020).

Mattson, P. et al. MLPerf training benchmark. Preprint at https://arxiv.org/abs/1910.01500 (2019).

Martínez-Plumed, F. & Hernández-Orallo, J. Dual indicators to analyse AI benchmarks: difficulty, discrimination, ability, and generality. IEEE Trans. Games 12, 121–131 (2020).

Martínez-Plumed, F., Barredo, P., Ó hÉigeartaigh, S. & Hernández-Orallo, J. AI research dynamics. GitHub https://github.com/nandomp/AI_Research_Dynamics (2021).

Kuehne, H., Jhuang, H., Garrote, E., Poggio, T. & Serre, T. HMDB: a large video database for human motion recognition. In 2011 International Conference on Computer Vision 2556–2563 (IEEE, 2011).

Soomro, K., Zamir, A. R. & Shah, M. UCF101: a dataset of 101 human actions classes from videos in the wild. Preprint at https://arxiv.org/abs/1212.0402 (2012).

Bellemare, M. G., Naddaf, Y., Veness, J. & Bowling, M. The arcade learning environment: an evaluation platform for general agents. J. Artif. Intell. Res. 47, 253–279 (2013).

Timofte, R., De Smet, V. & Van Gool, L. Anchored neighborhood regression for fast example-based super-resolution. In Proc. IEEE International Conference on Computer Vision 1920–1927 (IEEE, 2013).

Hutter, M. Human knowledge compression contest. Hutter Prize http://prize.hutter1.net/ (2006).

Mikolov, T., Deoras, A., Kombrink, S., Burget, L. & Černocky, J. Empirical evaluation and combination of advanced language modeling techniques. In Twelfth Annual Conference of the International Speech Communication Association 605–608 (2011).

Dettmers, T., Minervini, P., Stenetorp, P. & Riedel, S. Convolutional 2D knowledge graph embeddings. In Proc. AAAI Conference on Artificial Intelligence Vol. 32 (2018).

Bojar, O. et al. Findings of the 2014 workshop on statistical machine translation. In Proc. Ninth Workshop on Statistical Machine Translation 12–58 (Association for Computational Linguistics, 2014); http://www.aclweb.org/anthology/W/W14/W14-3302

Sang, E. F. & De Meulder, F. Introduction to the CoNLL-2003 shared task: language-independent named entity recognition. In Proceedings of the Seventh Conference on Natural Language Learning at HLT-NAACL 2003 142–147 (2003).

Weischedel, R. et al. Ontonotes Release 5.0 ldc2013t19 23 (Linguistic Data Consortium, 2013).

Lin, T.-Y. et al. Microsoft COCO: common objects in context. In European Conference on Computer Vision 740–755 (Springer, 2014).

Andriluka, M., Pishchulin, L., Gehler, P. & Schiele, B. 2D human pose estimation: new benchmark and state of the art analysis. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 3686–3693 (IEEE, 2014).

Yang, Y., Yih, W.-t. & Meek, C. WikiQA: a challenge dataset for open-domain question answering. In Proc. 2015 Conference on Empirical Methods in Natural Language Processing 2013–2018 (Association for Computational Linguistics, 2015).

Cordts, M. et al. The cityscapes dataset for semantic urban scene understanding. In Proc. IEEE Conference on Computer Vision and Pattern Recognition 3213–3223 (IEEE, 2016).

Everingham, M. et al. The Pascal visual object classes challenge: a retrospective. Int. J. Comput. Vis. 111, 98–136 (2015).

Maas, A. L. et al. Learning word vectors for sentiment analysis. In Proc. 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies Vol. 1, 142–150 (Association for Computational Linguistics, 2011).

Socher, R. et al. Recursive deep models for semantic compositionality over a sentiment treebank. In Proc. 2013 Conference on Empirical Methods in Natural Language Processing 1631–1642 (Association for Computational Linguistics, 2013).

Panayotov, V., Chen, G., Povey, D. & Khudanpur, S. Librispeech: an ASR corpus based on public domain audio books. In 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 5206–5210 (IEEE, 2015).
