
Google Scholar, Microsoft Academic, Scopus, Dimensions, Web of Science, and OpenCitations' COCI: a multidisciplinary comparison of coverage via citations

RiuNet: Institutional Repository of the Universidad Politécnica de Valencia



dc.contributor.author Martín-Martín, Alberto es_ES
dc.contributor.author Thelwall, Mike es_ES
dc.contributor.author Orduña-Malea, Enrique es_ES
dc.contributor.author Delgado López-Cózar, Emilio es_ES
dc.date.accessioned 2023-04-19T18:01:20Z
dc.date.available 2023-04-19T18:01:20Z
dc.date.issued 2021-01 es_ES
dc.identifier.issn 0138-9130 es_ES
dc.identifier.uri http://hdl.handle.net/10251/192850
dc.description.abstract [EN] New sources of citation data have recently become available, such as Microsoft Academic, Dimensions, and the OpenCitations Index of CrossRef open DOI-to-DOI citations (COCI). Although these have been compared to the Web of Science Core Collection (WoS), Scopus, or Google Scholar, there is no systematic evidence of their differences across subject categories. In response, this paper investigates 3,073,351 citations found by these six data sources to 2,515 English-language highly-cited documents published in 2006 from 252 subject categories, expanding and updating the largest previous study. Google Scholar found 88% of all citations, many of which were not found by the other sources, and nearly all citations found by the remaining sources (89-94%). A similar pattern held within most subject categories. Microsoft Academic is the second largest overall (60% of all citations), including 82% of Scopus citations and 86% of WoS citations. In most categories, Microsoft Academic found more citations than Scopus and WoS (182 and 223 subject categories, respectively), but had coverage gaps in some areas, such as Physics and some Humanities categories. After Scopus, Dimensions is fourth largest (54% of all citations), including 84% of Scopus citations and 88% of WoS citations. It found more citations than Scopus in 36 categories, more than WoS in 185, and displays some coverage gaps, especially in the Humanities. Following WoS, COCI is the smallest, with 28% of all citations. Google Scholar is still the most comprehensive source. In many subject categories Microsoft Academic and Dimensions are good alternatives to Scopus and WoS in terms of coverage. es_ES
dc.description.sponsorship We thank Medialab UGR (Universidad de Granada) for providing funding to cover the cost of hosting the interactive web application created to explore the data used in this study. We thank Digital Science for providing free access to the Dimensions API. We thank Jing Xuan Xie for translating the abstract into Chinese. We thank Asura Enkhbayar for suggesting the use of an upset plot in Fig. 2. Lastly, we thank two anonymous reviewers for their thoughtful comments, which have helped improve the manuscript. es_ES
dc.language English es_ES
dc.publisher Springer-Verlag es_ES
dc.relation.ispartof Scientometrics es_ES
dc.rights All rights reserved es_ES
dc.subject Google Scholar es_ES
dc.subject Microsoft Academic es_ES
dc.subject Scopus es_ES
dc.subject Dimensions es_ES
dc.subject Web of Science es_ES
dc.subject OpenCitations es_ES
dc.subject COCI es_ES
dc.subject CrossRef es_ES
dc.subject Coverage es_ES
dc.subject Citations es_ES
dc.subject Bibliometrics es_ES
dc.subject Citation analysis es_ES
dc.subject Bibliographic databases es_ES
dc.subject Literature reviews es_ES
dc.subject.classification LIBRARY AND INFORMATION SCIENCE es_ES
dc.title Google Scholar, Microsoft Academic, Scopus, Dimensions, Web of Science, and OpenCitations' COCI: a multidisciplinary comparison of coverage via citations es_ES
dc.type Article es_ES
dc.identifier.doi 10.1007/s11192-020-03690-4 es_ES
dc.rights.accessRights Open es_ES
dc.contributor.affiliation Universitat Politècnica de València. Facultad de Bellas Artes - Facultat de Belles Arts es_ES
dc.description.bibliographicCitation Martín-Martín, A.; Thelwall, M.; Orduña-Malea, E.; Delgado López-Cózar, E. (2021). Google Scholar, Microsoft Academic, Scopus, Dimensions, Web of Science, and OpenCitations' COCI: a multidisciplinary comparison of coverage via citations. Scientometrics. 126(1):871-906. https://doi.org/10.1007/s11192-020-03690-4 es_ES
dc.description.accrualMethod S es_ES
dc.relation.publisherversion https://doi.org/10.1007/s11192-020-03690-4 es_ES
dc.description.upvformatpinicio 871 es_ES
dc.description.upvformatpfin 906 es_ES
dc.type.version info:eu-repo/semantics/publishedVersion es_ES
dc.description.volume 126 es_ES
dc.description.issue 1 es_ES
dc.relation.pasarela S\426132 es_ES
dc.contributor.funder Universidad de Granada es_ES
dc.description.references Baas, J., Schotten, M., Plume, A., Côté, G., & Karimi, R. (2020). Scopus as a curated, high-quality bibliometric data source for academic research in quantitative science studies. Quantitative Science Studies, 1(1), 377–386. https://doi.org/10.1162/qss_a_00019. es_ES
dc.description.references Beel, J., & Gipp, B. (2009a). Google Scholar’s ranking algorithm: The impact of articles’ age (an empirical study). Sixth International Conference on Information Technology: New Generations, 2009, 160–164. https://doi.org/10.1109/ITNG.2009.317. es_ES
dc.description.references Beel, J., & Gipp, B. (2009b). Google Scholar’s ranking algorithm: The impact of citation counts (An empirical study). Third International Conference on Research Challenges in Information Science, 2009, 439–446. https://doi.org/10.1109/RCIS.2009.5089308. es_ES
dc.description.references Beel, J., & Gipp, B. (2009c). Google Scholar’s ranking algorithm: An introductory overview. In Proceedings of the 12th International Conference on Scientometrics and Informetrics (ISSI’09) (pp. 230–241). http://www.issi-society.org/proceedings/issi_2009/ISSI2009-proc-vol1_Aug2009_batch2-paper-1.pdf es_ES
dc.description.references Birkle, C., Pendlebury, D. A., Schnell, J., & Adams, J. (2020). Web of Science as a data source for research on scientific and scholarly activity. Quantitative Science Studies, 1(1), 363–376. https://doi.org/10.1162/qss_a_00018. es_ES
dc.description.references Chapman, K., & Ellinger, A. E. (2019). An evaluation of Web of Science, Scopus and Google Scholar citations in operations management. The International Journal of Logistics Management, 30(4), 1039–1053. https://doi.org/10.1108/IJLM-04-2019-0110. es_ES
dc.description.references Damerau, F. J. (1964). A technique for computer detection and correction of spelling errors. Communications of the ACM, 7(3), 171–176. https://doi.org/10.1145/363958.363994. es_ES
dc.description.references Delgado López-Cózar, E., & Martín-Martín, A. (2018). Apagón digital de la producción científica española en Google Scholar. Anuario ThinkEPI, 12, 265–276. https://doi.org/10.3145/thinkepi.2018.40. es_ES
dc.description.references Delgado López-Cózar, E., Orduna-Malea, E., & Martín-Martín, A. (2019). Google Scholar as a data source for research assessment. In W. Glaenzel, H. Moed, U. Schmoch, & M. Thelwall (Eds.), Springer handbook of science and technology indicators. Berlin: Springer. es_ES
dc.description.references Dowle, M., Srinivasan, A., Gorecki, J., Chirico, M., Stetsenko, P., Short, T., Lianoglou, S., Antonyan, E., Bonsch, M., & Parsonage, H. (2018). data.table: Extension of ‘data.frame’ (1.11.4). es_ES
dc.description.references Else, H. (2018, April 11). How I scraped data from Google Scholar. Nature. https://doi.org/10.1038/d41586-018-04190-5 es_ES
dc.description.references Forveille, T. (2019). A&A ranking by Google. Astronomy & Astrophysics, 628, E1. https://doi.org/10.1051/0004-6361/201936429. es_ES
dc.description.references Fraser, N., Brierley, L., Dey, G., Polka, J. K., Pálfy, M., & Coates, J. A. (2020). Preprinting a pandemic: The role of preprints in the COVID-19 pandemic. BioRxiv, 2020.05.22.111294. https://doi.org/10.1101/2020.05.22.111294 es_ES
dc.description.references Gusenbauer, M. (2018). Google Scholar to overshadow them all? Comparing the sizes of 12 academic search engines and bibliographic databases. Scientometrics. https://doi.org/10.1007/s11192-018-2958-5. es_ES
dc.description.references Gusenbauer, M., & Haddaway, N. R. (2020). Which academic search systems are suitable for systematic reviews or meta-analyses? Evaluating retrieval qualities of Google Scholar, PubMed, and 26 other resources. Research Synthesis Methods, 11(2), 181–217. https://doi.org/10.1002/jrsm.1378. es_ES
dc.description.references Haddaway, N., & Gusenbauer, M. (2020, February 3). A broken system: Why literature searching needs a FAIR revolution. Impact of Social Sciences. https://blogs.lse.ac.uk/impactofsocialsciences/2020/02/03/a-broken-system-why-literature-searching-needs-a-fair-revolution/. es_ES
dc.description.references Halevi, G., Moed, H., & Bar-Ilan, J. (2017). Suitability of Google Scholar as a source of scientific information and as a source of data for scientific evaluation—Review of the Literature. Journal of Informetrics, 11(3), 823–834. https://doi.org/10.1016/J.JOI.2017.06.005. es_ES
dc.description.references Harzing, A. W. (2016). Microsoft Academic (Search): A Phoenix arisen from the ashes? Scientometrics, 108(3), 1637–1647. https://doi.org/10.1007/s11192-016-2026-y es_ES
dc.description.references Harzing, A.-W. (2016). Sacrifice a little accuracy for a lot more comprehensive coverage. Harzing.Com. https://harzing.com/blog/2016/08/sacrifice-a-little-accuracy-for-a-lot-more-comprehensive-coverage es_ES
dc.description.references Harzing, A. W. (2019). Two new kids on the block: How do Crossref and Dimensions compare with Google Scholar, Microsoft Academic, Scopus and the Web of Science? Scientometrics, 120(1), 341–349. https://doi.org/10.1007/s11192-019-03114-y es_ES
dc.description.references Harzing, A.-W., & Alakangas, S. (2016). Google Scholar, Scopus and the Web of Science: A longitudinal and cross-disciplinary comparison. Scientometrics, 106(2), 787–804. https://doi.org/10.1007/s11192-015-1798-9. es_ES
dc.description.references Harzing, A. W., & Alakangas, S. (2017a). Microsoft Academic: Is the phoenix getting wings? Scientometrics, 110(1), 371–383. https://doi.org/10.1007/s11192-016-2185-x es_ES
dc.description.references Harzing, A. W., & Alakangas, S. (2017b). Microsoft Academic is one year old: The Phoenix is ready to leave the nest. Scientometrics, 112(3), 1887–1894. https://doi.org/10.1007/s11192-017-2454-3 es_ES
dc.description.references Haunschild, R., Hug, S. E., Brändle, M. P., & Bornmann, L. (2018). The number of linked references of publications in Microsoft Academic in comparison with the Web of Science. Scientometrics, 114(1), 367–370. https://doi.org/10.1007/s11192-017-2567-8 es_ES
dc.description.references Heibi, I., Peroni, S., & Shotton, D. (2019). Software review: COCI, the OpenCitations Index of Crossref open DOI-to-DOI citations. Scientometrics. https://doi.org/10.1007/s11192-019-03217-6. es_ES
dc.description.references Hendricks, G., Tkaczyk, D., Lin, J., & Feeney, P. (2020). Crossref: The sustainable source of community-owned scholarly metadata. Quantitative Science Studies, 1(1), 414–427. https://doi.org/10.1162/qss_a_00022. es_ES
dc.description.references Herzog, C., Hook, D., & Konkiel, S. (2020). Dimensions: Bringing down barriers between scientometricians and data. Quantitative Science Studies, 1(1), 387–395. https://doi.org/10.1162/qss_a_00020. es_ES
dc.description.references Hook, D. W., Porter, S. J., & Herzog, C. (2018). Dimensions: Building Context for Search and Evaluation. Frontiers in Research Metrics and Analytics, 3, 23. https://doi.org/10.3389/frma.2018.00023. es_ES
dc.description.references Huang, C.-K., Neylon, C., Brookes-Kenworthy, C., Hosking, R., Montgomery, L., Wilson, K., et al. (2020). Comparison of bibliographic data sources: Implications for the robustness of university rankings. Quantitative Science Studies. https://doi.org/10.1162/qss_a_00031. es_ES
dc.description.references Hug, S. E., & Brändle, M. P. (2017). The coverage of Microsoft Academic: Analyzing the publication output of a university. Scientometrics, 113(3), 1551–1571. https://doi.org/10.1007/s11192-017-2535-3. es_ES
dc.description.references Kousha, K., & Thelwall, M. (2018). Can Microsoft Academic help to assess the citation impact of academic books? Journal of Informetrics, 12(3), 972–984. https://doi.org/10.1016/j.joi.2018.08.003. es_ES
dc.description.references Kousha, K., Thelwall, M., & Abdoli, M. (2018). Can Microsoft Academic assess the early citation impact of in-press articles? A multi-discipline exploratory analysis. Journal of Informetrics, 12(1), 287–298. https://doi.org/10.1016/j.joi.2018.01.009. es_ES
dc.description.references Krassowski, M. (2020). ComplexUpset. https://github.com/krassowski/complex-upset es_ES
dc.description.references Larsson, J., Godfrey, A. J. R., Kelley, T., Eberly, D. H., Gustafsson, P., & Huber, E. (2018). eulerr: Area-Proportional Euler and Venn Diagrams with Circles or Ellipses (4.1.0). es_ES
dc.description.references Levenshtein, V. I. (1966). Binary codes capable of correcting deletions, insertions, and reversals. Soviet Physics Doklady, 10(8), 707–710. es_ES
dc.description.references Martín-Martín, A. (2018). Code to extract bibliographic data from Google Scholar (v1.0). Zenodo. https://doi.org/10.5281/zenodo.1481076 es_ES
dc.description.references Martín-Martín, A., & Delgado López-Cózar, E. (2016). Reading Web of Science data into R (0.6). es_ES
dc.description.references Martin-Martin, A., Orduna-Malea, E., Harzing, A.-W., & Delgado López-Cózar, E. (2017). Can we use Google Scholar to identify highly-cited documents? Journal of Informetrics, 11(1), 152–163. https://doi.org/10.1016/j.joi.2016.11.008. es_ES
dc.description.references Martín-Martín, A., Orduna-Malea, E., Thelwall, M., & Delgado López-Cózar, E. (2018). Google Scholar, Web of Science, and Scopus: A systematic comparison of citations in 252 subject categories. Journal of Informetrics, 12(4), 1160–1177. https://doi.org/10.1016/J.JOI.2018.09.002. es_ES
dc.description.references Moed, H. F., Bar-Ilan, J., & Halevi, G. (2016). A new methodology for comparing Google Scholar and Scopus. Journal of Informetrics, 10(2), 533–551. https://doi.org/10.1016/j.joi.2016.04.017. es_ES
dc.description.references Orduña-Malea, E., & Delgado-López-Cózar, E. (2018). Dimensions: Re-discovering the ecosystem of scientific information. Profesional de La Informacion, 27(2), 420–431. https://doi.org/10.3145/epi.2018.mar.21. es_ES
dc.description.references Orduña-Malea, E., Martín-Martín, A., Ayllon, M., & Delgado López-Cózar, E. (2014). The silent fading of an academic search engine: The case of Microsoft Academic Search. Online Information Review, 38(7), 936–953. https://doi.org/10.1108/OIR-07-2014-0169. es_ES
dc.description.references Orduña-Malea, E., Martín-Martín, A., Ayllón, J. M., & Delgado López-Cózar, E. (2016). La revolución Google Scholar: Destapando la caja de Pandora académica. Universidad de Granada y Unión de Editoriales Universitarias Españolas. es_ES
dc.description.references Orduna-Malea, E., Martín-Martín, A., & Delgado López-Cózar, E. (2017). Google Scholar as a source for scholarly evaluation: A bibliographic review of database errors. Revista Española de Documentación Científica, 40(4), e185. https://doi.org/10.3989/redc.2017.4.1500. es_ES
dc.description.references Orduna-Malea, E., Martín-Martín, A., & Delgado López-Cózar, E. (2018). Classic papers: Using Google Scholar to detect the highly-cited documents. In 23rd International conference on science and technology indicators (pp. 1298–1307). https://doi.org/10.31235/osf.io/zkh7p es_ES
dc.description.references Ortega, J. L. (2014). Academic search engines: A quantitative outlook. Cambridge: Chandos Publishing. es_ES
dc.description.references Peroni, S., & Shotton, D. (2020). OpenCitations, an infrastructure organization for open scholarship. Quantitative Science Studies, 1(1), 428–444. https://doi.org/10.1162/qss_a_00023. es_ES
dc.description.references R Core Team. (2014). R: A Language and Environment for Statistical Computing. es_ES
dc.description.references Rovira, C., Codina, L., Guerrero-Solé, F., & Lopezosa, C. (2019). Ranking by relevance and citation counts, a comparative study: Google Scholar, Microsoft academic, WoS and scopus. Future Internet, 11(9), 202. https://doi.org/10.3390/fi11090202. es_ES
dc.description.references Shotton, D. (2013). Publishing: Open citations. Nature, 502(7471), 295–297. https://doi.org/10.1038/502295a. es_ES
dc.description.references Shotton, D. (2018). Funders should mandate open citations. Nature, 553(7687), 129. https://doi.org/10.1038/d41586-018-00104-7. es_ES
dc.description.references Tay, A. (2019, April 3). 6 reasons why you should try Lens.org. Medium. https://medium.com/@aarontay/6-reasons-why-you-should-try-lens-org-c40abb09ec6f es_ES
dc.description.references Thelwall, M. (2017). Microsoft Academic: A multidisciplinary comparison of citation counts with Scopus and Mendeley for 29 journals. Journal of Informetrics, 11(4), 1201–1212. https://doi.org/10.1016/j.joi.2017.10.006. es_ES
dc.description.references Thelwall, M. (2018a). Does Microsoft Academic find early citations? Scientometrics, 114(1), 325–334. https://doi.org/10.1007/s11192-017-2558-9. es_ES
dc.description.references Thelwall, M. (2018b). Microsoft Academic automatic document searches: Accuracy for journal articles and suitability for citation analysis. Journal of Informetrics, 12(1), 1–9. https://doi.org/10.1016/j.joi.2017.11.001. es_ES
dc.description.references Thelwall, M. (2018c). Dimensions: A competitor to Scopus and the Web of Science? Journal of Informetrics, 12(2), 430–435. https://doi.org/10.1016/j.joi.2018.03.006. es_ES
dc.description.references van der Loo, M., van der Laan, J., R Core Team, Logan, N., & Muir, C. (2018). stringdist: Approximate String Matching and String Distance Functions (0.9.5.1). es_ES
dc.description.references van Eck, N. J., & Waltman, L. (2019). Accuracy of citation data in Web of Science and Scopus. es_ES
dc.description.references van Eck, N. J., Waltman, L., Larivière, V., & Sugimoto, C. (2018). Crossref as a new source of citation data: A comparison with Web of Science and Scopus. https://www.cwts.nl/blog?article=n-r2s234&title=crossref-as-a-new-source-of-citation-data-a-comparison-with-web-of-science-and-scopus es_ES
dc.description.references Van Noorden, R. (2014, November 7). Google Scholar pioneer on search engine’s future. Nature. https://doi.org/10.1038/nature.2014.16269. es_ES
dc.description.references Visser, M., van Eck, N. J., & Waltman, L. (2020). Large-scale comparison of bibliographic data sources: Scopus, Web of Science, Dimensions, Crossref, and Microsoft Academic. https://arxiv.org/abs/2005.10732 es_ES
dc.description.references Walker, A., & Braglia, L. (2018). openxlsx: Read, Write and Edit XLSX Files (4.1.0). es_ES
dc.description.references Wang, K., Shen, Z., Huang, C., Wu, C.-H., Dong, Y., & Kanakia, A. (2020). Microsoft academic graph: When experts are not enough. Quantitative Science Studies, 1(1), 396–413. https://doi.org/10.1162/qss_a_00021. es_ES
dc.description.references Wickham, H. (2016). ggplot2: Elegant graphics for data analysis. New York: Springer. es_ES
dc.description.references Wilke, C. O. (2019). cowplot: Streamlined Plot Theme and Plot Annotations for ‘ggplot2’. es_ES
dc.description.references Wu, J., Kim, K., & Giles, C. L. (2019). CiteSeerX: 20 years of service to scholarly big data. Proceedings of the Conference on Artificial Intelligence for Data Discovery and Reuse. https://doi.org/10.1145/3359115.3359119. es_ES

