
The consequences of data dispersion in genomics: a comparative analysis of data sources for precision medicine

RiuNet: Repositorio Institucional de la Universidad Politécnica de Valencia



dc.contributor.author Costa-Sánchez, Mireia es_ES
dc.contributor.author García-Simón, Alberto es_ES
dc.contributor.author Pastor López, Oscar es_ES
dc.date.accessioned 2024-07-12T18:02:37Z
dc.date.available 2024-07-12T18:02:37Z
dc.date.issued 2023-11-09 es_ES
dc.identifier.uri http://hdl.handle.net/10251/206069
dc.description.abstract [EN] Background: Genomics-based clinical diagnosis has emerged as a novel medical approach to improve diagnosis and treatment. However, advances in sequencing techniques have dramatically increased the generation of genomics data. This has led to several data management problems, one of which is data dispersion (i.e., genomics data is scattered across hundreds of data repositories). Geneticists try to work around this problem by limiting the scope of their work to a single data source they know and trust. This work studies the consequences of focusing on a single data source rather than considering the many different existing genomics data sources. Methods: The analysis is based on the data associated with two groups of disorders (i.e., oncology and cardiology) accessible from six well-known genomic data sources (i.e., ClinVar, Ensembl, GWAS Catalog, LOVD, CIViC, and CardioDB). Two dimensions are considered in this analysis, namely completeness and concordance. Completeness is evaluated at two levels: first, at the schema level, by analyzing the information provided by each data source with regard to a conceptual schema data model; second, at the data level, by analyzing the DNA variations that each data source relates to any of the selected disorders. Concordance is evaluated by comparing the consensus among the data sources regarding the clinical relevance of each variation and disorder. Results: The data sources with the highest completeness at the schema level are ClinVar, Ensembl, and CIViC. ClinVar has the highest completeness at the data level for both the oncology and the cardiology disorders. However, there are clinically relevant variations that are exclusive to other data sources, and they must be considered in order to provide the best clinical diagnosis. Although the information available in the data sources is predominantly concordant, discordance among the analyzed data exists, which can lead to inaccurate diagnoses. Conclusion: Precision medicine analyses that use a single genomics data source lead to incomplete results. In addition, there are concordance problems that threaten the correctness of genomics-based diagnosis results. es_ES
dc.description.sponsorship This work was supported by the Valencian Innovation Agency and Innovation through the OGMIOS project (INNEST/2021/57), the Generalitat Valenciana through the CoMoDiD project (CIPROM/2021/023) and the GVA-Predoctoral Research Grant (ACIF/2021/117), and the Spanish State Research Agency through the DELFOS (PDC2021-121243-I00) and SREC (PID2021-123824OB-I00) projects, MICIN/AEI/10.13039/501100011033, and co-financed with ERDF and the European Union Next Generation EU/PRTR. es_ES
dc.language English es_ES
dc.publisher BioMed Central es_ES
dc.relation.ispartof BMC Medical Informatics and Decision Making es_ES
dc.rights Attribution (by) es_ES
dc.subject Precision medicine es_ES
dc.subject DNA variations es_ES
dc.subject Concordance es_ES
dc.subject Completeness es_ES
dc.subject Genomic data sources es_ES
dc.subject.classification COMPUTER LANGUAGES AND SYSTEMS es_ES
dc.title The consequences of data dispersion in genomics: a comparative analysis of data sources for precision medicine es_ES
dc.type Article es_ES
dc.identifier.doi 10.1186/s12911-023-02342-w es_ES
dc.relation.projectID info:eu-repo/grantAgreement/AEI/Plan Estatal de Investigación Científica y Técnica y de Innovación 2017-2020/PDC2021-121243-I00/ES/PLATAFORMA DELFOS: SISTEMA DE INFORMACION PARA LA GESTION DE VARIACIONES GENOMICAS/ es_ES
dc.relation.projectID info:eu-repo/grantAgreement/AEI/Plan Estatal de Investigación Científica y Técnica y de Innovación 2021-2023/PID2021-123824OB-I00/ES/DESARROLLO AGIL DE SISTEMAS DESDE REQUISITOS A CODIGO/ es_ES
dc.relation.projectID info:eu-repo/grantAgreement/GVA//CIPROM%2F2021%2F023/ es_ES
dc.relation.projectID info:eu-repo/grantAgreement/AVI//INNEST%2F2021%2F57/ es_ES
dc.relation.projectID info:eu-repo/grantAgreement/CIUCSD//ACIF%2F2021%2F117/ es_ES
dc.rights.accessRights Open es_ES
dc.contributor.affiliation Universitat Politècnica de València. Escola Tècnica Superior d'Enginyeria Informàtica es_ES
dc.description.bibliographicCitation Costa-Sánchez, M.; García-Simón, A.; Pastor López, O. (2023). The consequences of data dispersion in genomics: a comparative analysis of data sources for precision medicine. BMC Medical Informatics and Decision Making. 23. https://doi.org/10.1186/s12911-023-02342-w es_ES
dc.description.accrualMethod S es_ES
dc.relation.publisherversion https://doi.org/10.1186/s12911-023-02342-w es_ES
dc.type.version info:eu-repo/semantics/publishedVersion es_ES
dc.description.volume 23 es_ES
dc.identifier.eissn 1472-6947 es_ES
dc.identifier.pmid 37946154 es_ES
dc.identifier.pmcid PMC10636939 es_ES
dc.relation.pasarela S\503256 es_ES
dc.contributor.funder Generalitat Valenciana es_ES
dc.contributor.funder Agencia Estatal de Investigación es_ES
dc.contributor.funder European Regional Development Fund es_ES
dc.contributor.funder Agència Valenciana de la Innovació es_ES
dc.contributor.funder Conselleria de Innovación, Universidades, Ciencia y Sociedad Digital, Generalitat Valenciana es_ES

