Alarte, J.; Silva, J. (2021). Page-Level Main Content Extraction from Heterogeneous Webpages. ACM Transactions on Knowledge Discovery from Data. 15(6):1-21. https://doi.org/10.1145/3451168
Please use this identifier to cite or link to this item: http://hdl.handle.net/10251/181752
Title: | Page-Level Main Content Extraction from Heterogeneous Webpages | |
Author: | Alarte, Julián | |
Abstract: | [EN] The main content of a webpage is often surrounded by other boilerplate elements related to the template, such as menus, advertisements, copyright notices, and comments. For crawlers and indexers, isolating the main ... | |
Usage rights: | All rights reserved | |
DOI: | 10.1145/3451168 | |
Publisher's version: | https://doi.org/10.1145/3451168 | |
Acknowledgments: | This work has been partially supported by the EU (FEDER) and the Spanish MCI/AEI under grants TIN2016-76843-C4-1-R and PID2019-104735RB-C41, by the Generalitat Valenciana under grant Prometeo/2019/098 (DeepTrust), and by ... | |