dc.contributor.author | Alarte, Julián | es_ES |
dc.contributor.author | Insa Cabrera, David | es_ES |
dc.contributor.author | Silva Galiana, Josep Francesc | es_ES |
dc.contributor.author | Tamarit Muñoz, Salvador | es_ES |
dc.date.accessioned | 2015-05-18T14:44:09Z | |
dc.date.available | 2015-05-18T14:44:09Z | |
dc.date.issued | 2015-01 | |
dc.identifier.issn | 2075-2180 | |
dc.identifier.uri | http://hdl.handle.net/10251/50403 | |
dc.description.abstract | [EN] Web templates are one of the main development resources for website engineers. Templates allow them to increase productivity by plugging content into already formatted and prepared pagelets. Templates are also useful for end users, because they provide uniformity and a common look and feel across all webpages. However, from the point of view of crawlers and indexers, templates are a significant problem, because they usually contain irrelevant information such as advertisements, menus, and banners. Processing and storing this information is likely to waste resources (storage space, bandwidth, etc.). It has been measured that templates represent between 40% and 50% of data on the Web. Therefore, identifying templates is essential for indexing tasks. In this work we propose a novel method for automatic template extraction that is based on similarity analysis between the DOM trees of a collection of webpages that are detected using menu information. Our implementation and experiments demonstrate the usefulness of the technique. | es_ES |
dc.description.sponsorship | This work has been partially supported by the EU (FEDER) and the Spanish Ministerio de Economía y Competitividad (Secretaría de Estado de Investigación, Desarrollo e Innovación) under Grant TIN2013-44742-C4-1-R and by the Generalitat Valenciana under Grant PROMETEO/2011/052. David Insa was partially supported by the Spanish Ministerio de Educación under FPU Grant AP2010-4415. Salvador Tamarit was partially supported by research project POLCA, Programming Large Scale Heterogeneous Infrastructures (610686), funded by the European Union, STREP FP7. | en_EN |
dc.language | English | es_ES |
dc.relation.ispartof | Electronic Proceedings in Theoretical Computer Science | es_ES |
dc.rights | Attribution (by) | es_ES |
dc.subject | Information Retrieval | es_ES |
dc.subject | Template Extraction | es_ES |
dc.subject | Content Extraction | es_ES |
dc.subject.classification | LENGUAJES Y SISTEMAS INFORMATICOS | es_ES |
dc.title | Web template extraction based on hyperlink analysis | es_ES |
dc.type | Article | es_ES |
dc.identifier.doi | 10.4204/EPTCS.173.2 | |
dc.relation.projectID | info:eu-repo/grantAgreement/EC/FP7/610686/EU/Programming Large Scale Heterogeneous Infrastructures/ | es_ES |
dc.relation.projectID | info:eu-repo/grantAgreement/MINECO//TIN2013-44742-C4-1-R/ES/VALIDACION ASISTIDA DE PROGRAMAS MEDIANTE METODOS PRECISOS Y RIGUROSOS PARA UNA INGENIERIA DEL SOFTWARE ROBUSTA/ | es_ES |
dc.relation.projectID | info:eu-repo/grantAgreement/GVA//PROMETEO%2F2011%2F052/ES/LOGICEXTREME: TECNOLOGIA LOGICA Y SOFTWARE SEGURO/ | es_ES |
dc.rights.accessRights | Open | es_ES |
dc.contributor.affiliation | Universitat Politècnica de València. Departamento de Sistemas Informáticos y Computación - Departament de Sistemes Informàtics i Computació | es_ES |
dc.description.bibliographicCitation | Alarte, J.; Insa Cabrera, D.; Silva Galiana, JF.; Tamarit Muñoz, S. (2015). Web template extraction based on hyperlink analysis. Electronic Proceedings in Theoretical Computer Science. 173:16-26. https://doi.org/10.4204/EPTCS.173.2 | es_ES |
dc.description.accrualMethod | S | es_ES |
dc.relation.publisherversion | http://dx.doi.org/10.4204/EPTCS.173.2 | es_ES |
dc.description.upvformatpinicio | 16 | es_ES |
dc.description.upvformatpfin | 26 | es_ES |
dc.type.version | info:eu-repo/semantics/publishedVersion | es_ES |
dc.description.volume | 173 | es_ES |
dc.relation.senia | 280504 | |
dc.contributor.funder | European Commission | |
dc.contributor.funder | Ministerio de Economía y Competitividad | |
dc.contributor.funder | Generalitat Valenciana |