Linguistic challenges in automatic summarization technology

RiuNet: Institutional repository of the Polytechnic University of Valencia

dc.contributor.author Diedrichsen, Elke es_ES
dc.date.accessioned 2017-07-07T07:15:19Z
dc.date.available 2017-07-07T07:15:19Z
dc.date.issued 2017-06-26
dc.identifier.uri http://hdl.handle.net/10251/84657
dc.description.abstract [EN] Automatic summarization is a field of Natural Language Processing that is increasingly used in industry today. The goal of the summarization process is to create a summary of one document or a multiplicity of documents that will retain the sense and the most important aspects while reducing the length considerably, to a size that may be user-defined. One differentiates between extraction-based and abstraction-based summarization. In an extraction-based system, the words and sentences are copied out of the original source without any modification. An abstraction-based summary can compress, fuse or paraphrase sections of the source document. As of today, most summarization systems are extractive. Automatic document summarization technology presents interesting challenges for Natural Language Processing. It works on the basis of coreference resolution, discourse analysis, named entity recognition (NER), information extraction (IE), natural language understanding, topic segmentation and recognition, word segmentation and part-of-speech tagging. This study will overview some current approaches to the implementation of auto summarization technology and discuss the state of the art of the most important NLP tasks involved in them. We will pay particular attention to current methods of sentence extraction and compression for single and multi-document summarization, as these applications are based on theories of syntax and discourse and their implementation therefore requires a solid background in linguistics. Summarization technologies are also used for image collection summarization and video summarization, but the scope of this paper will be limited to document summarization. es_ES
dc.language English es_ES
dc.publisher Universitat Politècnica de València
dc.relation.ispartof Journal of Computer-Assisted Linguistic Research
dc.rights Attribution - NonCommercial - NoDerivatives (by-nc-nd) es_ES
dc.subject Automatic summarization es_ES
dc.subject Natural language processing es_ES
dc.subject Linguistics es_ES
dc.subject Syntax es_ES
dc.subject Discourse es_ES
dc.title Linguistic challenges in automatic summarization technology es_ES
dc.type Article es_ES
dc.date.updated 2017-07-07T07:00:34Z
dc.identifier.doi 10.4995/jclr.2017.7787
dc.rights.accessRights Open es_ES
dc.description.bibliographicCitation Diedrichsen, E. (2017). Linguistic challenges in automatic summarization technology. Journal of Computer-Assisted Linguistic Research. 1(1):40-60. doi:10.4995/jclr.2017.7787. es_ES
dc.relation.publisherversion https://doi.org/10.4995/jclr.2017.7787 es_ES
dc.description.upvformatpinicio 40 es_ES
dc.description.upvformatpfin 60 es_ES
dc.type.version info:eu-repo/semantics/publishedVersion es_ES
dc.description.volume 1
dc.description.issue 1
dc.identifier.eissn 2530-9455

