
A Lightweight Statistical Method for Terminology Extraction

RiuNet: Repositorio Institucional de la Universidad Politécnica de Valencia


dc.contributor.author Nazar, Rogelio es_ES
dc.contributor.author Acosta, Nicolás es_ES
dc.date.accessioned 2024-01-02T11:05:28Z
dc.date.available 2024-01-02T11:05:28Z
dc.date.issued 2023-12-12
dc.identifier.uri http://hdl.handle.net/10251/201317
dc.description.abstract [EN] We propose a method for automatic terminology extraction, developed within a larger project devoted to automating part of the tasks involved in producing terminological databases. Terminology extraction is the key to drafting the macrostructure of a terminological resource (i.e., the list of entries), to which grammatical or semantic information can later be added at the microstructural level. To this end, we developed a statistical method that is conceptually simple compared to modern neural network approaches. It is a lightweight method because it is based on term dispersion and co-occurrence statistics that can be computed with basic hardware. For the evaluation, we experimented with corpora of lexicography and linguistics in English and Spanish of ca. 66 million tokens. The results improve on the baselines by almost 20%. es_ES
dc.language English es_ES
dc.publisher Universitat Politècnica de València es_ES
dc.relation.ispartof Journal of Computer-Assisted Linguistic Research es_ES
dc.rights Attribution - NonCommercial - NoDerivatives (by-nc-nd) es_ES
dc.subject Automatic terminology extraction es_ES
dc.subject Corpus-based terminology processing es_ES
dc.subject Computational terminology es_ES
dc.subject Information extraction es_ES
dc.subject Dispersion measures es_ES
dc.title A Lightweight Statistical Method for Terminology Extraction es_ES
dc.type Article es_ES
dc.identifier.doi 10.4995/jclr.2023.20427
dc.rights.accessRights Open es_ES
dc.description.bibliographicCitation Nazar, R.; Acosta, N. (2023). A Lightweight Statistical Method for Terminology Extraction. Journal of Computer-Assisted Linguistic Research. 7:43-59. https://doi.org/10.4995/jclr.2023.20427 es_ES
dc.description.accrualMethod OJS es_ES
dc.relation.publisherversion https://doi.org/10.4995/jclr.2023.20427 es_ES
dc.description.upvformatpinicio 43 es_ES
dc.description.upvformatpfin 59 es_ES
dc.type.version info:eu-repo/semantics/publishedVersion es_ES
dc.description.volume 7 es_ES
dc.identifier.eissn 2530-9455
dc.relation.pasarela OJS\20427 es_ES

