Journal of Computer-Assisted Linguistic Research

Journal of Computer-Assisted Linguistic Research - Vol 01 (2017) https://riunet.upv.es:443/handle/10251/84624 2025-02-05T23:05:05Z

Automatically Representing TExt Meaning via an Interlingua-based System (ARTEMIS). A further step towards the computational representation of RRG Mairal-Usón, Ricardo; Cortés-Rodríguez, Francisco https://riunet.upv.es:443/handle/10251/84659 2023-11-21T11:49:19Z 2017-07-07T07:19:03Z

Automatically Representing TExt Meaning via an Interlingua-based System (ARTEMIS). A further step towards the computational representation of RRG Mairal-Usón, Ricardo; Cortés-Rodríguez, Francisco [EN] Within the framework of FUNK Lab, a virtual laboratory for natural language processing inspired by a functionally oriented linguistic theory, Role and Reference Grammar (RRG), a number of computational resources have been built that deal with different aspects of language and have applications in different scientific domains, e.g. terminology, lexicography, sentiment analysis, document classification, text analysis and data mining. One of these resources is ARTEMIS (Automatically Representing TExt Meaning via an Interlingua-Based System), which builds on the pioneering work of Periñán-Pascual (2013) and Periñán-Pascual & Arcas (2014). This computational tool is a proof-of-concept prototype that allows the automatic generation of a conceptual logical structure (CLS) (cf. Mairal-Usón, Periñán-Pascual and Pérez 2012; Van Valin and Mairal-Usón 2014), that is, a fully specified semantic representation of an input text, on the basis of a reduced sample of sentences. The primary aim of this paper is to develop the syntactic rules that form part of the computational grammar for the representation of simple clauses in English. More specifically, this work focuses on the format of those syntactic rules that account for the upper levels of the RRG Layered Structure of the Clause (LSC), that is, the core (and the level-1 construction associated with it), the clause and the sentence (Van Valin 2005). In essence, this analysis, together with that in Cortés-Rodríguez and Mairal-Usón (2016), offers an almost complete description of the computational grammar behind the LSC for simple clauses.

Linguistic challenges in automatic summarization technology Diedrichsen, Elke https://riunet.upv.es:443/handle/10251/84657 2023-11-21T11:49:19Z 2017-07-07T07:15:19Z

Linguistic challenges in automatic summarization technology Diedrichsen, Elke [EN] Automatic summarization is a field of Natural Language Processing that is increasingly used in industry today. The goal of the summarization process is to create a summary of one or more documents that retains the sense and the most important aspects of the source while reducing its length considerably, to a size that may be user-defined. A distinction is drawn between extraction-based and abstraction-based summarization. In an extraction-based system, the words and sentences are copied out of the original source without any modification. An abstraction-based summary can compress, fuse or paraphrase sections of the source document. As of today, most summarization systems are extractive. Automatic document summarization technology presents interesting challenges for Natural Language Processing. It draws on coreference resolution, discourse analysis, named entity recognition (NER), information extraction (IE), natural language understanding, topic segmentation and recognition, word segmentation and part-of-speech tagging. This study surveys some current approaches to the implementation of automatic summarization technology and discusses the state of the art of the most important NLP tasks involved in them. We pay particular attention to current methods of sentence extraction and compression for single- and multi-document summarization, as these applications are based on theories of syntax and discourse, and their implementation therefore requires a solid background in linguistics. Summarization technologies are also used for image collection summarization and video summarization, but the scope of this paper is limited to document summarization.
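The extraction-based approach described in this abstract can be illustrated with a minimal sketch. It is not the method of any system surveyed in the paper: the word-frequency scoring, the small stopword list and the names `split_sentences`, `tokenize` and `extractive_summary` are all assumptions made for illustration. The one property it is meant to show is that an extractive summary copies sentences from the source unmodified and merely selects among them.

```python
import re
from collections import Counter

# Illustrative stopword list (an assumption; real systems use larger lists).
STOPWORDS = {"a", "an", "and", "are", "is", "of", "the", "to", "was", "in"}


def split_sentences(text):
    """Naive sentence splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    """Lowercase content words of a sentence, with stopwords removed."""
    return [w for w in re.findall(r"[a-z']+", sentence.lower()) if w not in STOPWORDS]


def extractive_summary(text, max_sentences=2):
    """Return up to max_sentences source sentences, verbatim, in source order.

    Each sentence is scored by the average document-wide frequency of its
    content words; this scoring scheme is a simple stand-in, not a method
    taken from the paper.
    """
    sentences = split_sentences(text)
    freqs = Counter(w for s in sentences for w in tokenize(s))

    def score(sentence):
        tokens = tokenize(sentence)
        return sum(freqs[w] for w in tokens) / len(tokens) if tokens else 0.0

    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    chosen = sorted(ranked[:max_sentences])  # restore original order
    return [sentences[i] for i in chosen]


if __name__ == "__main__":
    sample = "Cats sleep a lot. Cats and dogs are pets. The weather was fine. Dogs chase cats."
    for sentence in extractive_summary(sample, max_sentences=2):
        print(sentence)
```

An abstraction-based system would differ at the last step: instead of returning the selected sentences unchanged, it would compress, fuse or paraphrase them.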

Computing the meaning of the assertive speech act by a software agent Nolan, Brian https://riunet.upv.es:443/handle/10251/84656 2023-11-21T11:49:19Z 2017-07-07T07:11:34Z

Computing the meaning of the assertive speech act by a software agent Nolan, Brian [EN] This paper examines the nature of the assertive speech act in Irish. We examine the syntactic constructional form of the assertive to identify its constructional signature. We consider the speech act as a construction whose meaning as an utterance depends on the framing situation and context, along with the common ground of the interlocutors. We identify how the assertive speech act can be formalised to make it computationally tractable, so that a software agent can compute its meaning, taking into account the contribution of situation, context and a dynamic common ground. Belief, desire and intention play a role in what is meant as opposed to what is said. The nature of knowledge, and how it informs common ground, is explored along with the relationship between knowledge and language. Computing the meaning of a speech act in the situation requires us to consider the level of interaction of all these dimensions. We argue that the following are all important for representing meaning in communication: the contribution of lexicon and grammar; the recognition of belief, desire and intentions in the situation type and associated illocutionary force; the sociocultural conventions of the interlocutors, along with their respective general and cultural knowledge; their common ground; and other sources of contextual information. We show that the influence of the situation, context and common ground feeds into the utterance meaning derivation. The ‘what is said’ is reflected in the event and its semantics, while the ‘what is meant’ is derived at a higher level of abstraction within a situation.

Crimes against children: an apparently terminological knowledge representation of entities in FunGramKB Alameda Hernández, Ángela; Felices Lago, Ángel https://riunet.upv.es:443/handle/10251/84651 2023-11-21T11:49:19Z 2017-07-07T07:09:07Z

Crimes against children: an apparently terminological knowledge representation of entities in FunGramKB Alameda Hernández, Ángela; Felices Lago, Ángel [EN] This article describes an example of the difficulties involved in the construction of a term-based satellite (or domain-specific) ontology integrated in FunGramKB, a lexico-conceptual knowledge base for the computational processing of natural language (Periñán-Pascual & Arcas-Túnez 2004, 2007, 2010a; Periñán-Pascual & Mairal-Usón 2009, 2010). The main hypothesis is that the multilevel model of the FunGramKB Core Ontology can be connected to terminological satellite ontologies in order to minimize redundancy and maximize information (Periñán-Pascual & Arcas-Túnez 2010b). To this end, we follow the four-phase COHERENT methodology (Periñán-Pascual & Mairal-Usón 2011): COnceptualization, HiErarchization, REmodelling and refinemeNT. In doing so, the paper furnishes substantial evidence on the structuring of a set of concepts borrowed from criminal law, an apparently terminological domain (cf. Breuker, Valente & Winkels 2005; Valente 2005; Breuker, Casanovas & Klein 2008). The Globalcrimeterm corpus has been used as an empirical foundation (Felices-Lago & Ureña-Gómez Moreno 2012, 2014). To illustrate this process, we have selected the superordinate basic concept +CRIME_00 (Alameda-Hernández & Felices-Lago 2016) and its basic and terminal subordinate concepts in the domains of organized crime and terrorism (all of them under the metaconcept #ENTITY), particularly those crimes referring predominantly to children or involving children together with other vulnerable groups. The specific definitions of the target concepts in this paper are written in COREL, a conceptual representation language (Periñán-Pascual & Mairal-Usón 2010), and follow this upper-level conceptual path: #ENTITY > #PHYSICAL > #PROCESS > +OCCURRENCE_00 > +CRIME_00. Consequently, the modelling, subsumption and hierarchisation of concepts such as $ABDUCTION_00, $CHILD_ABUSE_00, $CHILD_PORNOGRAPHY_00, $COERCE_D_00, $CHILD_TRAFFICKING_00, $MOLEST_D_00, $FORCED_LABOUR, among others, are presented.
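The upper-level conceptual path and the notion of subsumption mentioned in this abstract can be pictured with a small sketch. It is neither COREL nor any FunGramKB interface: the `PARENT` table and the helper names `path_to_root` and `subsumes` are hypothetical, and attaching the terminal concepts directly under +CRIME_00 is an assumption made only for illustration (the abstract states that they are subordinate to +CRIME_00, not what their immediate parents are).

```python
# Child -> parent links. The chain up to +CRIME_00 follows the path given in
# the abstract; the placement of the $-prefixed terminal concepts directly
# under +CRIME_00 is an illustrative assumption.
PARENT = {
    "#PHYSICAL": "#ENTITY",
    "#PROCESS": "#PHYSICAL",
    "+OCCURRENCE_00": "#PROCESS",
    "+CRIME_00": "+OCCURRENCE_00",
    "$ABDUCTION_00": "+CRIME_00",    # assumed immediate parent
    "$CHILD_ABUSE_00": "+CRIME_00",  # assumed immediate parent
}


def path_to_root(concept):
    """Return the conceptual path from the root metaconcept down to `concept`."""
    path = [concept]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return list(reversed(path))


def subsumes(ancestor, descendant):
    """True if `ancestor` is `descendant` itself or lies on its path to the root."""
    return ancestor in path_to_root(descendant)


if __name__ == "__main__":
    print(" > ".join(path_to_root("$ABDUCTION_00")))
    print(subsumes("+CRIME_00", "$CHILD_ABUSE_00"))
```

A single-parent table keeps the sketch short; whether the actual ontology allows a concept to have more than one superordinate is not something this example takes a position on.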
