- -

On the evaluation and improvement of arabic wordnet coverage and usability

RiuNet: Repositorio Institucional de la Universidad Politécnica de Valencia

Compartir/Enviar a

Citas

Estadísticas

  • Estadisticas de Uso

On the evaluation and improvement of arabic wordnet coverage and usability

Mostrar el registro sencillo del ítem

Ficheros en el ítem

dc.contributor.author Abouenour, Lahsen es_ES
dc.contributor.author Bouzoubaa, Karim es_ES
dc.contributor.author Rosso, Paolo es_ES
dc.date.accessioned 2014-07-07T14:10:34Z
dc.date.issued 2013-09
dc.identifier.issn 1574-020X
dc.identifier.uri http://hdl.handle.net/10251/38642
dc.description The final publication is available at Springer via http://dx.doi.org/10.1007/s10579-013-9237-0 es_ES
dc.description.abstract [EN] Built on the basis of the methods developed for Princeton WordNet and EuroWordNet, Arabic WordNet (AWN) has been an interesting project which combines WordNet structure compliance with Arabic particularities. In this paper, some AWN shortcomings related to coverage and usability are addressed. The use of AWN in question/answering (Q/A) helped us to deeply evaluate the resource from an experience-based perspective. Accordingly, an enrichment of AWN was built by semi-automatically extending its content. Indeed, existing approaches and/or resources developed for other languages were adapted and used for AWN. The experiments conducted in Arabic Q/A have shown an improvement of both AWN coverage as well as usability. Concerning coverage, a great amount of named entities extracted from YAGO were connected with corresponding AWN synsets. Also, a significant number of new verbs and nouns (including Broken Plural forms) were added. In terms of usability, thanks to the use of AWN, the performance for the AWN-based Q/A application registered an overall improvement with respect to the following three measures: accuracy (+9.27 % improvement), mean reciprocal rank (+3.6 improvement) and number of answered questions (+12.79 % improvement). es_ES
dc.description.sponsorship The work presented in Sect. 2.2 was done in the framework of the bilateral Spain-Morocco AECID-PCI C/026728/09 research project. The research of the two first authors is done in the framework of the PROGRAMME D'URGENCE project (grant no. 03/2010). The research of the third author is done in the framework of WIQEI IRSES project (grant no. 269180) within the FP 7 Marie Curie People, DIANA-APPLICATIONS-Finding Hidden Knowledge in Texts: Applications (TIN2012-38603-C02-01) research project and VLC/CAMPUS Microcluster on Multimodal Interaction in Intelligent Systems. We would like to thank Manuel Montes-y-Gomez (INAOE-Puebla, Mexico) and Sandra Garcia-Blasco (Bitsnbrain, Spain) for their feedback on the work presented in Sect. 2.4. We would like finally to thank Violetta Cavalli-Sforza (Al Akhawayn University in Ifrane, Morocco) for having reviewed the linguistic level of the entire document. en_EN
dc.format.extent 27 es_ES
dc.language Inglés es_ES
dc.publisher Springer Netherlands es_ES
dc.relation.ispartof Language Resources and Evaluation es_ES
dc.rights Reserva de todos los derechos es_ES
dc.subject Arabic WordNet es_ES
dc.subject Hyponymy extraction es_ES
dc.subject Maximal frequent sequence es_ES
dc.subject WordNet-based application es_ES
dc.subject.classification LENGUAJES Y SISTEMAS INFORMATICOS es_ES
dc.title On the evaluation and improvement of arabic wordnet coverage and usability es_ES
dc.type Artículo es_ES
dc.identifier.doi 10.1007/s10579-013-9237-0
dc.relation.projectID info:eu-repo/grantAgreement/EC/FP7/269180/EU/Web Information Quality Evaluation Initiative/ es_ES
dc.relation.projectID info:eu-repo/grantAgreement/MAEC//C%2F026728%2F09/ES/EXTENSIÓN Y MEJORA DE RESURSOS LÉXICOS EN ÁRABE/ es_ES
dc.relation.projectID info:eu-repo/grantAgreement/MICINN//TIN2012-38603-C02-01/ES/DIANA-APPLICATIONS: FINDING HIDDEN KNOWLEDGE IN TEXTS: APPLICATIONS/ es_ES
dc.rights.accessRights Abierto es_ES
dc.contributor.affiliation Universitat Politècnica de València. Departamento de Sistemas Informáticos y Computación - Departament de Sistemes Informàtics i Computació es_ES
dc.description.bibliographicCitation Abouenour, L.; Bouzoubaa, K.; Rosso, P. (2013). On the evaluation and improvement of arabic wordnet coverage and usability. Language Resources and Evaluation. 47(3):891-917. https://doi.org/10.1007/s10579-013-9237-0 es_ES
dc.description.accrualMethod S es_ES
dc.relation.publisherversion http://link.springer.com/article/10.1007%2Fs10579-013-9237-0 es_ES
dc.description.upvformatpinicio 891 es_ES
dc.description.upvformatpfin 917 es_ES
dc.type.version info:eu-repo/semantics/publishedVersion es_ES
dc.description.volume 47 es_ES
dc.description.issue 3 es_ES
dc.relation.senia 255790
dc.identifier.eissn 1574-0218
dc.contributor.funder European Commission
dc.contributor.funder Agencia Española de Cooperación Internacional para el Desarrollo
dc.contributor.funder Ministerio de Ciencia e Innovación
dc.contributor.funder Ministerio de Asuntos Exteriores y Cooperación es_ES
dc.description.references Abbès, R., Dichy, J., & Hassoun, M. (2004). The architecture of a standard Arabic lexical database: Some figures, ratios and categories from the DIINAR.1 source program. In Workshop on computational approaches to Arabic script-based languages, Coling 2004. Geneva, Switzerland. es_ES
dc.description.references Abouenour, L., Bouzoubaa, K., & Rosso, P. (2009a). Structure-based evaluation of an Arabic semantic query expansion using the JIRS passage retrieval system. In Proceedings of the workshop on computational approaches to Semitic languages, E-ACL-2009, Athens, Greece, March. es_ES
dc.description.references Abouenour, L., Bouzoubaa, K., & Rosso, P. (2009b). Three-level approach for passage retrieval in Arabic question/answering systems. In Proceedings of the 3rd international conference on Arabic language processing CITALA’09, Rabat, Morocco, May, 2009. es_ES
dc.description.references Abouenour, L., Bouzoubaa, K., & Rosso, P. (2010a). An evaluated semantic query expansion and structure-based approach for enhancing Arabic question/answering. Special Issue in the International Journal on Information and Communication Technologies/IEEE. June. es_ES
dc.description.references Abouenour, L., Bouzoubaa, K., & Rosso, P. (2010b). Using the YAGO ontology as a resource for the enrichment of named entities in Arabic WordNet. In Workshop LR & HLT for semitic languages, LREC’10. Malta. May, 2010. es_ES
dc.description.references Ahonen-Myka, H. (2002). Discovery of frequent word sequences in text. In Proceedings of the ESF exploratory workshop on pattern detection and discovery (pp. 180–189). London, UK: Springer. es_ES
dc.description.references Al Khalifa, M., & Rodríguez, H. (2009). Automatically extending NE coverage of Arabic WordNet using Wikipedia. In Proceedings of the 3rd international conference on Arabic language processing CITALA’09, May, Rabat, Morocco. es_ES
dc.description.references Alotaiby, F., Alkharashi, I., & Foda, S. (2009). Processing large Arabic text corpora: Preliminary analysis and results. In Proceedings of the second international conference on Arabic language resources and tools (pp. 78–82), Cairo, Egypt. es_ES
dc.description.references Baker, C. F., Fillmore, C. J., & Cronin, B. (2003). The structure of the FrameNet database. International Journal of Lexicography, 16(3), 281–296. es_ES
dc.description.references Baldwin, T., Pool, P., & Colowick, S. M. (2010). PanLex and LEXTRACT: Translating all words of all languages of the world. In Proceedings of Coling 2010, demonstration volume (pp. 37–40), Beijing. es_ES
dc.description.references Benajiba, Y., Diab, M., & Rosso, P. (2009). Using language independent and language specific features to enhance Arabic named entity recognition. In IEEE transactions on audio, speech and language processing. Special Issue on Processing Morphologically Rich Languages, 17(5), 2009. es_ES
dc.description.references Benajiba, Y., Rosso, P., & Lyhyaoui, A. (2007). Implementation of the ArabiQA question answering system’s components. In Proceedings of workshop on Arabic natural language processing, 2nd Information Communication Technologies int. symposium, ICTIS-2007, April 3–5, Fez, Morocco. es_ES
dc.description.references Benoît, S., & Darja, F. (2008). Building a free French WordNet from multilingual resources. Workshop on Ontolex 2008, LREC’08, June, Marrakech, Morocco. es_ES
dc.description.references Black, W., Elkateb, S., Rodriguez, H, Alkhalifa, M., Vossen, P., Pease, A., et al. (2006). Introducing the Arabic WordNet project. In Proceedings of the third international WordNet conference. Sojka, Choi: Fellbaum & Vossen (eds). es_ES
dc.description.references Boudelaa, S., & Gaskell, M. G. (2002). A reexamination of the default system for Arabic plurals. Language and Cognitive Processes, 17, 321–343. es_ES
dc.description.references Brini, W., Ellouze & M., Hadrich, B. L. (2009a). QASAL: Un système de question-réponse dédié pour les questions factuelles en langue Arabe. In 9th Journées Scientifiques des Jeunes Chercheurs en Génie Electrique et Informatique, Tunisia. es_ES
dc.description.references Brini, W., Trigui, O., Ellouze, M., Mesfar, S., Hadrich, L., & Rosso, P. (2009b). Factoid and definitional Arabic question answering system. In Post-proceedings of NOOJ-2009, June 8–10, Tozeur, Tunisia. es_ES
dc.description.references Buscaldi, D., Rosso, P., Gómez, J. M., & Sanchis, E. (2010). Answering questions with an n-gram based passage retrieval engine. Journal of Intelligent Information Systems, 34(2), 113–134. es_ES
dc.description.references Costa, R. P., & Seco, N. (2008). Hyponymy extraction and Web search behavior analysis based on query reformulation. In Proceedings of the 11th Ibero-American conference on AI: advances in artificial intelligence (pp. 1–10). es_ES
dc.description.references Denicia-carral, C., Montes-y-Gõmez, M., Villaseñor-pineda, L., & Hernandez, R. G. (2006). A text mining approach for definition question answering. In Proceedings of the 5th international conference on natural language processing, FinTal’2006, Turku, Finland. es_ES
dc.description.references Diab, M. T. (2004). Feasibility of bootstrapping an Arabic Wordnet leveraging parallel corpora and an English Wordnet. In Proceedings of the Arabic language technologies and resources, NEMLAR, Cairo, Egypt. es_ES
dc.description.references El Amine, M. A. (2009). Vers une interface pour l’enrichissement des requêtes en arabe dans un système de recherche d’information. In Proceedings of the 2nd conférence internationale sur l’informatique et ses applications (CIIA’09), May 3–4, Saida, Algeria. es_ES
dc.description.references Elghamry, K. (2008). Using the web in building a corpus-based hypernymy-hyponymy Lexicon with hierarchical structure for Arabic. In Proceedings of the 6th international conference on informatics and systems, INFOS 2008. Cairo, Egypt. es_ES
dc.description.references Elkateb, S., Black, W., Vossen, P., Farwell, D., Rodríguez, H., Pease, A., et al. (2006). Arabic WordNet and the challenges of Arabic. In Proceedings of Arabic NLP/MT conference, London, UK. es_ES
dc.description.references Fellbaum, C. (Ed.). (1998). WordNet: An electronic lexical database. MA: MIT Press. es_ES
dc.description.references García-Blasco, S., Danger, R., & Rosso, P. (2010). Drug–drug interaction detection: A new approach based on maximal frequent sequences. Sociedad Española para el Procesamiento del Lenguaje Natural, SEPLN, 45, 263–266. es_ES
dc.description.references García-Hernández, R. A. (2007). Algoritmos para el descubrimiento de patrones secuenciales maximales. Ph.D. Thesis, INAOE. September, Mexico. es_ES
dc.description.references García-Hernández, R. A., Martínez Trinidad, J. F., & Carrasco-ochoa, J. A. (2010). Finding maximal sequential patterns in text document collections and single documents. Informatica, 34(1), 93–101. es_ES
dc.description.references Goweder, A., & De Roeck, A. (2001). Assessment of a significant Arabic corpus. In Proceedings of the Arabic NLP workshop at ACL/EACL, (pp. 73–79), Toulouse, France. es_ES
dc.description.references Graff, D. (2007). Arabic Gigaword (3rd ed.). Philadelphia, USA: Linguistic Data Consortium. es_ES
dc.description.references Graff, D., Kong, J., Chen, K., & Maeda, K. (2007). English Gigaword (3rd ed.). Philadelphia, USA: Linguistic Data Consortium. es_ES
dc.description.references Hammou, B., Abu-salem, H., Lytinen, S., & Evens, M. (2002). QARAB: A question answering system to support the Arabic language. In Proceedings of the workshop on computational approaches to Semitic languages, ACL, (pp. 55–65), Philadelphia. es_ES
dc.description.references Hearst, M. A. (1992). Automatic acquisition of hyponyms from large text corpora. In Proceedings of the 14th conference on Computational linguistics, COLING ‘92 (vol. 2, pp. 539–545). es_ES
dc.description.references Kanaan, G., Hammouri, A., Al-Shalabi, R., & Swalha, M. (2009). A new question answering system for the Arabic language. American Journal of Applied Sciences, 6(4), 797–805. es_ES
dc.description.references Kim, H., Chen, S., & Veale, T. (2006). Analogical reasoning with a synergy of HowNet and WordNet. In Proceedings of GWC’2006, the 3rd global WordNet conference, January, Cheju, Korea. es_ES
dc.description.references Kipper-Schuler, K. (2006). VerbNet: A broad-coverage, comprehensive verb lexicon. Ph.D. Thesis. es_ES
dc.description.references Mohammed, F. A., Nasser, K., & Harb, H. M. (1993). A knowledge-based Arabic question answering system (AQAS). In ACM SIGART bulletin (pp. 21–33). es_ES
dc.description.references Niles, I., & Pease, A. (2001). Towards a standard upper ontology. In Proceedings of FOIS-2 (pp. 2–9), Ogunquit, Maine. es_ES
dc.description.references Niles, I., & Pease, A. (2003). Linking lexicons and ontologies: Mapping WordNet to the suggested upper merged ontology. In Proceedings of the 2003 international conference on information and knowledge engineering, Las Vegas, Nevada. es_ES
dc.description.references Ortega-Mendoza, R. M., Villaseñor-pineda, L., & Montes-y-Gõmez, M. (2007). Using lexical patterns to extract hyponyms from the Web. In Proceedings of the Mexican international conference on artificial intelligence MICAI 2007. November, Aguascalientes, Mexico. Lecture Notes in Artificial Intelligence 4827. Berlin: Springer. es_ES
dc.description.references Palmer, M., P. Kingsbury, & D. Gildea. (2005). The proposition bank: An annotated corpus of semantic roles. Computational Linguistics, 21. USA: MIT Press. es_ES
dc.description.references Pantel, P., & Pennacchiotti, M. (2006). Espresso: Leveraging generic patterns for automatically harvesting semantic relations. In Proceedings of conference on computational linguistics association for computational linguistics, (pp. 113–120), Sydney, Australia. es_ES
dc.description.references Rodriguez, H., Farwell, D., Farreres, J., Bertran, M., Alkhalifa, M., & Martí, A. (2008a). Arabic WordNet: Semi-automatic extensions using Bayesian Inference. In Proceedings of the the 6th conference on language resources and evaluation LREC2008, May, Marrakech, Morocco. es_ES
dc.description.references Rodriguez, H., Farwell, D., Farreres, J., Bertran, M., Alkhalifa, M., Mart., M., et al. (2008b). Arabic WordNet: Current state and future extensions. In Proceedings of the fourth global WordNet conference, January 22–25, Szeged, Hungary. es_ES
dc.description.references Sharaf, A. M. (2009). The Qur’an annotation for text mining. First year transfer report. School of Computing, Leeds University. December. es_ES
dc.description.references Snow, R., Jurafsky, D., & Andrew, Y. N. (2005). Learning syntactic patterns for automatic hypernym discovery. In Lawrence K. Saul et al. (Eds.), Advances in neural information processing systems, 17. Cambridge, MA: MIT Press. es_ES
dc.description.references Suchanek, F. M., Kasneci, G., & Weikum, G. (2007). YAGO: A core of semantic knowledge unifying WordNet and Wikipedia. In Proceedings of 16th international World Wide Web conference WWW’2007, (pp. 697–706), May, Banff, Alberta, Canada: ACM Press. es_ES
dc.description.references Tjong Kim Sang, E., & Hofmann, K. (2007). Automatic extraction of Dutch hypernym–hyponym pairs. In Proceedings of CLIN-2006, Leuven, Belgium. es_ES
dc.description.references Toral, A., Munoz, R., & Monachini, M. (2008). Named entity WordNet. In Proceedings of the Sixth international conference on language resources and evaluation (LREC’08), Marrakech, Morocco. es_ES
dc.description.references Vossen, P. (Ed.). (1998). EuroWordNet, a multilingual database with lexical semantic networks. The Netherlands: Kluwer. es_ES
dc.description.references Wagner, A. (2005). Learning thematic role relations for lexical semantic nets. Ph.D. Thesis, University of Tübingen, 2005. es_ES


Este ítem aparece en la(s) siguiente(s) colección(ones)

Mostrar el registro sencillo del ítem