On the influence of model fragment properties on a machine learning-based approach for feature location

Ballarin, Manuel; Marcén, Ana C.; Pelechano Ferragud, Vicente; Cetina, Carlos

doi:10.1016/j.infsof.2020.106430

Files in this item

Name: BallarinMarcenPel ...

Size: 3.192Mb

Format: PDF

Description: Author's version.

Name: IST_2020_Editorial.pdf

Size: 1.795Mb

Format: PDF

Description: Publisher's version

dc.contributor.author	Ballarin, Manuel	es_ES
dc.contributor.author	Marcén, Ana C.	es_ES
dc.contributor.author	Pelechano Ferragud, Vicente	es_ES
dc.contributor.author	Cetina, Carlos	es_ES
dc.date.accessioned	2021-09-02T03:31:25Z
dc.date.available	2021-09-02T03:31:25Z
dc.date.issued	2021-01	es_ES
dc.identifier.issn	0950-5849	es_ES
dc.identifier.uri	http://hdl.handle.net/10251/171217
dc.description.abstract	[EN] Context: Leveraging machine learning techniques to address feature location on models has been gaining attention. Machine learning techniques empower software product companies to take advantage of the knowledge and the experience to improve the performance of the feature location process. Most of the machine learning-based works for feature location on models report the machine learning techniques and the tuning parameters in detail. However, these works focus on the size and the distribution of the data sets, neglecting the properties of their contents. Objective: In this paper, we analyze the influence of three model fragment properties (density, multiplicity, and dispersion) on a machine learning-based approach for feature location. Method: The analysis of these properties is based on an industrial case provided by CAF, a worldwide provider of railway solutions. The test cases were evaluated through a machine learning technique that uses different subsets of a knowledge base to learn how to locate unknown features. Results: Results show that the density and dispersion properties have a direct impact on the results. In our case study, the model fragments with extra-small density values achieve results with up to 43% more precision, 41% more recall, 42% more F-measure, and 0.53 more Matthews Correlation Coefficient (MCC) than the model fragments with other density values. On the other hand, the model fragments with extra-small and small dispersion values achieve results with up to 53% more precision, 52% more recall, 52% more F-measure, and 0.57 more MCC than the model fragments with other dispersion values. Conclusions: The analysis of the results shows that both density and dispersion properties significantly influence the results. These results can serve not only to improve the reports by means of the model fragment properties, but also to be able to compare machine learning-based feature location approaches fairly improving the feature location results.	es_ES
dc.description.sponsorship	This work has been partially supported by the Ministry of Economy and Competitiveness (MINECO), Spain through the Spanish National R+D+i Plan and ERDF funds under the Project ALPS (RTI2018-096411-B-I00). We also thank the ITEA3 15010 REVaMP2 Project and ACIF/2018/171.	es_ES
dc.language	English	es_ES
dc.publisher	Elsevier	es_ES
dc.relation.ispartof	Information and Software Technology	es_ES
dc.rights	Attribution - NonCommercial - NoDerivatives (by-nc-nd)	es_ES
dc.subject	Model fragment location	es_ES
dc.subject	Feature location	es_ES
dc.subject	Machine learning	es_ES
dc.subject	Learning to rank	es_ES
dc.subject.classification	COMPUTER LANGUAGES AND SYSTEMS	es_ES
dc.title	On the influence of model fragment properties on a machine learning-based approach for feature location	es_ES
dc.type	Article	es_ES
dc.identifier.doi	10.1016/j.infsof.2020.106430	es_ES
dc.relation.projectID	info:eu-repo/grantAgreement/AEI/Plan Estatal de Investigación Científica y Técnica y de Innovación 2017-2020/RTI2018-096411-B-I00/ES/ASISTENTES EVOLUTIVOS INTELIGENTES PARA INICIAR LINEAS DE PRODUCTO SOFTWARE/	es_ES
dc.relation.projectID	info:eu-repo/grantAgreement/GVA//ACIF%2F2018%2F171/	es_ES
dc.rights.accessRights	Open	es_ES
dc.contributor.affiliation	Universitat Politècnica de València. Departamento de Sistemas Informáticos y Computación - Departament de Sistemes Informàtics i Computació	es_ES
dc.description.bibliographicCitation	Ballarin, M.; Marcén, AC.; Pelechano Ferragud, V.; Cetina, C. (2021). On the influence of model fragment properties on a machine learning-based approach for feature location. Information and Software Technology. 129:1-19. https://doi.org/10.1016/j.infsof.2020.106430	es_ES
dc.description.accrualMethod	S	es_ES
dc.relation.publisherversion	https://doi.org/10.1016/j.infsof.2020.106430	es_ES
dc.description.upvformatpinicio	1	es_ES
dc.description.upvformatpfin	19	es_ES
dc.type.version	info:eu-repo/semantics/publishedVersion	es_ES
dc.description.volume	129	es_ES
dc.relation.pasarela	S\418275	es_ES
dc.contributor.funder	Generalitat Valenciana	es_ES
dc.contributor.funder	Agencia Estatal de Investigación	es_ES
dc.contributor.funder	European Regional Development Fund	es_ES
dc.description.references	Marcén, A. C., Lapeña, R., Pastor, Ó., & Cetina, C. (2020). Traceability Link Recovery between Requirements and Models using an Evolutionary Algorithm Guided by a Learning to Rank Algorithm: Train control and management case. Journal of Systems and Software, 163, 110519. doi:10.1016/j.jss.2020.110519	es_ES
dc.description.references	Pérez, F., Font, J., Arcega, L., & Cetina, C. (2019). Collaborative feature location in models through automatic query expansion. Automated Software Engineering, 26(1), 161-202. doi:10.1007/s10515-019-00251-9	es_ES
dc.description.references	Zhuang, X., Engel, B. A., Lozano-Garcia, D. F., Fernández, R. N., & Johannsen, C. J. (1994). Optimization of training data required for neuro-classification. International Journal of Remote Sensing, 15(16), 3271-3277. doi:10.1080/01431169408954326	es_ES
dc.description.references	Foody, G. M., & Mathur, A. (2004). A relative evaluation of multiclass image classification by support vector machines. IEEE Transactions on Geoscience and Remote Sensing, 42(6), 1335-1343. doi:10.1109/tgrs.2004.827257	es_ES
dc.description.references	Foody, G. M., Mathur, A., Sanchez-Hernandez, C., & Boyd, D. S. (2006). Training set size requirements for the classification of a specific class. Remote Sensing of Environment, 104(1), 1-14. doi:10.1016/j.rse.2006.03.004	es_ES
dc.description.references	Weiss, G. M., & Provost, F. (2003). Learning When Training Data are Costly: The Effect of Class Distribution on Tree Induction. Journal of Artificial Intelligence Research, 19, 315-354. doi:10.1613/jair.1199	es_ES
dc.description.references	Buda, M., Maki, A., & Mazurowski, M. A. (2018). A systematic study of the class imbalance problem in convolutional neural networks. Neural Networks, 106, 249-259. doi:10.1016/j.neunet.2018.07.011	es_ES
dc.description.references	Arcuri, A., & Fraser, G. (2013). Parameter tuning or default values? An empirical investigation in search-based software engineering. Empirical Software Engineering, 18(3), 594-623. doi:10.1007/s10664-013-9249-9	es_ES
dc.description.references	Lapeña, R., Font, J., Pastor, Ó., & Cetina, C. (2017). Analyzing the impact of natural language processing over feature location in models. ACM SIGPLAN Notices, 52(12), 63-76. doi:10.1145/3170492.3136052	es_ES
dc.description.references	Shabtai, A., Moskovitch, R., Elovici, Y., & Glezer, C. (2009). Detection of malicious code by applying machine learning classifiers on static features: A state-of-the-art survey. Information Security Technical Report, 14(1), 16-29. doi:10.1016/j.istr.2009.03.003	es_ES
dc.description.references	Song, Q., Jia, Z., Shepperd, M., Ying, S., & Liu, J. (2011). A General Software Defect-Proneness Prediction Framework. IEEE Transactions on Software Engineering, 37(3), 356-370. doi:10.1109/tse.2010.90	es_ES
dc.description.references	Cao, Z., Tian, Y., Le, T.-D. B., & Lo, D. (2018). Rule-based specification mining leveraging learning to rank. Automated Software Engineering, 25(3), 501-530. doi:10.1007/s10515-018-0231-z	es_ES
dc.description.references	Arcuri, A., & Briand, L. (2012). A Hitchhiker’s guide to statistical tests for assessing randomized algorithms in software engineering. Software Testing, Verification and Reliability, 24(3), 219-250. doi:10.1002/stvr.1486	es_ES
dc.description.references	García, S., Fernández, A., Luengo, J., & Herrera, F. (2010). Advanced nonparametric tests for multiple comparisons in the design of experiments in computational intelligence and data mining: Experimental analysis of power. Information Sciences, 180(10), 2044-2064. doi:10.1016/j.ins.2009.12.010	es_ES
dc.description.references	Falessi, D., Di Penta, M., Canfora, G., & Cantone, G. (2016). Estimating the number of remaining links in traceability recovery. Empirical Software Engineering, 22(3), 996-1027. doi:10.1007/s10664-016-9460-6	es_ES
dc.description.references	Wang, J., Zhao, P., Hoi, S. C. H., & Jin, R. (2014). Online Feature Selection and Its Applications. IEEE Transactions on Knowledge and Data Engineering, 26(3), 698-710. doi:10.1109/tkde.2013.32	es_ES

This item appears in the following collection(s)

Artículos, conferencias, monografías [46097]


RiuNet: Institutional Repository of the Universidad Politécnica de Valencia
