dc.contributor.author | Ballarin, Manuel | es_ES |
dc.contributor.author | Marcén, Ana C. | es_ES |
dc.contributor.author | Pelechano Ferragud, Vicente | es_ES |
dc.contributor.author | Cetina, Carlos | es_ES |
dc.date.accessioned | 2021-09-02T03:31:25Z | |
dc.date.available | 2021-09-02T03:31:25Z | |
dc.date.issued | 2021-01 | es_ES |
dc.identifier.issn | 0950-5849 | es_ES |
dc.identifier.uri | http://hdl.handle.net/10251/171217 | |
dc.description.abstract | [EN] Context: Leveraging machine learning techniques to address feature location on models has been gaining attention. Machine learning techniques empower software product companies to take advantage of the knowledge and the experience to improve the performance of the feature location process. Most of the machine learning-based works for feature location on models report the machine learning techniques and the tuning parameters in detail. However, these works focus on the size and the distribution of the data sets, neglecting the properties of their contents. Objective: In this paper, we analyze the influence of three model fragment properties (density, multiplicity, and dispersion) on a machine learning-based approach for feature location. Method: The analysis of these properties is based on an industrial case provided by CAF, a worldwide provider of railway solutions. The test cases were evaluated through a machine learning technique that uses different subsets of a knowledge base to learn how to locate unknown features. Results: Results show that the density and dispersion properties have a direct impact on the results. In our case study, the model fragments with extra-small density values achieve results with up to 43% more precision, 41% more recall, 42% more F-measure, and 0.53 more Matthews Correlation Coefficient (MCC) than the model fragments with other density values. On the other hand, the model fragments with extra-small and small dispersion values achieve results with up to 53% more precision, 52% more recall, 52% more F-measure, and 0.57 more MCC than the model fragments with other dispersion values. Conclusions: The analysis of the results shows that both density and dispersion properties significantly influence the results. 
These results can serve not only to improve the reports by means of the model fragment properties, but also to compare machine learning-based feature location approaches fairly, improving the feature location results. | es_ES |
dc.description.sponsorship | This work has been partially supported by the Ministry of Economy and Competitiveness (MINECO), Spain through the Spanish National R+D+i Plan and ERDF funds under the Project ALPS (RTI2018-096411-B-I00). We also thank the ITEA3 15010 REVaMP2 Project and ACIF/2018/171. | es_ES |
dc.language | English | es_ES |
dc.publisher | Elsevier | es_ES |
dc.relation.ispartof | Information and Software Technology | es_ES |
dc.rights | Attribution - NonCommercial - NoDerivatives (by-nc-nd) | es_ES |
dc.subject | Model fragment location | es_ES |
dc.subject | Feature location | es_ES |
dc.subject | Machine learning | es_ES |
dc.subject | Learning to rank | es_ES |
dc.subject.classification | COMPUTER LANGUAGES AND SYSTEMS | es_ES |
dc.title | On the influence of model fragment properties on a machine learning-based approach for feature location | es_ES |
dc.type | Article | es_ES |
dc.identifier.doi | 10.1016/j.infsof.2020.106430 | es_ES |
dc.relation.projectID | info:eu-repo/grantAgreement/AEI/Plan Estatal de Investigación Científica y Técnica y de Innovación 2017-2020/RTI2018-096411-B-I00/ES/ASISTENTES EVOLUTIVOS INTELIGENTES PARA INICIAR LINEAS DE PRODUCTO SOFTWARE/ | es_ES |
dc.relation.projectID | info:eu-repo/grantAgreement/GVA//ACIF%2F2018%2F171/ | es_ES |
dc.rights.accessRights | Open | es_ES |
dc.contributor.affiliation | Universitat Politècnica de València. Departamento de Sistemas Informáticos y Computación - Departament de Sistemes Informàtics i Computació | es_ES |
dc.description.bibliographicCitation | Ballarin, M.; Marcén, AC.; Pelechano Ferragud, V.; Cetina, C. (2021). On the influence of model fragment properties on a machine learning-based approach for feature location. Information and Software Technology. 129:1-19. https://doi.org/10.1016/j.infsof.2020.106430 | es_ES |
dc.description.accrualMethod | S | es_ES |
dc.relation.publisherversion | https://doi.org/10.1016/j.infsof.2020.106430 | es_ES |
dc.description.upvformatpinicio | 1 | es_ES |
dc.description.upvformatpfin | 19 | es_ES |
dc.type.version | info:eu-repo/semantics/publishedVersion | es_ES |
dc.description.volume | 129 | es_ES |
dc.relation.pasarela | S\418275 | es_ES |
dc.contributor.funder | Generalitat Valenciana | es_ES |
dc.contributor.funder | Agencia Estatal de Investigación | es_ES |
dc.contributor.funder | European Regional Development Fund | es_ES |