Missing the missing values: The ugly duckling of fairness in machine learning

RiuNet: Institutional Repository of the Universidad Politécnica de Valencia

dc.contributor.author Martínez-Plumed, Fernando es_ES
dc.contributor.author Ferri Ramírez, César es_ES
dc.contributor.author Nieves, David es_ES
dc.contributor.author Hernández-Orallo, José es_ES
dc.date.accessioned 2022-04-05T06:55:42Z
dc.date.available 2022-04-05T06:55:42Z
dc.date.issued 2021-07 es_ES
dc.identifier.issn 0884-8173 es_ES
dc.identifier.uri http://hdl.handle.net/10251/181819
dc.description.abstract [EN] Nowadays, there is an increasing concern in machine learning about the causes underlying unfair decision making, that is, algorithmic decisions discriminating some groups over others, especially with groups that are defined over protected attributes, such as gender, race and nationality. Missing values are one frequent manifestation of all these latent causes: protected groups are more reluctant to give information that could be used against them, sensitive information for some groups can be erased by human operators, or data acquisition may simply be less complete and systematic for minority groups. However, most recent techniques, libraries and experimental results dealing with fairness in machine learning have simply ignored missing data. In this paper, we present the first comprehensive analysis of the relation between missing values and algorithmic fairness for machine learning: (1) we analyse the sources of missing data and bias, mapping the common causes, (2) we find that rows containing missing values are usually fairer than the rest, which should discourage the consideration of missing values as the uncomfortable ugly data that different techniques and libraries for handling algorithmic bias get rid of at the first occasion, (3) we study the trade-off between performance and fairness when the rows with missing values are used (either because the technique deals with them directly or by imputation methods), and (4) we show that the sensitivity of six different machine-learning techniques to missing values is usually low, which reinforces the view that the rows with missing data contribute more to fairness through the other, nonmissing, attributes. We end the paper with a series of recommended procedures about what to do with missing data when aiming for fair decision making. es_ES
dc.description.sponsorship Ministerio de Economía, Industria y Competitividad, Gobierno de España (ES), Grant/Award Number: RTI2018-094403-B-C3; Generalitat Valenciana, Grant/Award Number: PROMETEO/2019/09; Future of Life Institute, Grant/Award Number: RFP2-15; European Commission, Grant/Award Number: DG JRC - HUMAINT project es_ES
dc.language English es_ES
dc.publisher John Wiley & Sons es_ES
dc.relation.ispartof International Journal of Intelligent Systems es_ES
dc.rights Attribution (BY) es_ES
dc.subject Algorithmic bias es_ES
dc.subject Confirmation bias es_ES
dc.subject Data imputation es_ES
dc.subject Fairness es_ES
dc.subject Missing values es_ES
dc.subject Sample bias es_ES
dc.subject Survey bias es_ES
dc.subject.classification COMPUTER LANGUAGES AND SYSTEMS es_ES
dc.title Missing the missing values: The ugly duckling of fairness in machine learning es_ES
dc.type Article es_ES
dc.identifier.doi 10.1002/int.22415 es_ES
dc.relation.projectID info:eu-repo/grantAgreement/AEI/Plan Estatal de Investigación Científica y Técnica y de Innovación 2017-2020/RTI2018-094403-B-C31/ES/RAZONAMIENTO FORMAL PARA TECNOLOGIAS FACILITADORAS Y EMERGENTES/ es_ES
dc.relation.projectID info:eu-repo/grantAgreement/FLI//RFP2-152/ es_ES
dc.relation.projectID info:eu-repo/grantAgreement/EC/H2020/952215/EU es_ES
dc.relation.projectID info:eu-repo/grantAgreement/GVA//PROMETEO%2F2019%2F098//DEEPTRUST/ es_ES
dc.rights.accessRights Open es_ES
dc.contributor.affiliation Universitat Politècnica de València. Departamento de Sistemas Informáticos y Computación - Departament de Sistemes Informàtics i Computació es_ES
dc.description.bibliographicCitation Martínez-Plumed, F.; Ferri Ramírez, C.; Nieves, D.; Hernández-Orallo, J. (2021). Missing the missing values: The ugly duckling of fairness in machine learning. International Journal of Intelligent Systems. 36(7):3217-3258. https://doi.org/10.1002/int.22415 es_ES
dc.description.accrualMethod S es_ES
dc.relation.publisherversion https://doi.org/10.1002/int.22415 es_ES
dc.description.upvformatpinicio 3217 es_ES
dc.description.upvformatpfin 3258 es_ES
dc.type.version info:eu-repo/semantics/publishedVersion es_ES
dc.description.volume 36 es_ES
dc.description.issue 7 es_ES
dc.relation.pasarela S\456129 es_ES
dc.contributor.funder Generalitat Valenciana es_ES
dc.contributor.funder Future of Life Institute es_ES
dc.contributor.funder Agencia Estatal de Investigación es_ES
dc.contributor.funder Commission of the European Communities es_ES
dc.subject.ods 08.- Promote sustained, inclusive and sustainable economic growth, full and productive employment and decent work for all es_ES
dc.subject.ods 09.- Build resilient infrastructure, promote inclusive and sustainable industrialization and foster innovation es_ES

