
Big Data Matching Using the Identity Correlation Approach

RiuNet: Institutional repository of the Polytechnic University of Valencia

dc.contributor.author Smyth, Mary es_ES
dc.contributor.author McCormack, Kevin es_ES
dc.date.accessioned 2017-07-10T06:48:28Z
dc.date.available 2017-07-10T06:48:28Z
dc.date.issued 2016-10-10
dc.identifier.isbn 9788490484623
dc.identifier.uri http://hdl.handle.net/10251/84779
dc.description.abstract [EN] The Identity Correlation Approach (ICA) is a statistical technique developed for matching big data where a unique identifier does not exist. This technique was developed to match the Irish Census 2011 dataset to Central Government Administrative Datasets in order to attach a unique identifier to each individual person in the Census dataset (McCormack & Smyth, 2015). The unique identifier attached is the PPS No. (Personal Public Service No.). By attaching the PPS No. to the Census dataset, each individual can be linked to datasets held centrally by Public Sector Organisations. This expands the range of variables for statistical analysis at individual level. Statistical techniques developed here were undertaken for a major European Structure of Earnings Survey (SES) compiled by the CSO using administrative data only, thus eliminating the need for an expensive business survey to be conducted (NES, 2007). A description of how the Identity Correlation Approach was developed is given in this paper. Data matching results and conclusions are presented here in relation to the Structure of Earnings Survey (SES) results for 2011. es_ES
dc.format.extent 10 es_ES
dc.language English es_ES
dc.publisher Editorial Universitat Politècnica de València es_ES
dc.relation.ispartof CARMA 2016: 1st International Conference on Advanced Research Methods in Analytics es_ES
dc.rights Attribution - NonCommercial - NoDerivatives (by-nc-nd) es_ES
dc.subject web data es_ES
dc.subject internet data es_ES
dc.subject big data es_ES
dc.subject qca es_ES
dc.subject pls es_ES
dc.subject sem es_ES
dc.subject conference es_ES
dc.title Big Data Matching Using the Identity Correlation Approach es_ES
dc.type Book chapter es_ES
dc.type Conference paper es_ES
dc.identifier.doi 10.4995/CARMA2016.2015.2991
dc.rights.accessRights Open es_ES
dc.description.bibliographicCitation Smyth, M.; McCormack, K. (2016). Big Data Matching Using the Identity Correlation Approach. In CARMA 2016: 1st International Conference on Advanced Research Methods in Analytics. Editorial Universitat Politècnica de València. 46-55. doi:10.4995/CARMA2016.2015.2991. es_ES
dc.description.accrualMethod OCS es_ES
dc.relation.conferencename CARMA 2016 - 1st International Conference on Advanced Research Methods and Analytics es_ES
dc.relation.conferencedate July 06-07, 2016 es_ES
dc.relation.conferenceplace Valencia, Spain es_ES
dc.relation.publisherversion http://ocs.editorial.upv.es/index.php/CARMA/CARMA2016/paper/view/2991 es_ES
dc.description.upvformatpinicio 46 es_ES
dc.description.upvformatpfin 55 es_ES
dc.type.version info:eu-repo/semantics/publishedVersion es_ES
dc.relation.pasarela 2991 es_ES

