dc.contributor.author | Ferrati, Francesco | es_ES |
dc.contributor.author | Muffatto, Moreno | es_ES |
dc.date.accessioned | 2020-07-30T10:44:33Z | |
dc.date.available | 2020-07-30T10:44:33Z | |
dc.date.issued | 2020-07-02 | |
dc.identifier.isbn | 9788490488324 | |
dc.identifier.uri | http://hdl.handle.net/10251/148975 | |
dc.description.abstract | [EN] In order to support equity investors in their decision-making process, researchers are exploring the potential of machine learning algorithms to predict the financial success of startup ventures. In this context, a key role is played by the significance of the data used, which should reflect most of the variables considered by investors in their screening and evaluation activity. This paper provides a detailed description of the data management process that can be followed to obtain such a dataset. Using Crunchbase as the main data source, other databases have been integrated to enrich the information content and support the feature engineering process. Specifically, the following sources have been considered: USPTO PatentsView, Kauffman Indicators of Entrepreneurship, Academic Ranking of World Universities, and the CB Insights ranking of top investors. The final dataset contains the profiles of 138,637 US-based ventures founded between 2000 and 2019. For each company, the elements assessed by equity investors have been analyzed. Among others, the following specific areas were considered: location, industry, founding team, intellectual property and funding round history. Data related to each area have been formalized in a series of features ready to be used in a machine learning context. | es_ES |
dc.language | English | es_ES |
dc.publisher | Editorial Universitat Politècnica de València | es_ES |
dc.rights | Attribution - NonCommercial - NoDerivatives (by-nc-nd) | es_ES |
dc.subject | Web data | es_ES |
dc.subject | Internet data | es_ES |
dc.subject | Big data | es_ES |
dc.subject | QCA | es_ES |
dc.subject | PLS | es_ES |
dc.subject | SEM | es_ES |
dc.subject | Conference | es_ES |
dc.subject | Crunchbase | es_ES |
dc.subject | Startup | es_ES |
dc.subject | Investments | es_ES |
dc.subject | Feature engineering | es_ES |
dc.subject | Data mining | es_ES |
dc.subject | Machine learning | es_ES |
dc.title | Setting Crunchbase for Data Science: Preprocessing, Data Integration and Feature Engineering | es_ES |
dc.type | Book chapter | es_ES |
dc.type | Conference paper | es_ES |
dc.identifier.doi | 10.4995/CARMA2020.2020.11633 | |
dc.rights.accessRights | Open | es_ES |
dc.description.bibliographicCitation | Ferrati, F.; Muffatto, M. (2020). Setting Crunchbase for Data Science: Preprocessing, Data Integration and Feature Engineering. Editorial Universitat Politècnica de València. 221-229. https://doi.org/10.4995/CARMA2020.2020.11633 | es_ES |
dc.description.accrualMethod | OCS | es_ES |
dc.relation.conferencename | CARMA 2020 - 3rd International Conference on Advanced Research Methods and Analytics | es_ES |
dc.relation.conferencedate | July 08-09, 2020 | es_ES |
dc.relation.conferenceplace | Valencia, Spain | es_ES |
dc.relation.publisherversion | http://ocs.editorial.upv.es/index.php/CARMA/CARMA2020/paper/view/11633 | es_ES |
dc.description.upvformatpinicio | 221 | es_ES |
dc.description.upvformatpfin | 229 | es_ES |
dc.type.version | info:eu-repo/semantics/publishedVersion | es_ES |
dc.relation.pasarela | OCS\11633 | es_ES |