This README.txt file was generated on 2023-05-23 by Josep Domenech

-------------------
GENERAL INFORMATION
-------------------

Title of Dataset: Website indicators on textile companies in the Comunitat Valenciana region

Author Information

Principal Investigator:
    Josep Domenech, Universitat Politecnica de Valencia, Cami de Vera s/n, 46022 Valencia. Spain. jdomenech@upvnet.upv.es, http://orcid.org/0000-0002-7302-5810

Associate or Co-investigator:
    Ana Garcia Bernabeu, Universitat Politecnica de Valencia, Plaza Ferrándiz y Carbonell, s/n, 03801 Alcoi. Spain. angarber@esp.upv.es, http://orcid.org/0000-0003-3181-7745
    Pablo Diaz Garcia, Universitat Politecnica de Valencia, Plaza Ferrándiz y Carbonell, s/n, 03801 Alcoi. Spain. pdiazga@txp.upv.es, http://orcid.org/0000-0002-7093-6061

Date of data collection: 2023-05-23

Geographic location of data collection: Comunitat Valenciana

Information about funding sources or sponsorship that supported the collection of the data: This work was supported by grant PID2019-107765RB-I00 (funded by MCIN/AEI/10.13039/501100011033).

General description: Sustainability indicators on website content after scraping textile companies in the Comunitat Valenciana region. Websites were scraped between October 2021 and January 2022.

Keywords: sustainability, web scraping, textile industry.

--------------------------
SHARING/ACCESS INFORMATION
--------------------------

Open Access to data: Open

Date end Embargo: Open

Licenses/restrictions placed on the data, or limitations of reuse: ODC-By

Citation for and links to publications that cite or use the data: To be published.

Links/relationships to previous or related data sets: None

Links to other publicly accessible locations of the data: None

--------------------
DATA & FILE OVERVIEW
--------------------

File list: webdata.csv

Relationship between files:

Type of version of the dataset: Raw indicators retrieved from the Web Economics Indicators framework.
Total size: 86KB

--------------------------
METHODOLOGICAL INFORMATION
--------------------------

Description of methods used for collection/generation of data: Data collected with the same methodology as in https://doi.org/10.1007/s10479-023-05306-5

Methods for processing the data: See https://doi.org/10.1007/s10479-023-05306-5

Software- or Instrument-specific information needed to interpret the data, including software and hardware version numbers: CSV standard format

--------------------------
DATA-SPECIFIC INFORMATION
--------------------------

Number of variables: 79

Number of cases/rows: 454

Variable list, defining any abbreviations, units of measure, codes or symbols used:

    url: URL of the company website homepage.
    href_ext_pdf: Number of files with PDF extension found on the href of the analyzed web pages.
    htmltags_xxx: Number of times that the xxx HTML tag was found on the scraped website (combining all pages).
    keywords_xxx: Takes value 1 if the literal xxx appeared on the text content of the website.
    words_xxx: Takes value 1 if the word xxx appeared on the text content of the website.
    sizehtml: Size in bytes of the HTMLs in the website.
    sizehtml: Size in bytes of the text in the website.
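
Example of reading the file: the following is a minimal sketch, not part of the original data pipeline, of how webdata.csv might be loaded and its columns grouped into the variable families listed above. It assumes a comma-delimited, UTF-8 file with a header row. The function names (load_indicators, group_columns, companies_with_flag) are hypothetical helpers introduced here for illustration only.

```python
import csv
from collections import defaultdict

# Column-name prefixes taken from the variable list above.
# Columns that match none of them (e.g. url, sizehtml) are grouped as "other".
PREFIXES = ("href_ext_", "htmltags_", "keywords_", "words_")


def load_indicators(path):
    """Read the CSV and return (header, rows); each row is a dict of strings.

    Assumes a comma delimiter, UTF-8 encoding and a header row.
    """
    with open(path, newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f)
        rows = list(reader)
        return list(reader.fieldnames or []), rows


def group_columns(header):
    """Group column names by variable family (href_ext, htmltags, ...)."""
    groups = defaultdict(list)
    for name in header:
        for prefix in PREFIXES:
            if name.startswith(prefix):
                groups[prefix.rstrip("_")].append(name)
                break
        else:
            # No known prefix matched this column.
            groups["other"].append(name)
    return dict(groups)


def companies_with_flag(rows, column):
    """Return the URLs of the rows whose 0/1 indicator column equals 1."""
    return [
        row["url"]
        for row in rows
        if (row.get(column) or "").strip() in ("1", "1.0")
    ]
```

A typical session would call load_indicators("webdata.csv"), compare len(rows) and len(header) against the 454 rows and 79 variables stated above, and then pass the name of any keywords_xxx or words_xxx column present in the header to companies_with_flag.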