Research in innovation usually builds on conventional data such as balance sheets, surveys, patents, or product catalogs. This paper explores unconventional data, specifically web-scraped data, as an information source for innovation studies, and proposes a careful procedure to establish the veracity of the linkage between web-based data and firm-level information retrieved from conventional sources. The study examines a sample of Italian manufacturing small and medium enterprises (SMEs) active in 2016, comprising both innovative and non-innovative firms. It is based on HTML tags, whereas most of the previous literature worked on web-page text and its semantics. Our paper provides evidence that the way HTML is used to build a corporate website reveals the capabilities of the firm that owns it, helping to distinguish innovative from non-innovative SMEs.
Bottai, C., Crosato, L., Domenech, J., Guerzoni, M., Liberati, C. (2023). Can websites reveal a firm’s innovativeness? Empirical evidence on Italian manufacturing SMEs. Paper presented at: CARMA-International Conference on Advanced Research Methods and Analytics, Universidad de Sevilla.
Can websites reveal a firm’s innovativeness? Empirical evidence on Italian manufacturing SMEs
Bottai, C; Guerzoni, M; Liberati, C
2023