Paleardi, F., Novello, M. (2025). From words to categories: data quality and the methodological value of open-ended questions in classifying economic activities. QUALITY & QUANTITY [10.1007/s11135-025-02348-8].
From words to categories: data quality and the methodological value of open-ended questions in classifying economic activities
Paleardi, Federico; Novello, M.
2025
Abstract
This study critically discusses the data quality and practical utility of open-ended questions for classifying organisational economic activities in survey research. It addresses two primary questions. First, it examines whether the length of responses to open-ended questions about organisations' economic activities is associated with the semantic quality and richness of the information provided. Specifically, it investigates whether longer responses can be considered a meaningful indicator of the overall quality and informativeness of the data collected. Manual coding was used to derive a dichotomous variable representing response quality. Second, the study uses Structural Topic Modeling (STM) to analyse whether response length varies systematically across sectors of economic activity. The main findings indicate that the length of open-ended responses is not always correlated with high semantic quality. Both manual coding and STM analysis reveal that complex organisational activities tend to generate longer answers in order to be described adequately, while simpler ones can be described in few characters. The study contributes to survey methodology by showing how open-ended questions can capture nuanced organisational practices without raising respondent burden, thus enriching the precision and flexibility of economic classification. A key practical implication is that imposing a relatively low word limit on open-ended questions about organisational activities could be effective without significantly affecting response quality.

| File | Size | Format |
|---|---|---|
| Paleardi-Novello-2025-Qual Quant-VoR.pdf (open access). Description: Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License. Attachment type: Publisher’s Version (Version of Record, VoR). Licence: Creative Commons. | 1.39 MB | Adobe PDF |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.


