Background. Several recent software engineering studies use data mined from the version control systems of software projects. However, inspecting the data and the statistical methods used in those studies reveals several problems with the current approach, mainly related to the dependent nature of the data. Objective. We analyze time-dependent software engineering data at the commit level and propose an alternative approach based on time series analysis. Method. We identified statistical tests designed for time series analysis and propose a technique to model time-dependent data, similar to what is done in finance and weather forecasting. We applied our approach to a small set of projects of different sizes, investigating the behavior of the SQALE Index in order to highlight the time dependency and interdependency of the commits. Results. Using these techniques, we analyzed and modeled the data, showing that this type of commit data can be investigated with methods from time series analysis. Conclusion. Based on the promising results, we plan to validate the robustness of the approach by replicating previous works.
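The dependence between consecutive commits that the abstract refers to can be illustrated with a lag-1 autocorrelation check. The abstract does not name the specific tests or models used, so the following is only a minimal sketch under stated assumptions: the series is synthetic (a random walk standing in for a commit-level metric such as the SQALE Index), and the helper name `autocorrelation` is hypothetical.

```python
import random


def autocorrelation(series, lag):
    """Sample autocorrelation of `series` at a positive lag."""
    n = len(series)
    if not 0 < lag < n:
        raise ValueError("lag must be between 1 and len(series) - 1")
    mean = sum(series) / n
    variance = sum((x - mean) ** 2 for x in series)
    if variance == 0:
        raise ValueError("series is constant; autocorrelation is undefined")
    covariance = sum(
        (series[i] - mean) * (series[i + lag] - mean) for i in range(n - lag)
    )
    return covariance / variance


# Synthetic data (an assumption, not taken from the paper): every commit
# shifts the metric by a small random amount, so each value builds on the
# previous one.
rng = random.Random(0)
changes = [rng.gauss(0, 1) for _ in range(500)]
metric = []
total = 100.0
for delta in changes:
    total += delta
    metric.append(total)

r_metric = autocorrelation(metric, 1)    # metric value per commit
r_changes = autocorrelation(changes, 1)  # change introduced per commit
bound = 1.96 / len(metric) ** 0.5        # approx. 95% band for independent data

print(f"lag-1 autocorrelation of the metric:  {r_metric:.3f}")
print(f"lag-1 autocorrelation of the changes: {r_changes:.3f}")
print(f"approximate independence band:        +/-{bound:.3f}")
```

A lag-1 autocorrelation far outside the band suggests that the observations are not independent, in which case tests that assume independent samples are not appropriate for the raw series.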
Saarimaki, N., Moreschini, S., Lomio, F., Penaloza, R., Lenarduzzi, V. (2022). Towards a Robust Approach to Analyze Time-Dependent Data in Software Engineering. In Proceedings - 2022 IEEE International Conference on Software Analysis, Evolution and Reengineering, SANER 2022 (pp.36-40). IEEE COMPUTER SOC [10.1109/SANER53432.2022.00015].
Towards a Robust Approach to Analyze Time-Dependent Data in Software Engineering
Penaloza R.;
2022
File: Saarimaki-2022-SANER-VoR.pdf
Access: archive managers only
Attachment type: Publisher's Version (Version of Record, VoR)
License: All rights reserved
Size: 1.34 MB
Format: Adobe PDF