From Code to Concept: A Semantic Approach to AI Innovation Discovery in Open Source Software Repositories

Belotti F.; Palmonari M.
2025

Abstract

Artificial Intelligence (AI) is a transformative force driving innovation, yet tracking AI-related advancements remains challenging due to the rapid pace of development and unstructured data from platforms like GitHub. This paper proposes an AI-driven approach to innovation detection, leveraging GitHub as a data source to systematically identify and link AI projects to organizations. Key contributions include a domain-specific taxonomy comprising 7,490 AI topics, a modular pipeline for semantic annotation and entity linking, and a trend detection framework based on Singular Spectrum Analysis (SSA). A knowledge graph is constructed to represent relationships among AI topics, projects, and companies, thereby enabling structured innovation tracking. The approach addresses challenges such as data sparsity and noise, demonstrating strengths in semantic annotation and topic categorization. Results highlight the potential for accurately detecting AI innovations and linking them to organizational entities, offering valuable insights for researchers, companies, and policymakers. This work contributes a scalable, automated approach for AI innovation tracking, with future directions focusing on refining entity linking and expanding the knowledge graph to capture emerging trends.
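The abstract names Singular Spectrum Analysis (SSA) as the basis of the trend detection framework. As a rough illustration of the general technique only, and not the authors' implementation, the sketch below performs a basic rank-1 SSA reconstruction in plain Python: it embeds a series into a trajectory matrix, finds the leading singular direction by power iteration, and averages the anti-diagonals back into a smoothed series. The function name `ssa_trend`, the `window` parameter, and the sample counts are all hypothetical and do not come from the paper.

```python
import math


def ssa_trend(series, window, iters=200):
    """Rank-1 SSA reconstruction, used here as a rough trend estimate.

    Illustrative sketch only. Assumes a non-negative series (e.g. counts),
    so that the all-ones start vector is not orthogonal to the leading
    singular vector.
    """
    n = len(series)
    if not 2 <= window <= n - 1:
        raise ValueError("window must satisfy 2 <= window <= len(series) - 1")
    k = n - window + 1

    # 1. Embedding: trajectory (Hankel) matrix X with X[i][j] = series[i + j].
    X = [[float(series[i + j]) for j in range(k)] for i in range(window)]

    # 2. Leading left singular vector u of X, via power iteration on X X^T.
    S = [[sum(X[a][j] * X[b][j] for j in range(k)) for b in range(window)]
         for a in range(window)]
    u = [1.0 / math.sqrt(window)] * window
    for _ in range(iters):
        w = [sum(S[a][b] * u[b] for b in range(window)) for a in range(window)]
        norm = math.sqrt(sum(v * v for v in w))
        if norm == 0.0:
            break  # e.g. an all-zero series; keep the current u
        u = [v / norm for v in w]

    # 3. Rank-1 approximation X1 = u (u^T X); proj holds the row vector u^T X.
    proj = [sum(u[i] * X[i][j] for i in range(window)) for j in range(k)]

    # 4. Diagonal averaging: average X1 over each anti-diagonal i + j = t.
    sums = [0.0] * n
    counts = [0] * n
    for i in range(window):
        for j in range(k):
            sums[i + j] += u[i] * proj[j]
            counts[i + j] += 1
    return [s / c for s, c in zip(sums, counts)]


# Hypothetical monthly counts of repositories tagged with one AI topic.
monthly_counts = [3, 4, 4, 6, 5, 8, 9, 9, 12, 11, 14, 16]
trend = ssa_trend(monthly_counts, window=4)
```

The window length controls how much smoothing is applied. A full SSA would keep several singular components and group them into trend, periodic, and noise parts; keeping only the first is the simplest case.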
Type: Journal article - Scientific article
Keywords: Artificial intelligence; data mining; text mining; time series analysis
Language: English
Date: 17 Jul 2025
Year: 2025
Volume: 13
First page: 129995
Last page: 130014
Access: open
Novalija, I., Roman, D., Belotti, F., Alexiev, V., Rei, L., Avogadro, R., et al. (2025). From Code to Concept: A Semantic Approach to AI Innovation Discovery in Open Source Software Repositories. IEEE ACCESS, 13, 129995-130014 [10.1109/ACCESS.2025.3590135].
Files in this item:
File: Novalija et al-2025-IEEE Access-VoR.pdf
Access: open access
Attachment type: Publisher's Version (Version of Record, VoR)
License: Creative Commons
Size: 3.8 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10281/583202
Citations
  • Scopus 1
  • Web of Science 1