Tessa, C., Bochicchio, M., Fontana, F. (2026). Exploring Architectural Smells Detection Through LLMs. In Software Architecture: 19th European Conference, ECSA 2025, Limassol, Cyprus, September 15–19, 2025, Proceedings (pp. 90-98). Springer Science and Business Media Deutschland GmbH [10.1007/978-3-032-02138-0_6].
Exploring Architectural Smells Detection Through LLMs
Bochicchio, Matteo (second author); Fontana, Francesca Arcelli (last author)
2026
Abstract
Architectural smells (AS) are design flaws in software systems that, if left unaddressed, can negatively impact maintainability and system evolution. This preliminary study investigates the use of Large Language Models (LLMs) for detecting and explaining an AS called Hub-like Dependency (HL), a critical smell type characterized by components with numerous incoming and outgoing dependencies. The research leverages Google’s Gemini 1.5 Pro, comparing its performance to Arcan, a specialized AS detection tool. The study analyzes 135 AS across 39 open-source Java projects, including 100 Hub-like Dependency smells with varying severity levels and 35 non-Hub-like Dependency smells. Results show that the LLM achieves 100% recall but varying precision, with more detailed prompts improving detection performance from 64% to 82% for lower-severity smells. However, the model’s ability to generate human-understandable explanations remains limited, with only 49% of the generated explanations rated as satisfactory. These findings highlight both the potential and current limitations of one specific LLM in HL smell detection, suggesting the importance of prompt design in enhancing its capabilities.

| File | Access | Attachment type | License | Size | Format |
|---|---|---|---|---|---|
| Tessa-2025-ECSA-VoR.pdf | Archive managers only | Publisher’s Version (Version of Record, VoR) | All rights reserved | 241.97 kB | Adobe PDF |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.


