This study focuses on the segmentation of handwritten ink in historical documents using hyperspectral imaging in two spectral ranges (visible and near-infrared). Binarization is useful as a pre-processing step for material identification using the reflectance spectra. To showcase the challenges of using hyperspectral imaging, classical single-band (Howe and Sauvola) and deep learning-based algorithms (DeepLabv3, SAM, DINOv2) are compared. For algorithms that take a single image as input, a procedure is presented to select the optimal band for binarization. The deep learning-based semantic segmentation algorithm DeepLabv3 uses the full spectrum instead. A hyperspectral database encompassing 226 samples is introduced as a benchmark to compare the performance of the algorithms. The study also introduces a novel semi-automatic method for generating ground truths, which are needed for computing performance metrics. DeepLabv3 performs on par with the best traditional algorithm in both ranges, but overall, it offers more consistent and reliable results. DINOv2 demonstrates good semantic understanding in separating foreground and background but suffers from limited spatial resolution. Conversely, SAM excels at capturing fine details but lacks the ability to identify text regions. The binarization quality obtained with three-channel images is also assessed, generally resulting in lower average performance. Our findings contribute to the advancement of technologies for the analysis of text in documents of historical interest.

Buzzelli, M., Moronta-Montero, F., Fernández-Gualda, R., López-Baldomero, A., Nieves, J., Valero, E. (2025). Handwritten ink segmentation algorithms for hyperspectral images of historical documents. MULTIMEDIA TOOLS AND APPLICATIONS, 84(32), 39551-39575 [10.1007/s11042-025-20945-5].

Handwritten ink segmentation algorithms for hyperspectral images of historical documents

Buzzelli M.; 2025

Type: Journal article - Scientific article
Keywords: Historical documents; Hyperspectral imaging; Image binarization
Language: English
Date: 5 June 2025
Year: 2025
Volume: 84
Issue: 32
First page: 39551
Last page: 39575
Access: reserved
Files in this record:
File: Buzzelli et al-2025-Multimed Tools Appl-VoR.pdf
Access: archive managers only
Attachment type: Publisher's Version (Version of Record, VoR)
License: All rights reserved
Size: 2.23 MB
Format: Adobe PDF


Use this identifier to cite or link to this document: https://hdl.handle.net/10281/588442