Domain Adaptation in Fine-grained Entity Typing

Vimercati, M

This Ph.D. thesis focuses on Entity Typing (ET), a subtask of entity extraction that corresponds to its classification step. ET is the task of assigning types to an entity mention already identified in a given sentence, where the types must be chosen from a given vocabulary. Fine-grained Entity Typing (FET) is an ET problem with a large number of types (from dozens to hundreds), usually organized in a hierarchy. In this thesis, FET is studied through the lens of domain adaptation, which defines a novel task. The problem has not been addressed before in FET, since the literature focuses on proposing sophisticated encoders, classifiers, and denoising methods for in-domain FET. Domain adaptation is needed when a trained FET model, able to classify mentions found in a given sentence with types from a given hierarchy, has to be used in a domain that differs from the training one. It covers real use cases (model reuse, model specialization, and full-fledged domain adaptation), which are formally described and investigated in the thesis. Domain adaptation is generally underinvestigated in information extraction, even though these scenarios are common in the design of real-world information extraction systems. Adapting from a source to a target FET domain depends heavily on the types present in the source and target type hierarchies, since during adaptation the FET model is expected to learn the new target types by exploiting its capabilities on the source types. The adaptation techniques proposed in this thesis are based on the explicit identification of equivalence, generalization, specialization, or disjunction relations between source and target types; these relations are injected during the adaptation process with a neuro-symbolic integration approach from the literature, namely Knowledge Enhanced Neural Network (KENN).
Moreover, the approaches proposed to tackle domain adaptation in FET represent the first attempt to use a neuro-symbolic integration technique in FET. The entity typing literature is reviewed, highlighting the additional information (type hierarchy, annotation co-occurrence, and entity linking) used by FET approaches. The domain adaptation literature is also reviewed, and a formal definition of domain in FET is proposed. In the experiments, common FET benchmarks are used to train a FET model based on BERT and Adapters. The adaptation techniques are then used to adapt the trained models to low-resource target domains. The experiments show that the proposed techniques successfully inject type relations during training on the target domain, improving performance in the model specialization and full-fledged domain adaptation scenarios. In particular, injecting the equivalence between source and target types during training proved effective in favoring the learning of new types in the full-fledged domain adaptation scenario. Finally, a denoising technique to mitigate annotation noise in the training dataset is proposed and evaluated in the in-domain and model reuse scenarios.
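The four relation kinds named in the abstract (equivalence, generalization, specialization, disjunction) can be read as logical clauses over source and target types. The following is a minimal sketch of that reading, together with a simplified clause-based adjustment of a model's pre-activations in the spirit of KENN. It is not the thesis implementation: the function names `clauses_for` and `enhance`, the type names, the fixed `weight`, and the direction chosen for generalization and specialization are all assumptions made for illustration.

```python
import math

# A literal is (type_name, is_positive); a clause is a disjunction of literals.
# Hypothetical encoding of the four relation kinds as clauses (an assumption,
# not taken from the thesis).
def clauses_for(relation, source, target):
    if relation == "equivalence":       # source <-> target
        return [[(source, False), (target, True)],
                [(target, False), (source, True)]]
    if relation == "generalization":    # assumed: target is broader, source -> target
        return [[(source, False), (target, True)]]
    if relation == "specialization":    # assumed: target is narrower, target -> source
        return [[(target, False), (source, True)]]
    if relation == "disjunction":       # source and target cannot both hold
        return [[(source, False), (target, False)]]
    raise ValueError(f"unknown relation: {relation}")

def enhance(preacts, clauses, weight=1.0):
    """Add a softmax-weighted boost per clause to the pre-activations.

    Simplified sketch: each clause pushes its literals towards satisfaction,
    proportionally to a softmax over the signed pre-activations. In a trained
    system the clause weight would be learned; here it is a fixed assumption.
    """
    out = dict(preacts)
    for clause in clauses:
        # Negated literals contribute with the opposite sign.
        signed = [preacts[name] if pos else -preacts[name] for name, pos in clause]
        top = max(signed)
        exps = [math.exp(v - top) for v in signed]
        total = sum(exps)
        for (name, pos), e in zip(clause, exps):
            delta = weight * e / total
            out[name] += delta if pos else -delta
    return out

# Usage: a confident source type raises the score of an equivalent target type.
preacts = {"source/person": 2.0, "target/human": -1.0}
boosted = enhance(preacts, clauses_for("equivalence", "source/person", "target/human"))
```

In this sketch the deltas of all clauses are computed from the original pre-activations and summed, so the order of the clauses does not matter.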

This doctoral thesis focuses on a subtask of Entity Extraction, specifically of its classification step, called Entity Typing (ET). ET consists of assigning types to an entity found in a given sentence. The types must be chosen from a given vocabulary. Fine-grained Entity Typing (FET) is the name given to an ET problem with a large number of types (from dozens to hundreds), usually organized in a hierarchy. In this thesis, FET is studied from the point of view of adaptation to a new domain, which defines a new task. The problem has not been addressed before in FET, since the literature concentrates on proposing sophisticated encoders, classifiers, and noise-removal methods applied to FET while keeping the domain fixed. Adaptation to a new domain is needed when a trained FET model, able to classify the mentions found in a given sentence with types drawn from a given hierarchy, has to be used in a domain different from the training one. Adaptation to a new domain concerns real use cases (model reuse, model specialization, and full domain adaptation), which are formally described and investigated in the thesis. It is generally underinvestigated in Information Extraction, even though these adaptation scenarios are common in the design of real-world information extraction systems. Adaptation from a source domain to a target FET domain is closely tied to the types present in the source and target type hierarchies, since during the adaptation process the FET model is expected to learn the new target types by exploiting its capabilities on the source types.
The adaptation techniques proposed in this thesis are based on the explicit identification of equivalence, generalization, specialization, or disjunction relations between source and target types; these relations are injected during the adaptation process with a neuro-symbolic integration approach taken from the literature, namely Knowledge Enhanced Neural Network (KENN). Moreover, the approaches proposed to tackle adaptation to a new domain are the first attempt to use a neuro-symbolic integration technique in FET. The thesis reviews the literature on Entity Typing, highlighting the additional information (type hierarchy, co-occurrence in the annotations, and entity linking) used by Entity Typing techniques. The literature on adaptation to a new domain is also reviewed, and a formal definition of domain in Entity Typing is proposed. In the experiments, the benchmark datasets commonly used in Entity Typing are used to train an Entity Typing model based on BERT and Adapters. The proposed adaptation techniques are then used to adapt the trained models to new domains where only limited resources are available, showing that they successfully inject the relations between types during training on the target domain and improve performance in the model specialization and full domain adaptation cases. In particular, injecting the equivalence between source and target types during training proved effective in favoring the learning of new types in the full domain adaptation scenario. Finally, a noise-mitigation technique is proposed to reduce the noise present in the annotations and is applied to the model reuse and fixed-domain cases.

Vimercati, M. (2023). Domain Adaptation in Fine-grained Entity Typing. (Doctoral thesis, Università degli Studi di Milano-Bicocca, 2023).