(2014). Time-related quality dimensions in linked data. (Doctoral thesis, Università degli Studi di Milano-Bicocca, 2014).
Time-related quality dimensions in linked data
RULA, ANISA
2014
Abstract
Over the last few years, there has been an increasing diffusion of Linked Data as a standard way to publish interlinked structured data on the Web, which allows users and public and private organizations to fully exploit a large amount of data from several domains that was not available in the past. Although gathering and publishing such a massive amount of structured data is certainly a step in the right direction, quality still poses a significant obstacle to the uptake of data consumption applications at large scale. A crucial aspect of quality concerns the dynamic nature of Linked Data, where information can change rapidly and fail to reflect changes in the real world, thus becoming outdated. Quality is characterised by different dimensions that capture several aspects of quality, such as accuracy, currency, consistency or completeness. In particular, the aspects of Linked Data dynamicity are captured by Time-Related Quality Dimensions such as data currency. The assessment of Time-Related Quality Dimensions, which is the task of measuring the quality, is based on temporal information whose collection poses several challenges regarding its availability, representation and diversity in Linked Data. The assessment of Time-Related Quality Dimensions supports data consumers in deciding whether information is valid or not. The main goal of this thesis is to develop techniques for assessing Time-Related Quality Dimensions in Linked Data, which must overcome several challenges posed by Linked Data, such as third-party applications, variety of data, high volume of data or velocity of data.
The major contributions of this thesis can be summarized as follows: it presents a general set of definitions for quality dimensions and measures adopted in Linked Data; it provides a large-scale analysis of approaches for representing temporal information in Linked Data; it provides a sharable and interoperable conceptual model which integrates vocabularies used to represent temporal information required for the assessment of Time-Related Quality Dimensions; it proposes two domain-independent techniques to assess data currency that work with incomplete or inaccurate temporal information; and finally it provides an approach that enriches information with time intervals representing its temporal validity.

| File | Description | Attachment type | Access | Size | Format |
|---|---|---|---|---|---|
| phd_unimib_060772.pdf | Doctoral thesis | Doctoral thesis | Open access | 2.73 MB | Adobe PDF |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.