Guida, G., Mauri, G. (1984). A formal basis for performance evaluation of natural language understanding systems. Computational Linguistics, 10(1), 15-30.
A formal basis for performance evaluation of natural language understanding systems
Mauri, G
1984
Abstract
The task of evaluating the performance of a natural language understanding system, despite its widely recognized relevance, is still poorly defined. It mostly relies on intuitive reasoning and lacks a sound theoretical foundation. This paper presents a formal and quantitative proposal for this task. In particular, a measure of performance that allows the basic input-output characteristics of a system to be evaluated is first introduced at an abstract level. Concrete measures are then defined by assigning actual values to the functional parameters of the abstract definition; some particular cases are shown and discussed in detail. Finally, the task of measuring performance in practice is considered, and a model for experimental performance evaluation is presented. Comparison with related work is also briefly discussed; open problems and promising directions for future research are outlined. A limited case study applying the proposed model is presented in the appendix.

File | Access | Attachment type | Size | Format |
---|---|---|---|---|
R09-CompLing84.pdf | Open access | Publisher’s Version (Version of Record, VoR) | 1.2 MB | Adobe PDF |