Consider a two-way contingency table displaying the joint distribution of two categorical variables X and Y. A frequent need in the analysis of this kind of data is to group and collapse some rows of the table in such a way that the collapsed table preserves as much of the association between X and Y as possible. Solving this problem exactly requires computing a measure of association (for instance, the likelihood-ratio statistic G²) for every possible grouping of the rows of the original table. Unfortunately, this is hardly feasible when the number of categories to be grouped is large; hence approximate solutions must be sought (see Siatkowski and Krajewsky, 1989). We propose a new method, based on the logic of genetic algorithms, to determine such approximations. The proposed algorithm can be applied whether the number of groups in the collapsed table is fixed or variable. The performance of the algorithm is tested on a dataset regarding rose varieties, and the results are compared with those obtained by Siatkowski and Krajewsky (1989).
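To make the objective in the abstract concrete, the following minimal sketch computes the likelihood-ratio statistic for a table, collapses rows according to a grouping, and counts how many groupings an exhaustive search would have to visit. It is only an illustration under stated assumptions: the function names (`g2_statistic`, `collapse_rows`, `stirling2`) and the example counts are hypothetical, the data are not the rose dataset, and the code does not reproduce the authors' genetic algorithm, only the quantity such a search would evaluate.

```python
from functools import lru_cache

import numpy as np


def g2_statistic(table):
    """Likelihood-ratio statistic G2 = 2 * sum(O * ln(O / E)) for a two-way table."""
    observed = np.asarray(table, dtype=float)
    total = observed.sum()
    # Expected counts under independence of rows and columns.
    expected = np.outer(observed.sum(axis=1), observed.sum(axis=0)) / total
    mask = observed > 0  # empty cells contribute zero to the sum
    return 2.0 * np.sum(observed[mask] * np.log(observed[mask] / expected[mask]))


def collapse_rows(table, labels):
    """Sum the rows that share a group label (one output row per distinct label)."""
    observed = np.asarray(table, dtype=float)
    labels = np.asarray(labels)
    return np.vstack([observed[labels == g].sum(axis=0) for g in np.unique(labels)])


@lru_cache(maxsize=None)
def stirling2(n, k):
    """Number of ways to partition n rows into exactly k non-empty groups."""
    if n == k:
        return 1
    if k == 0 or k > n:
        return 0
    return k * stirling2(n - 1, k) + stirling2(n - 1, k - 1)


# Hypothetical 4 x 2 table: rows are categories of X, columns are categories of Y.
example = [[20, 10], [18, 12], [5, 25], [6, 24]]
grouping = [0, 0, 1, 1]  # rows 1-2 form one group, rows 3-4 another

collapsed = collapse_rows(example, grouping)
print("G2 of original table: ", g2_statistic(example))
print("G2 of collapsed table:", g2_statistic(collapsed))
print("Groupings of 20 rows into 5 groups:", stirling2(20, 5))
```

Collapsing rows can never increase G², so a grouping is good when the collapsed value stays close to the original one. The count from `stirling2` shows why evaluating every grouping quickly becomes impractical and a heuristic search is attractive.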

Borroni, C., Piccarreta, R. (2000). Genetic Algorithms for Optimal Grouping of Categories in Two-way Contingency Tables. STATISTICA APPLICATA, 12(4), 435-444.

Genetic Algorithms for Optimal Grouping of Categories in Two-way Contingency Tables

BORRONI, CLAUDIO GIOVANNI;
2000

Journal article - Scientific article
Scientific
genetic algorithms, contingency tables
English
Borroni, C; Piccarreta, R

Use this identifier to cite or link to this document: http://hdl.handle.net/10281/3262