Deterministic Finite Wheeler Automata are a natural generalisation to regular languages of the theory of compressed data structures originated by the introduction of the Burrows-Wheeler transform. Indeed, if we can find a Wheeler automaton recognizing a given language L, such automaton can be used to design time and space efficient algorithms for representing and searching L. In this paper we introduce an alternative representation of Deterministic Wheeler Automata by showing that a natural map between strings and rational numbers in Qr0, 1q can be extended to represent the automaton’s states as intervals in Qr0, 1q. Using this representation it emerges a natural relationship between automata properties and some properties of real numbers. In addition, such representation enables us to formulate problems related to automata in a numerical setting. Although at the moment the numerical approach does not lead to time efficient algorithms, we believe this new perspective deserves further consideration. As a further demonstration of the convenience of this new representation, we use it to provide a simple proof of an unexpected result on regular languages. More precisely, we compare the size of the smallest Wheeler automaton recognizing a given language L with respect to the size of the smallest automaton, possibly non-Wheeler, recognizing the same language. We show settings in which there can be an exponential gap between the two sizes, and we discuss the implications of this result on the problem of representing regular languages.
Manzini, G., Policriti, A., Prezza, N., Riccardi, B. (2024). The Rational Construction of a Wheeler DFA. In 35th Annual Symposium on Combinatorial Pattern Matching, CPM 2024 (pp.1). Schloss Dagstuhl- Leibniz-Zentrum fur Informatik GmbH, Dagstuhl Publishing [10.4230/LIPIcs.CPM.2024.23].
The Rational Construction of a Wheeler DFA
Manzini G.;Riccardi B.
2024
Abstract
Deterministic Finite Wheeler Automata are a natural generalisation to regular languages of the theory of compressed data structures originated by the introduction of the Burrows-Wheeler transform. Indeed, if we can find a Wheeler automaton recognizing a given language L, such automaton can be used to design time and space efficient algorithms for representing and searching L. In this paper we introduce an alternative representation of Deterministic Wheeler Automata by showing that a natural map between strings and rational numbers in Qr0, 1q can be extended to represent the automaton’s states as intervals in Qr0, 1q. Using this representation it emerges a natural relationship between automata properties and some properties of real numbers. In addition, such representation enables us to formulate problems related to automata in a numerical setting. Although at the moment the numerical approach does not lead to time efficient algorithms, we believe this new perspective deserves further consideration. As a further demonstration of the convenience of this new representation, we use it to provide a simple proof of an unexpected result on regular languages. More precisely, we compare the size of the smallest Wheeler automaton recognizing a given language L with respect to the size of the smallest automaton, possibly non-Wheeler, recognizing the same language. We show settings in which there can be an exponential gap between the two sizes, and we discuss the implications of this result on the problem of representing regular languages.File | Dimensione | Formato | |
---|---|---|---|
Manzini-2024-LIPIcs-VoR.pdf
accesso aperto
Tipologia di allegato:
Publisher’s Version (Version of Record, VoR)
Licenza:
Creative Commons
Dimensione
883.62 kB
Formato
Adobe PDF
|
883.62 kB | Adobe PDF | Visualizza/Apri |
I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.