Perego, R., Candelieri, A., Archetti, F., Pau, D. (2022). AutoTinyML for microcontrollers: Dealing with black-box deployability. EXPERT SYSTEMS WITH APPLICATIONS, 207 [10.1016/j.eswa.2022.117876].
AutoTinyML for microcontrollers: Dealing with black-box deployability
Perego R.; Candelieri A.; Archetti F.; Pau D.
2022
Abstract
While many companies currently leverage the Cloud, data centres, and specialized hardware (e.g., GPUs and TPUs) to train highly accurate Machine Learning models, the need to deploy and run these models on tiny devices is emerging as a major challenge, with a massive untapped market. Although Automated Machine Learning and Neural Architecture Search frameworks successfully find accurate models by trying a small number of alternatives, they typically run on large computational platforms and cannot directly deal with deployability, so the resulting accurate model may turn out to be undeployable on a tiny device. To bridge the gap between these two worlds, we present an approach that extends these frameworks with constraints reflecting the limited hardware resources of the tiny device on which the trained model has to run. Experimental results on two benchmark classification tasks and two microcontrollers show that our AutoTinyML framework can efficiently identify models that are both accurate and deployable, possibly accepting a modest reduction in accuracy in exchange for a significant reduction in hardware usage, without applying any model quantization techniques.
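To illustrate the idea of treating deployability as a constraint on model search, the following is a minimal, purely illustrative sketch. All names, the flash budget, and the accuracy proxy are assumptions for the example; the paper's actual framework treats deployability as a black-box constraint learned during constrained optimization, not a closed-form memory formula.

```python
# Hypothetical sketch: deployability-constrained model selection.
# A candidate model is a tuple of dense-layer widths; the "black-box"
# deployability check is reduced here to a flash-memory budget.

def mlp_param_count(layers):
    """Parameter count of a dense MLP: weights plus biases per layer pair."""
    return sum(n_in * n_out + n_out for n_in, n_out in zip(layers, layers[1:]))

def flash_bytes(layers, bytes_per_param=4):
    """Rough flash footprint: float32 weights only (code size ignored)."""
    return mlp_param_count(layers) * bytes_per_param

def is_deployable(layers, flash_limit=64 * 1024):
    """Deployability constraint: the model must fit a 64 KB flash budget."""
    return flash_bytes(layers) <= flash_limit

def search(candidates, accuracy_of, flash_limit=64 * 1024):
    """Return the best-scoring candidate among the deployable ones."""
    feasible = [c for c in candidates if is_deployable(c, flash_limit)]
    return max(feasible, key=accuracy_of) if feasible else None

# Toy usage: with a size-based accuracy proxy, the unconstrained optimum
# (the largest model) is rejected and the largest *deployable* model wins.
candidates = [(16, 8, 2), (16, 64, 2), (16, 512, 2), (16, 4096, 2)]
best = search(candidates, accuracy_of=mlp_param_count)  # -> (16, 512, 2)
```

The point of the sketch is the separation of concerns: the accuracy objective and the deployability constraint are evaluated independently, so an infeasible-but-accurate model is discarded rather than returned, mirroring the gap the abstract describes between AutoML frameworks and tiny-device deployment.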