Optimizing on-demand GPUs in the Cloud for Deep Learning Applications Training

Jahani A.; Lattuada M.; Ciavotta M.; Ardagna D.; Amaldi E.; Zhang L.
2019

Abstract

Deep learning (DL) methods have recently gained popularity and are now used in commonplace applications such as voice and face recognition, among others. Despite the growing popularity of DL and the associated hardware acceleration techniques, GPU-based systems still come at a very high cost. Moreover, while the cloud represents a cost-effective and flexible solution, in large settings operations costs can be further optimized by carefully managing resources and fostering their sharing. This work addresses the online joint problem of capacity planning of virtual machines (VMs) and scheduling of DL training jobs, and proposes a Mixed Integer Linear Programming (MILP) formulation. In particular, DL jobs are assumed to feature a deadline, while multiple VM types, each possibly equipped with multiple GPUs, are available from a cloud provider catalog. Our solution optimizes operations costs by (i) right-sizing the VM capacities, (ii) partitioning the set of GPUs among multiple concurrent jobs running on the same VM, and (iii) determining a deadline-aware job schedule. Our approach is evaluated using an ad-hoc simulator and a prototype environment, and compared against first-principle approaches, yielding a cost reduction of 45-80%.
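This record does not include the paper's actual MILP formulation. Purely as an illustration of the kind of deadline-aware, cost-minimizing model the abstract describes, the sketch below (using the PuLP library) assigns each training job to a single VM type from a catalog so that its deadline is met at minimum rental cost. All data (jobs, vm_types), the linear GPU speed-up assumption, and the hourly pricing are hypothetical, and the paper's key features, GPU partitioning among concurrent jobs on the same VM and online scheduling, are deliberately omitted for brevity.

```python
import pulp

# Hypothetical instance data (illustrative only, not taken from the paper).
# gpu_hours: total GPU-hours of work; deadline: hours until the job must finish.
jobs = {"j1": {"deadline": 6, "gpu_hours": 8},
        "j2": {"deadline": 4, "gpu_hours": 4}}
# A toy cloud catalog: VM types with a GPU count and an hourly price.
vm_types = {"small": {"gpus": 1, "cost_per_hour": 1.0},
            "large": {"gpus": 4, "cost_per_hour": 3.5}}

model = pulp.LpProblem("gpu_capacity_planning", pulp.LpMinimize)

# x[j, v] = 1 if job j is executed on a VM of type v.
x = pulp.LpVariable.dicts(
    "x", [(j, v) for j in jobs for v in vm_types], cat="Binary")

# Running time of job j on VM type v, assuming linear speed-up with GPU count.
def runtime(j, v):
    return jobs[j]["gpu_hours"] / vm_types[v]["gpus"]

# Each job is assigned to exactly one VM type.
for j in jobs:
    model += pulp.lpSum(x[j, v] for v in vm_types) == 1

# Deadline awareness: forbid assignments that cannot finish in time.
for j in jobs:
    for v in vm_types:
        if runtime(j, v) > jobs[j]["deadline"]:
            model += x[j, v] == 0

# Objective: total rental cost = runtime * hourly price of the chosen VM.
model += pulp.lpSum(runtime(j, v) * vm_types[v]["cost_per_hour"] * x[j, v]
                    for j in jobs for v in vm_types)

model.solve(pulp.PULP_CBC_CMD(msg=False))
for j in jobs:
    for v in vm_types:
        if pulp.value(x[j, v]) > 0.5:
            print(f"{j} -> {v}, runtime {runtime(j, v):.1f}h")
```

In this toy instance the solver picks the large VM for both jobs: j1 cannot meet its deadline on a single GPU, and j2 is cheaper on four GPUs (1 h at 3.5/h) than on one (4 h at 1.0/h), which hints at why joint right-sizing and scheduling can beat fixed-capacity policies.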
Type: paper
Keywords: Cloud; on-demand GPUs; Optimization models; Scheduling
Language: English
Conference: 4th International Conference on Computing, Communications and Security, ICCCS 2019
Conference year: 2019
Proceedings: 2019 4th International Conference on Computing, Communications and Security, ICCCS 2019
ISBN: 978-1-7281-0875-9
Publication year: 2019
Pages: 1-8
Article number: 8888151
Jahani, A., Lattuada, M., Ciavotta, M., Ardagna, D., Amaldi, E., Zhang, L. (2019). Optimizing on-demand GPUs in the Cloud for Deep Learning Applications Training. In 2019 4th International Conference on Computing, Communications and Security, ICCCS 2019 (pp.1-8). Institute of Electrical and Electronics Engineers Inc. [10.1109/CCCS.2019.8888151].
Files in this item:
There are no files associated with this item.

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10281/327734
Citations
  • Scopus: 7
  • Web of Science (ISI): N/A