We are entering a Big Data world. Many sectors of our economy are now guided by data-driven decision processes. Big Data and business intelligence applications are facilitated by the MapReduce programming model, while at the infrastructure layer cloud computing provides flexible and cost-effective solutions for allocating large clusters on demand. Capacity allocation in such systems is a key challenge: it must deliver adequate performance for MapReduce jobs while minimizing cloud resource costs. The contribution of this paper is twofold: (i) we provide new upper and lower bounds for MapReduce job execution time in shared Hadoop clusters; (ii) we formulate a linear programming model that minimizes cloud resource costs and job rejection penalties for the execution of jobs of multiple classes with (soft) deadline guarantees. Simulation results show that the execution time of MapReduce jobs falls within 14% of our upper bound on average. Moreover, numerical analyses demonstrate that our method determines the global optimal solution of the linear problem for systems including up to 1,000 user classes in less than 0.5 seconds.

Malekimajd, M., Ardagna, D., Ciavotta, M., Rizzi, A., Passacantando, M. (2015). Optimal map reduce job capacity allocation in cloud systems. ACM SIGMETRICS PERFORMANCE EVALUATION REVIEW, 42(4), 51-61 [10.1145/2788402.2788410].

Optimal map reduce job capacity allocation in cloud systems

Ciavotta, M; Passacantando, M
2015

Journal article - Scientific article
Cloud systems; Big Data; Capacity allocation
English
Pages: 51-61 (11 pages)
Malekimajd, M; Ardagna, D; Ciavotta, M; Rizzi, A; Passacantando, M


Use this identifier to cite or link to this document: https://hdl.handle.net/10281/186601