Baresi, L., Denaro, G., Quattrocchi, G. (2019). Big-data applications as self-adaptive systems of systems. In Proceedings of the 2019 IEEE 30th International Symposium on Software Reliability Engineering Workshops (ISSREW 2019), pp. 155-162. IEEE. DOI: 10.1109/ISSREW.2019.00066.
Big-data applications as self-adaptive systems of systems
Baresi L.; Denaro G.; Quattrocchi G.
2019
Abstract
Virtualization technologies have enabled a new way of thinking about computing resources, and cloud computing frameworks offer many pay-per-use solutions for renting these resources. Conventional physical servers had to be acquired, provisioned, and configured beforehand; virtual resources can be allocated on demand, and changes can be managed quickly. Deploying systems on virtualized resources allows one to allocate resources according to the actual workload and the KPIs of interest, but it requires that resource management be part of the system itself. Traditional application components must be augmented with probes and actuators to sense the application behavior and provision resources accordingly. Big data applications are a prominent example of these modern systems, and the paper discusses dynaSpark, the authors' extension of Spark standalone (a well-known framework widely used for parallel processing and big data applications) that augments it with resource management capabilities. It also introduces the key problems brought in by this integration and by batch applications in particular, and identifies additional aspects that still need to be taken into account and that would lead to a better solution.
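The probes-and-actuators scheme mentioned in the abstract can be pictured as a monitor-plan-execute loop that repeatedly senses job progress and adjusts the allocation of virtualized resources toward a KPI. The Python sketch below is a minimal, purely illustrative example under assumed interfaces: the Probe and Actuator classes, the deadline-based KPI, and the core-scaling heuristic are hypothetical and do not reflect dynaSpark's actual design or Spark's APIs.

import time
from dataclasses import dataclass

# Hypothetical interfaces, for illustration only; not dynaSpark's or Spark's API.

@dataclass
class Metrics:
    progress: float    # fraction of the batch job completed (0..1)
    elapsed_s: float   # seconds since the job started
    deadline_s: float  # KPI: target completion time in seconds

class Probe:
    """Senses application behavior (e.g., stage progress, executor load)."""
    def read(self) -> Metrics:
        raise NotImplementedError

class Actuator:
    """Provisions or releases virtualized resources (e.g., executor cores)."""
    def scale_to(self, cores: int) -> None:
        raise NotImplementedError

def control_step(m: Metrics, current_cores: int, max_cores: int) -> int:
    """Plan: choose a core allocation so the job tracks its deadline."""
    if m.progress <= 0 or m.elapsed_s <= 0:
        return current_cores
    # Estimate the cores needed to finish the remaining work by the deadline,
    # assuming throughput roughly proportional to the allocated cores.
    remaining_time = max(m.deadline_s - m.elapsed_s, 1e-3)
    rate_per_core = m.progress / (m.elapsed_s * current_cores)
    needed = (1.0 - m.progress) / (rate_per_core * remaining_time)
    return min(max_cores, max(1, round(needed)))

def run_loop(probe: Probe, actuator: Actuator, max_cores: int, period_s: float = 5.0) -> None:
    """Monitor the job periodically and rescale resources until it completes."""
    cores = 1
    while True:
        m = probe.read()                            # monitor
        if m.progress >= 1.0:
            break
        cores = control_step(m, cores, max_cores)   # analyze and plan
        actuator.scale_to(cores)                    # execute
        time.sleep(period_s)

Given concrete Probe and Actuator implementations bound to a cluster manager, run_loop would keep the allocation proportional to the remaining work; the proportional-throughput assumption is the simplest possible model and stands in for the more refined policies a system like dynaSpark would need.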