On the latency-accuracy tradeoff in approximate MapReduce jobs

Authors:
Resource type:
Publication date:
2017
Institution:
Universidad del Rosario
Repository:
Repositorio EdocUR - U. Rosario
Language:
eng
OAI Identifier:
oai:repository.urosario.edu.co:10336/22862
Online access:
https://doi.org/10.1109/INFOCOM.2017.8057038
https://repository.urosario.edu.co/handle/10336/22862
Keywords:
Approximation theory
Economic and social effects
Stochastic models
Stochastic systems
Data analytics
Data platform
Job scheduling policies
Map-reduce
Matrix analytic methods
Optimal approximation
Performance gain
Wide spectrum
Big data
Rights
License
http://purl.org/coar/access_right/c_abf2
Description
Summary: To ensure the scalability of big data analytics, approximate MapReduce platforms have emerged that explicitly trade accuracy for latency. A key step in determining optimal approximation levels is capturing the latency of big data jobs, which has long been deemed challenging because of the complex dependencies among data inputs and map/reduce tasks. In this paper, we use matrix analytic methods to derive stochastic models that can predict a wide spectrum of latency metrics, e.g., averages, tails, and distributions, for approximate MapReduce jobs subject to input sampling and task dropping strategies. In addition to capturing the dependency among waves of map/reduce tasks, our models incorporate two job scheduling policies, namely exclusive and overlapping, and two task dropping strategies, namely early and straggler, enabling us to realistically evaluate the potential performance gains of approximate computing. Our numerical analysis shows that the proposed models can guide big data platforms in determining the optimal approximation strategies and degrees of approximation. © 2017 IEEE.