Evaluating the effectiveness of replication for tail-tolerance
Computing clusters (CC) are a cost-effective high-performance platform for computation-intensive scientific and engineering applications. A key challenge in managing CCs is to consistently achieve low response times. In particular, tail-tolerant methods aim to keep the tail of the response-time dist...
- Authors:
- Qiu, Zhan
- Pérez, Juan F.
- Resource type:
- Book part
- Publication date:
- 2015
- Institution:
- Universidad del Rosario
- Repository:
- Repositorio EdocUR - U. Rosario
- Language:
- eng
- OAI Identifier:
- oai:repository.urosario.edu.co:10336/28509
- Online access:
- https://doi.org/10.1109/CCGrid.2015.22
- https://repository.urosario.edu.co/handle/10336/28509
- Keywords:
- Computing clusters
- Concurrent replication
- Computer system
- Rights
- License
- Restricted (access for specific groups)
| id | EDOCUR2_1257504e5b0d7b2cb077d09c7afa5c44 |
|---|---|
| oai_identifier_str | oai:repository.urosario.edu.co:10336/28509 |
| network_acronym_str | EDOCUR2 |
| network_name_str | Repositorio EdocUR - U. Rosario |
| repository_id_str | |
| spelling | e1d48e1f-f195-4e4b-9c1b-f62c730d04e5-180035202600 · 2020-08-28T15:49:15Z · 2020-08-28T15:49:15Z · 2015-05 · Computing clusters (CC) are a cost-effective high-performance platform for computation-intensive scientific and engineering applications. A key challenge in managing CCs is to consistently achieve low response times. In particular, tail-tolerant methods aim to keep the tail of the response-time distribution short. In this paper we explore concurrent replication with canceling, a tail-tolerant approach that involves processing requests and their replicas concurrently, retrieving the result from the first replica that completes, and canceling all other replicas. We propose a stochastic model that considers any number of replicas, general processing and inter-arrival times, and computes the response time distribution. We show that replication can be very effective in keeping the response-time tail short, but these benefits highly depend on the processing-time distribution, as well as on the CC utilization and the statistical characteristics of the arrival process. We also exploit the model to support the selection of the optimal number of replicas, and a resource provisioning strategy that meets service-level objectives on the response-time percentiles. · application/pdf · https://doi.org/10.1109/CCGrid.2015.22 · ISBN: 978-1-4799-8006-2 · https://repository.urosario.edu.co/handle/10336/28509 · eng · IEEE · 452 · 443 · CCGrid '15: 14th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing Shenzhen China (May, 2015) · Proceedings of the 15th IEEE/ACM International Symposium on Cluster, Cloud, and Grid Computing, ISBN: 978-1-4799-8006-2 (May 2015); pp. 443-452 · https://dl.acm.org/doi/abs/10.1109/CCGrid.2015.22 · Restringido (Acceso a grupos específicos) · http://purl.org/coar/access_right/c_16ec · CCGrid '15: 14th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing Shenzhen China (May, 2015) · instname:Universidad del Rosario · reponame:Repositorio Institucional EdocUR · Computing clusters · Concurrent replication · Computer system · Evaluating the effectiveness of replication for tail-tolerance · Evaluar la efectividad de la replicación para la tolerancia a la cola · bookPart · Parte de libro · http://purl.org/coar/version/c_970fb48d4fbd8a85 · http://purl.org/coar/resource_type/c_3248 · Qiu, Zhan · Pérez, Juan F. · 10336/28509 · oai:repository.urosario.edu.co:10336/28509 · 2021-06-03 00:49:51.087 · https://repository.urosario.edu.co · Repositorio institucional EdocUR · edocur@urosario.edu.co |
| dc.title.spa.fl_str_mv | Evaluating the effectiveness of replication for tail-tolerance |
| dc.title.TranslatedTitle.spa.fl_str_mv | Evaluar la efectividad de la replicación para la tolerancia a la cola |
| title | Evaluating the effectiveness of replication for tail-tolerance |
| spellingShingle | Evaluating the effectiveness of replication for tail-tolerance Computing clusters Concurrent replication Computer system |
| title_short | Evaluating the effectiveness of replication for tail-tolerance |
| title_full | Evaluating the effectiveness of replication for tail-tolerance |
| title_fullStr | Evaluating the effectiveness of replication for tail-tolerance |
| title_full_unstemmed | Evaluating the effectiveness of replication for tail-tolerance |
| title_sort | Evaluating the effectiveness of replication for tail-tolerance |
| dc.subject.keyword.spa.fl_str_mv | Computing clusters; Concurrent replication; Computer system |
| topic | Computing clusters; Concurrent replication; Computer system |
| description | Computing clusters (CC) are a cost-effective high-performance platform for computation-intensive scientific and engineering applications. A key challenge in managing CCs is to consistently achieve low response times. In particular, tail-tolerant methods aim to keep the tail of the response-time distribution short. In this paper we explore concurrent replication with canceling, a tail-tolerant approach that involves processing requests and their replicas concurrently, retrieving the result from the first replica that completes, and canceling all other replicas. We propose a stochastic model that considers any number of replicas, general processing and inter-arrival times, and computes the response time distribution. We show that replication can be very effective in keeping the response-time tail short, but these benefits highly depend on the processing-time distribution, as well as on the CC utilization and the statistical characteristics of the arrival process. We also exploit the model to support the selection of the optimal number of replicas, and a resource provisioning strategy that meets service-level objectives on the response-time percentiles. |
| publishDate | 2015 |
| dc.date.created.spa.fl_str_mv | 2015-05 |
| dc.date.accessioned.none.fl_str_mv | 2020-08-28T15:49:15Z |
| dc.date.available.none.fl_str_mv | 2020-08-28T15:49:15Z |
| dc.type.eng.fl_str_mv | bookPart |
| dc.type.coarversion.fl_str_mv | http://purl.org/coar/version/c_970fb48d4fbd8a85 |
| dc.type.coar.fl_str_mv | http://purl.org/coar/resource_type/c_3248 |
| dc.type.spa.spa.fl_str_mv | Parte de libro |
| dc.identifier.doi.none.fl_str_mv | https://doi.org/10.1109/CCGrid.2015.22 |
| dc.identifier.issn.none.fl_str_mv | ISBN: 978-1-4799-8006-2 |
| dc.identifier.uri.none.fl_str_mv | https://repository.urosario.edu.co/handle/10336/28509 |
| url | https://doi.org/10.1109/CCGrid.2015.22 https://repository.urosario.edu.co/handle/10336/28509 |
| identifier_str_mv | ISBN: 978-1-4799-8006-2 |
| dc.language.iso.spa.fl_str_mv | eng |
| language | eng |
| dc.relation.citationEndPage.none.fl_str_mv | 452 |
| dc.relation.citationStartPage.none.fl_str_mv | 443 |
| dc.relation.citationTitle.none.fl_str_mv | CCGrid '15: 14th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing Shenzhen China (May, 2015) |
| dc.relation.ispartof.spa.fl_str_mv | Proceedings of the 15th IEEE/ACM International Symposium on Cluster, Cloud, and Grid Computing, ISBN: 978-1-4799-8006-2 (May 2015); pp. 443-452 |
| dc.relation.uri.spa.fl_str_mv | https://dl.acm.org/doi/abs/10.1109/CCGrid.2015.22 |
| dc.rights.coar.fl_str_mv | http://purl.org/coar/access_right/c_16ec |
| dc.rights.acceso.spa.fl_str_mv | Restringido (Acceso a grupos específicos) |
| rights_invalid_str_mv | Restringido (Acceso a grupos específicos) http://purl.org/coar/access_right/c_16ec |
| dc.format.mimetype.none.fl_str_mv | application/pdf |
| dc.publisher.spa.fl_str_mv | IEEE |
| dc.source.spa.fl_str_mv | CCGrid '15: 14th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing Shenzhen China (May, 2015) |
| institution | Universidad del Rosario |
| dc.source.instname.none.fl_str_mv | instname:Universidad del Rosario |
| dc.source.reponame.none.fl_str_mv | reponame:Repositorio Institucional EdocUR |
| repository.name.fl_str_mv | Repositorio institucional EdocUR |
| repository.mail.fl_str_mv | edocur@urosario.edu.co |
| _version_ | 1818106473310322688 |
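
The abstract describes concurrent replication with canceling: a request and its replicas run concurrently, the result is taken from the first replica that completes, and all other replicas are canceled. The following is a minimal Python sketch of that mechanism only, not the paper's stochastic model or implementation. The names `replica` and `replicate_with_cancel` and the fixed delays are hypothetical, chosen for illustration; processing time is mimicked with `asyncio.sleep`.

```python
import asyncio
from typing import List, Tuple


async def replica(replica_id: int, delay: float, cancelled: List[int]) -> int:
    """Hypothetical replica: sleeps for `delay` seconds to mimic processing."""
    try:
        await asyncio.sleep(delay)
        return replica_id
    except asyncio.CancelledError:
        # Record the cancellation, then let it propagate as asyncio expects.
        cancelled.append(replica_id)
        raise


async def replicate_with_cancel(delays: List[float]) -> Tuple[int, List[int]]:
    """Run one replica per delay, keep the first result, cancel the rest."""
    cancelled: List[int] = []
    tasks = [
        asyncio.create_task(replica(i, d, cancelled))
        for i, d in enumerate(delays)
    ]
    # Block until at least one replica has finished.
    done, pending = await asyncio.wait(tasks, return_when=asyncio.FIRST_COMPLETED)
    for task in pending:
        task.cancel()
    # Wait for the cancelled replicas to unwind before returning.
    await asyncio.gather(*pending, return_exceptions=True)
    # If several replicas finished at once, report the lowest id (assumption).
    winner = min(task.result() for task in done)
    return winner, sorted(cancelled)


if __name__ == "__main__":
    winner, cancelled = asyncio.run(replicate_with_cancel([0.2, 0.01, 0.3]))
    print(f"first replica to complete: {winner}; cancelled: {cancelled}")
```

With the illustrative delays above, the response time seen by the caller is that of the fastest replica, which is the effect the abstract credits with shortening the response-time tail. The sketch ignores queueing, cluster utilization, and the arrival process, all of which the abstract says determine how much benefit replication actually delivers.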