Lessons Learned from Applying Big Data Paradigms to a Large Scale Scientific Workflow
Author(s)
Publisher
CEUR-WS.org
Publication date
2016-11-14
From page
54
To page
58
Abstract
The growing volume of data associated with the execution of scientific workflows has raised awareness of their shift toward parallel, data-intensive problems. In this paper, we report our experience combining traditional high-performance computing and grid-based approaches for scientific workflows with Big Data analytics paradigms. Our goal was to assess and discuss the suitability of such data-intensive mechanisms for production-ready workflows, especially in terms of scalability, focusing on a key element of the Big Data ecosystem: the data-centric programming model. To that end, we reproduced the functionality of an MPI-based iterative workflow from the hydrology domain, EnKF-HGS, using the Spark data analysis framework. We conducted experiments on a local cluster and relied on our results to discuss promising directions for further research.
Notes
, 2016
Event name
11th Workshop on Workflows in Support of Large-Scale Science, Supercomputing
Location
Salt Lake City
Identifiers
Other version
http://ceur-ws.org/Vol-1800/short1.pdf
Publication type
conference paper