Methodological Approach to Data-Centric Cloudification of Scientific Iterative Workflows
Author(s)
Publisher
Springer, LNCS 10048
Date issued
December 14, 2016
From page
469
To page
482
Subjects
Cloud Computing · Cloudification · Iterative workflows · Map Reduce · Apache Spark · Hydrology · HydroGeoSphere · Ensemble Kalman filter
Abstract
The computational complexity and the constantly increasing amount of input data for scientific computing models are threatening their scalability. In addition, this is leading towards more data-intensive scientific computing, thus raising the need to combine techniques and infrastructures from the HPC and big data worlds. This paper presents a methodological approach to cloudify generalist iterative scientific workflows, with a focus on improving data locality and preserving performance. To evaluate this methodology, it was applied to a hydrological simulator, EnKF-HGS. The design was implemented using Apache Spark, and assessed in a local cluster and in Amazon Elastic Compute Cloud (EC2) against the original version to evaluate performance and scalability.
Notes
2016
Event name
16th International Conference on Algorithms and Architectures for Parallel Processing, ICA3PP 2016
Location
Granada, Spain
Later version
http://link.springer.com/chapter/10.1007/978-3-319-49583-5_36
Publication type
conference paper
