Automating Edge-to-cloud Workflows for Science: Traversing the Edge-to-cloud Continuum with Pegasus

2nd ACM/IEEE International Workshop on Cloud-to-Things continuum: towards the convergence of IoT, Edge and Cloud Computing (Cloud2Things 2022), Taormina, Messina, Italy, May 2022. (Best Paper Award)

Abstract

High-performance computing (HPC) systems, and more recently clouds, have been the standard execution platforms for large-scale scientific workflows. The development of workflow management system (WMS) software, which orchestrates the execution of such workflows, has focused on enabling automated and efficient execution on these traditional platforms. The recent emergence of the Internet of Things (IoT) and of applications that demand low latency, big data processing capabilities, and increased privacy constraints has heralded a new computing paradigm, namely edge computing. To enable scientists to easily incorporate edge resources into their computational workflows, WMSs need to extend their capabilities both in scheduling jobs onto these new resources and in moving data to and from the edge. In this paper, we describe how we extended the Pegasus Workflow Management System to support edge-to-cloud workflows in an automated fashion. We discuss how Pegasus and HTCondor (its job scheduler) work together to enable this automation. We use HTCondor to form heterogeneous pools of compute resources, and Pegasus to plan the workflow onto these resources and to manage containers and data movement for executing workflows in hybrid edge-cloud environments. We then show how Pegasus can be used to evaluate the execution of workflows running in edge-only, cloud-only, and edge-cloud hybrid environments. Using the Chameleon Cloud testbed to set up and configure an edge-cloud environment, we use Pegasus to benchmark the executions of one synthetic workflow and two production workflows, CASA-Wind and the Ocean Observatories Initiative Orcasound workflow, all of which derive their data from edge devices. We present the performance impact that the job and data placement strategies employed by Pegasus have on workflow runs when Pegasus is configured to run in each of these three execution environments.
Results show that the synthetic workflow performs best in an edge-only environment, while the CASA-Wind and Orcasound workflows see significant improvements in overall makespan when run in a cloud-only environment. The results demonstrate that Pegasus can be used to automate edge-to-cloud science workflows, and that the workflow provenance data collection capabilities of the Pegasus monitoring daemon enable computer scientists to conduct edge-to-cloud research.