ACM Third International Workshop on Network-Aware Data Management (NDM 2013), 2:1--2:10, 2013
This paper presents a performance evaluation of scientific workflows on networked cloud systems with particular emphasis on evaluating the effect of provisioned network bandwidth on application I/O performance. The experiments were run on ExoGENI, a widely distributed networked infrastructure as a service (NIaaS) testbed. ExoGENI orchestrates a federation of independent cloud sites located around the world along with backbone circuit providers. The evaluation used a representative data-intensive scientific workflow application called Montage. The application was deployed on a virtualized HTCondor environment provisioned dynamically from the ExoGENI networked cloud testbed, and managed by the Pegasus workflow manager.
The results of our experiments show the effect of modifying provisioned network bandwidth on disk I/O throughput and workflow execution time. The marginal benefit as perceived by the workflow reduces as the network bandwidth allocation increases to a point where disk I/O saturates. There is little or no benefit from increasing network bandwidth beyond this inflection point. The results also underline the importance of network and I/O performance isolation for predictable application performance, and are applicable for general data-intensive workloads. Insights from this work will also be useful for real-time monitoring, application steering and infrastructure planning for data-intensive workloads on networked cloud platforms.