ACM Practice and Experience in Advanced Research Computing (PEARC 2021) Conference, Virtual, July 2021.
Abstract
Fundamental progress towards reliable modern science depends on accurate anomaly detection during application execution. In this paper, we propose a novel approach to this problem: applying Convolutional Neural Network (CNN) classification methods to high-resolution visualizations that capture the end-to-end workflow execution timeline. Subtle differences in the timeline reveal information about the performance of the application and of the infrastructure’s components. We collect 1000 traces of a scientific workflow’s executions, then explore and evaluate the performance of CNNs trained from scratch and of CNNs pre-trained on ImageNet. Our initial results are promising, with over 90% accuracy.