PosEiDon: Platform for Explainable Distributed Infrastructure

Project Description

PosEiDon aims to advance the understanding of how simulation and machine learning (ML) methodologies can be harnessed and amplified to improve DOE’s computational and data science. PosEiDon will provide an integrated platform that helps facility operators and scientists improve the end-to-end science workflow by (1) predicting the performance of complex workflows; (2) detecting and classifying infrastructure and workflow anomalies and “explaining” the sources of these anomalies; and (3) suggesting performance optimizations.

Partners: University of Southern California (Lead), Lawrence Berkeley National Laboratory, Argonne National Laboratory, RENCI

Funding: US Department of Energy