SciWiT: Top Down Approach to Implementing Scientific Workflows in Transit using In-Network And Near-Network Resources

Project Description

Scientific data analysis is a large-scale process that involves instruments generating data at one site and networks moving the data to a high-performance computing facility where the analysis happens. Programmable networks are capable of reading the data inside data packets, opening the possibility of performing computations while data is still in transit. However, computing within the network is challenging given the limited compute and memory resources of programmable network devices. The SciWiT project will implement a prototype system for performing scientific data analysis using in-network and near-network resources in an optimal way.

Many scientific workflows continuously monitor a phenomenon in search of rare events. This process generates an enormous amount of data, so researchers rely on change detection algorithms to locate the rare-event information. SciWiT is a computing model in which programmable network resources operate on the raw data streamed through them. While the data is still in transit, the network identifies regions of interest in the data stream and provides feedback to the instrument. SciWiT will inform how scientific workflows can leverage network-based in-transit computing and will develop novel in-network and near-network computing mechanisms to operate on scientific data streams.
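As a rough illustration of the kind of change detection described above, the following sketch applies a two-sided CUSUM test to a stream one sample at a time, keeping only two running sums as state. CUSUM is used here purely as a familiar example of a constant-memory detector; the description does not say which algorithms SciWiT uses. The class name, the parameter values, and the synthetic stream are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class CusumDetector:
    """Two-sided CUSUM change detector (illustrative, not SciWiT's method)."""
    target: float      # expected baseline level of the signal (assumed known)
    slack: float       # deviation tolerated before evidence accumulates
    threshold: float   # accumulated evidence needed to flag a change
    pos: float = 0.0   # running evidence for an upward shift
    neg: float = 0.0   # running evidence for a downward shift

    def update(self, x: float) -> bool:
        """Consume one sample; return True if a change is flagged."""
        self.pos = max(0.0, self.pos + (x - self.target) - self.slack)
        self.neg = max(0.0, self.neg + (self.target - x) - self.slack)
        if self.pos > self.threshold or self.neg > self.threshold:
            self.pos = self.neg = 0.0  # reset so later changes can be flagged
            return True
        return False


def regions_of_interest(samples, detector):
    """Return the indices at which the detector flags a change."""
    return [i for i, x in enumerate(samples) if detector.update(x)]


# Synthetic stream: a flat baseline with a short burst in the middle.
stream = [0.0] * 20 + [5.0] * 5 + [0.0] * 20
detector = CusumDetector(target=0.0, slack=0.5, threshold=8.0)
flagged = regions_of_interest(stream, detector)
```

With these hypothetical parameters the burst is flagged at indices 21 and 23, and the baseline samples are never flagged. The per-sample work is a few additions and comparisons over fixed-size state, which is the kind of footprint that could plausibly suit devices with limited compute and memory.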

SciWiT will immediately benefit all scientific applications that rely on change detection. Moreover, this project will enhance the viability of making programmable networks an inherent computing element in the scientific data processing pipeline, effectively making the technology widely available to the scientific community. Similar to cloud environments, in-transit computing environments distributed across campuses will allow the scientific computing community to leverage the benefits of high-speed programmable networks. Widespread adoption of the developed solutions, together with the downstream research enabled by the findings of this project, can dramatically accelerate the scientific discovery process with only a fractional increase in resources, thus benefiting the wider public.

Main Project Website