Research Workflows - Towards Reproducible Science via Detailed Provenance Tracking in Open Science Chain

Title: Research Workflows - Towards Reproducible Science via Detailed Provenance Tracking in Open Science Chain
Publication Type: Conference Paper
Year of Publication: 2020
Authors: Nandigam V, Lin K, Shantharam M, Sakai S, Sivagnanam S
Conference Name: Practice and Experience in Advanced Research Computing
Publisher: Association for Computing Machinery
Conference Location: New York, NY, USA
ISBN Number: 9781450366892
Keywords: Blockchain, Data Integrity, Data Provenance, Data Reproducibility

Scientific research has always struggled with reproducibility problems, caused in part by low data-sharing rates and a lack of provenance. The credibility of a research hypothesis comes into question when results cannot be replicated. While the growing amount of data and the widespread use of computational code in research have been driving scientific breakthroughs, references to them in scientific publications are insufficient from a reproducibility perspective. The NSF-funded Open Science Chain (OSC) is a cyberinfrastructure platform built using blockchain technologies that enables researchers to efficiently validate the authenticity of published data, track its provenance, and view its lineage. It does this by using blockchain technology to securely store metadata and verification information about research data and to track changes to that data in an auditable manner. In this poster we introduce the concept of “research workflows”, a tool that allows researchers to create a detailed workflow of their scientific experiment, linking the specific data and computational code used in their published results so that the analysis can be independently verified. OSC research workflows will allow for detailed provenance tracking both within the OSC platform and in external repositories such as GitHub, thereby enabling transparency and fostering trust in the scientific process.
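The general idea of linking data and code to a verifiable workflow can be illustrated with a minimal Python sketch. It is not OSC's implementation or API: the record layout, the function names (`make_step`, `verify_workflow`), and the all-zero genesis value are assumptions made for illustration only. Each step stores SHA-256 digests of its data and code plus the hash of the previous step, so a verifier holding the same artifacts can recompute the chain and detect any change.

```python
import hashlib
import json

# Hypothetical starting value for the chain (an assumption, not an OSC constant).
GENESIS = "0" * 64


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of raw bytes as a hex string."""
    return hashlib.sha256(data).hexdigest()


def make_step(name: str, data: bytes, code: bytes, prev_hash: str) -> dict:
    """Build one workflow step that records digests of its data and code.

    The step is chained to its predecessor through prev_hash, so altering
    an earlier step changes the hash of every later step.
    """
    record = {
        "name": name,
        "data_sha256": sha256_hex(data),
        "code_sha256": sha256_hex(code),
        "prev_hash": prev_hash,
    }
    # Hash a canonical (sorted-key) JSON encoding of the record.
    payload = json.dumps(record, sort_keys=True).encode("utf-8")
    record["step_hash"] = sha256_hex(payload)
    return record


def verify_workflow(steps: list, artifacts: dict) -> bool:
    """Recompute every digest and chain link from the verifier's artifacts.

    artifacts maps a step name to the (data, code) bytes the verifier holds.
    """
    prev_hash = GENESIS
    for step in steps:
        data, code = artifacts[step["name"]]
        expected = make_step(step["name"], data, code, prev_hash)
        if expected != step:
            return False
        prev_hash = step["step_hash"]
    return True


# Illustrative artifacts: a tiny dataset and two scripts.
raw = b"sample,value\na,1\nb,2\n"
load_script = b"# load csv\n"
analysis_script = b"print(sum([1, 2]))\n"

step1 = make_step("ingest", raw, load_script, GENESIS)
step2 = make_step("analyze", raw, analysis_script, step1["step_hash"])
workflow = [step1, step2]
artifacts = {"ingest": (raw, load_script), "analyze": (raw, analysis_script)}
```

Calling `verify_workflow(workflow, artifacts)` succeeds only when every data and code digest matches the recorded step; changing a single byte of any artifact makes verification fail. In a blockchain-backed system, only the small records would be stored on the ledger, while the data and code themselves stay in their repositories.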