Lorena A. Barba group

Reproducible workflow on a public cloud for computational fluid dynamics

Reproducible workflow on Microsoft Azure using containers and command-line configuration of the public cloud services.

Submitted: April 18, 2019. Preprint: arXiv:1904.07981
Revised: August 22, 2019
Accepted: September 4, 2019

Abstract

In a new effort to make our research transparent and reproducible by others, we developed a workflow to run and share computational studies on the public cloud Microsoft Azure. It uses Docker containers to create an image of the application software stack. We also adopt several tools that facilitate creating and managing virtual machines on compute nodes and submitting jobs to these nodes. The configuration files for these tools are part of an expanded "reproducibility package" that includes workflow definitions for cloud computing, in addition to input files and instructions. This facilitates re-creating the cloud environment to re-run the computations under the same conditions. Although cloud providers have improved their offerings, many researchers using high-performance computing (HPC) are still skeptical about cloud computing. Thus, we ran benchmarks for tightly coupled applications to confirm that the latest HPC nodes of Microsoft Azure are indeed a viable alternative to traditional on-site HPC clusters. We also show that cloud offerings are now adequate to complete computational fluid dynamics studies with in-house research software that uses parallel computing with GPUs. Finally, we share with the community what we have learned from nearly two years of using Azure cloud to enhance transparency and reproducibility in our computational simulations.

Reproducibility packages

Documentation of the results presented in the paper, manuscript source files (.tex), Dockerfiles, and supplementary materials are shared in the paper's GitHub repository: https://github.com/barbagroup/cloud-repro.

Peer-review reports

Our responses to the peer reviewers' reports were developed in the issue tracker of the manuscript's GitHub repository.

Reference