Lorena A. Barba group

Reproducible and replicable CFD: it's harder than you think

Choosing an incorrect boundary condition with IBAMR resulted in spurious blockage of vortices at the outlet. Thanks to the support received in the code users' forum, this and other problems were resolved and we succeeded in replicating previous findings.

Choosing an incorrect boundary condition with IBAMR resulted in spurious blockage of vortices at the outlet. Thanks to the support received in the code users' forum, this and other problems were resolved and we succeeded in replicating previous findings.

Reproducible and replicable CFD: it's harder than you think.

Submitted: 13 May 2016. Preprint arXiv:1605.04339
Accepted: 13 Oct. 2016, Computing in Science and Engineering.

Overview

We do our best to accomplish reproducible research and have for years worked to improve our practices to achieve this goal. Barba made a pledge in 2012, the “Reproducibility PI Manifesto,” according to which all our research code is under version control and open source, all our data is open, and we publish open pre-prints of all our publications. For the main results in a paper, we prepare file bundles with input and output data, plotting scripts and figure, and deposit them in an open repository. The core of this pledge is releasing the code, the data, and the analysis/visualization scripts. Already this can be time consuming and demanding. Yet, we have come to consider these steps the most basic level of reproducible research.

Recently, on undertaking a full replication study of a previous publication by our own research group (Krishnan et al., 2014), we came to realize how much more rigor is required to achieve this standard, in the context of our science domain (computational fluid dynamics of unsteady flows).

The lessons learned from attempting to replicate our own results were sobering. First, the vigilant practice of reproducible research should go beyond the open sharing of data and code. We now use Python scripts to automate our workflow completely—all scripts are version-controlled, code-documented and accept command-line arguments (to avoid code modification from users). We do not use GUIs for any visualization package. Instead, we call the Python interpreter included in the visualization tools to make all our plots. Throughout, we use Jupyter Notebooks and Markdown files to document partial project advances. Second, certain application scenarios pose special challenges. Our case: unsteady flows dominated by vortex dynamics, is a particularly tough application for reproducibility. Third, extra care is needed when using external code libraries (many scientists rely on linear solvers, for example): they may introduce uncertainties that we can’t control.

In this paper, we distilled the lessons learned from a full replication study of a published work completed by our own group. We used four distinct code bases to study the same problem in aerodynamics: the appearance of a lift-enhancement mechanism at a certain angle of attack for the geometry of a flying snake's body cross-section. Each case brought to the surface challenges that can lead to incorrect results from fluid solvers. Mesh generation, boundary conditions, differences in the external libraries used for solving linear systems: all these aspects can lead to a simulation going awry. We give concrete examples, and end with recommended standards of evidence for computational fluid dynamics, which may apply in other areas of computational physics.

References

  • Lorena et.al.

    This is indeed a very important paradigm for researchers to come in future.
    Congratulations for driving this and bringing out the arxiv preprint.

    All the best ahead!

    Best,
    Vijay

  • Boyce Griffith

    A quick comment on the "back flow divergence" that you observed in IBAMR, which you have chosen to highlight in this post. This is a pretty well-known issue with open boundary conditions — there is a shear instability that requires additional physical stabilization. Because this is a physical instability, not merely a numerical one, it appears in many different kinds of discretization methods. There is canned code for stabilizing it in IBAMR that has been available for several years [1]. We also discussed this issue with Oliver when he asked about it on the IBAMR Google Group [2]. I don't know of a fool-proof way to avoid this kind of stuff — if I did, it would be the default in the code! By the way, the boundary conditions are not zero-gradient conditions, but rather are zero-traction conditions, as described in detail here [3].

    [1] https://github.com/IBAMR/IBAMR/tree/master/examples/IB/explicit/ex2
    [2] https://groups.google.com/forum/#!topic/ibamr-users/EEHzVxMyLWw
    [3] B.E. Griffith. An accurate and efficient method for the incompressible Navier-Stokes equations using the projection method as a preconditioner. J Comput Phys, 228(20):7565-7595, 2009

    • Thanks for the comment and additional details on this issue. The final simulations with IBAMR did use the stabilized outflow—thanks to the recommendation posted on the Google Group—and the results do in the end reproduce the previous findings.

      • Boyce Griffith

        And yet, the figure makes it appear that this is a fundamental problem with the library.

        • Hmmm... that is really not the point. I'm happy to modify the caption, if you are concerned some who do not read the paper—where it is made clear that it was our mistake and it is corrected in the end—might blame it on the code and not the user.

          • Amneet Bhalla

            Thanks Lorena for modifying the caption!

  • Amneet Bhalla

    Two minor comments:

    (1) Note that requiring the body to be filled by volumetric markers is baked in the continuum and discrete equations as given in the reference [1] and this practice has been prevalent in the distributed Lagrange multiplier (DLM) literature for the past 16 years.

    (2) The 30 degree match of forces using IBAMR looks pretty good. I suspect at higher angle of attack, maybe the BCs are affecting the solution. Can you make the domain bigger, by say 50%, and redo the calculations at the same spatial resolution? Using adaptive grids in IBAMR make such calculations feasible.

    [1] Bhalla et. al. A unified mathematical framework and an adaptive numerical method for fluid–structure interaction with rigid, deforming, and elastic bodies. J Comput Phys, 250:446-476, 2013.

  • Efrem Braun

    Lorena,

    Just read your fantastic post in Science. Some of my research colleagues and I discussed it, and we were wondering your thoughts on including in a paper's SI the input scripts we used for open-source software, e.g. LAMMPS for molecular dynamics simulations. I noticed you choose to include input scripts as figshare DOIs instead of having them in the SI. Any reason for this?

    • What is SI?

      • Efrem Braun

        Supporting/Supplementary Information.

        • Ah! I suppose I could've guessed that, ha ha. I'm used to the term "supplementary materials."

          The conditions may vary from journal to journal, but the first reason is getting a DOI for each research object, so that it is citable on its own. Journal supplementary materials don't usually get their own DOI. Journals may also not count metrics on supplementary materials, whereas on Figshare you get views and downloads on each object. Plus altmetrics.

          Another big reason is that I claim copyright on the research objects (including figures) and release them under CC-BY, so anyone else can reuse them without asking permission. Some journals may claim copyright of supplementary materials, or may not give you the option to license them. See:
          https://storify.com/LorenaABarba/reactions-to-my-tip-on-how-i-use-figshare