Authors: David Bernholdt (Oak Ridge National Laboratory), Jeffrey Carver (University of Alabama), William Gropp (University of Illinois, National Center for Supercomputing Applications), Carina Haupt (German Aerospace Center), Michael Heroux (Sandia National Laboratories, St. John’s University), Daniel Katz (University of Illinois), Scott Lathrop (National Center for Supercomputing Applications, Shodor Education Foundation), Minhua Wen (Shanghai Jiao Tong University)
Abstract: Software engineering (SWE) for computational science and engineering (CSE) is challenging, with ever-more sophisticated and higher fidelity simulation of ever-larger and more complex problems involving larger data volumes, more domains, and more researchers. Targeting both commodity and custom high-end computers multiplies these challenges. We invest a great deal in creating these codes, but rarely talk about that experience; we just focus on the results.
Our goal is to raise awareness of SWE for CSE on supercomputers as a major challenge, and to develop an international “community of practice” to continue these important discussions outside of workshops and other “traditional” venues.
Long Description: The engineering of software for computational science and engineering (CSE) gets little attention in our community. We celebrate the big machines, the scientific discoveries they enable when driven by sophisticated software, and the cleverness and creativity of the software itself. More rarely do we talk about how user requirements are assessed, how that software was designed, the successes and failures of the development processes used, testing and verification strategies that maximize confidence in the code while minimizing the use of expensive resources, how end user feedback is collected and used to drive improvements, and many other aspects of the entire lifecycle of a CSE application, including portability, sustainability, overall productivity, and usability by and for the community.
At the same time, the pace of change and level of diversity in architectures have increased dramatically, and the drive to exascale exacerbates the situation. CSE software developers, already facing scientific demands for “bigger, better, and faster” modeling and simulation capabilities, entailing larger, more multidisciplinary, and geographically dispersed development teams, must now also contend with significant architectural changes. Further, increases in data volume and complexity, and the increasing integration of “big data” (analytics) infrastructures (both hardware and software), raise additional SWE challenges.
We believe this situation has the makings of a serious Software Crisis in CSE on HPC, which we ignore at our own expense in scientific productivity and opportunity. Fortunately, a growing number of organizations are paying more attention to addressing this challenge. But their work is not yet widely shared, and the sharing and uptake of good practices is fragmented. We believe that the next step in the process is a concerted effort to increase awareness and sharing of work on SWE for HPC CSE across the community, with the aim of fostering good practices that will result in software fit to power CSE through the next era of computing.
Our goal is to bring together people concerned about this topic to share existing activities, discuss how we can expand and improve on them, and share the results, complementing “traditional” venues for the academic (often versus practical) discussion of SWE for CSE, such as conferences and workshops. An interactive Google Doc will be used to collaboratively take notes of the discussion. These notes will be made publicly available and will form the basis of a community report and ongoing community forum.
The SC Conference Series provides an ideal venue for these discussions. A large fraction of the attendees are CSE practitioners or researchers who support such activities. Past editions of this BoF (2015, 2016, 2017) have been very well attended and the discussions highly engaged. SC18 will also host a number of complementary activities, including workshops on reproducibility and software correctness and a tutorial on Better Scientific Software; we are also aware of a BoF proposal on Research Software Engineering. There is also the overall SC18 Reproducibility Initiative. We believe these activities are highly complementary and will be synergistic in generating interest and participation from the SC community.