Correctness of Floating Point Programs - Exception Handling and Reproducibility
Authors: James Demmel (University of California, Berkeley)
Abstract: We consider two related aspects of analyzing and guaranteeing the correctness of floating-point programs: exception handling and reproducibility. Exception handling refers to the reliable and consistent propagation of errors caused by overflow, invalid operations (such as sqrt(-1)), convergence failures, and similar events. Reproducibility refers to obtaining bitwise identical results from multiple runs of the same program, even when, for example, parallelism causes floating-point sums to be evaluated in different orders with different roundoff errors. We describe the efforts of two standards committees, those for the Basic Linear Algebra Subprograms (BLAS) Standard and the IEEE 754 Floating Point Standard, to address these issues, and how these efforts should make it easier to achieve both goals in higher-level applications such as linear algebra libraries.
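Both issues named in the abstract can be seen in a few lines of Python. The sketch below is an illustration only and is not taken from the talk: the helper `sum_in_order` and the sample values are hypothetical, and `math.fsum` is used as one readily available order-independent sum, not as the approach pursued by the BLAS or IEEE 754 committees.

```python
import math

def sum_in_order(values):
    """Naive left-to-right floating-point sum (hypothetical helper)."""
    total = 0.0
    for v in values:
        total += v
    return total

# Reproducibility: the same three numbers summed in two different orders.
# In IEEE 754 double precision, 1e16 + 1.0 rounds back to 1e16.
left_to_right = sum_in_order([1e16, 1.0, -1e16])   # (1e16 + 1.0) - 1e16
reordered = sum_in_order([1e16, -1e16, 1.0])       # (1e16 - 1e16) + 1.0

# math.fsum returns the correctly rounded sum, so it does not depend on order.
exact_a = math.fsum([1e16, 1.0, -1e16])
exact_b = math.fsum([1e16, -1e16, 1.0])

# Exception handling: a NaN (the result of an invalid operation) may or may
# not propagate through Python's built-in max, depending on its position,
# because every ordered comparison with NaN is False.
nan = float("nan")
nan_first = max([nan, 1.0])   # NaN is kept
nan_last = max([1.0, nan])    # NaN is silently dropped

print(left_to_right, reordered)
print(exact_a, exact_b)
print(nan_first, nan_last)
```

The naive sums differ between the two orderings, while the correctly rounded sums agree. The `max` calls show how an error indicator can be lost when propagation depends on operand order.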
Workshop: 2nd International Workshop on Software Correctness for HPC Applications (Correctness 2018)