Toward Developing a Repository of Logical Errors Observed in Parallel Code for Teaching Code Correctness
Authors: Trung Nguyen (University of Massachusetts), Ritu Arora (Texas Advanced Computing Center, University of Texas)
Abstract: Debugging parallel programs can be challenging, especially for beginners. Parallel debuggers such as DDT and TotalView can be extremely useful for tracking down the program statements connected to a bug, but the onus is often on programmers to reason about the logic of those statements in order to fix it. These debuggers may neither precisely indicate the logical errors in a parallel program nor provide information on how to fix them. There is therefore a need for tools and educational content that teach the pitfalls of parallel programming and how to write correct code. Such content can guide beginners in avoiding commonly observed logical errors and in verifying the correctness of their parallel programs. In this paper, we 1) enumerate some of the logical errors that we have seen in parallel programs written by beginners working with us, and 2) discuss ways to fix those errors. Documentation of these logical errors can contribute to enhancing the productivity of beginners and can potentially help them in their debugging efforts. We have added code samples containing the logical errors, along with their solutions, to a GitHub repository so that others in the community can reproduce the errors on their own systems and learn from them. The content presented in this paper may also be useful to those developing high-level tools for detecting and removing logical errors in parallel programs.
Back to Workshop on Education for High Performance Computing (EduHPC) Archive Listing