SC18 Proceedings

The International Conference for High Performance Computing, Networking, Storage, and Analysis

Identifying Network Data Transfer Bottlenecks in HPC Systems

Student: Karen Tu (Lawrence Berkeley National Laboratory; University of California, Berkeley)
Supervisor: Alexander Sim (Lawrence Berkeley National Laboratory)

Abstract: Improving network data transfer performance is a major factor in improving high performance computing systems. Most studies analyze data transfer and file system I/O performance separately, but understanding the relationship between the two is essential for optimizing scheduling and resource management. Intuitively, a transfer into a busy file system should run slower than a transfer into a file system at regular activity levels.

This study analyzes patterns between file system activity and network throughput for several use cases of file writing and data transfers using a parallel file system. The parameters varied among the use cases were file striping for the file system, and buffer size and parallelism for data transfer. The main bottleneck for network data transfer rate was the number of object storage targets (OSTs) the data was striped across. For a large number of OSTs (16 or greater), writing to the file system was the bottleneck.

ACM-SRC Semi-Finalist: no

Poster: PDF
Poster Summary: PDF
Reproducibility Description Appendix: PDF
