Presentation
Mitigating Performance and Progress Variability in Iterative Asynchronous Algorithms
Author
Event Type
ACM Student Research Competition
Poster
TP
EX
Time
Tuesday, November 13th, 8:30am - 5pm
Location
C2/3/4 Ballroom
Description
Large HPC machines are susceptible to irregular performance. Factors such as chip manufacturing differences, heat management, and network congestion combine to produce varying execution times for the same code and input sets. Asynchronous algorithms offer a partial solution. In these algorithms, fast workers are not forced to synchronize with slow ones. Instead, they continue computing updates and moving toward the solution using the latest data available to them, which may be stale (i.e., several iterations out of date compared to the most recent data). While this allows for high computational efficiency, the convergence rate of asynchronous algorithms tends to be lower than that of their synchronous counterparts.
To address this problem, we are using the unique properties of asynchronous algorithms to develop load balancing strategies for iterative asynchronous algorithms in both shared and distributed memory. Our poster shows how our solution attenuates noise, resulting in a significant reduction in progress imbalance and time-to-solution variability.
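The idea of workers computing with stale data can be illustrated with a small, deterministic simulation. The sketch below is not the authors' method or their load balancing strategy; it is a minimal Jacobi iteration in which each component may read an iterate that is a fixed number of sweeps out of date. The function names, the `staleness` parameter, and the test matrix are all hypothetical and chosen only for illustration.

```python
def jacobi_with_staleness(A, b, staleness, sweeps):
    """Simulate Jacobi sweeps where component i reads an iterate that is
    staleness[i] sweeps old (0 means fully up to date).

    Illustrative model only: a real asynchronous solver has
    nondeterministic, varying delays rather than fixed ones.
    """
    n = len(b)
    history = [[0.0] * n]  # history[k] is the iterate after k sweeps
    for _ in range(sweeps):
        new = []
        for i in range(n):
            # Pick the snapshot this "worker" sees; clamp at the initial guess.
            k = max(0, len(history) - 1 - staleness[i])
            x_seen = history[k]
            off_diag = sum(A[i][j] * x_seen[j] for j in range(n) if j != i)
            new.append((b[i] - off_diag) / A[i][i])
        history.append(new)
    return history[-1]


def residual_norm(A, b, x):
    """Infinity norm of the residual b - A x."""
    n = len(b)
    return max(abs(b[i] - sum(A[i][j] * x[j] for j in range(n)))
               for i in range(n))


if __name__ == "__main__":
    # Strictly diagonally dominant system with exact solution [1, 1, 1].
    A = [[4.0, 1.0, 0.0], [1.0, 4.0, 1.0], [0.0, 1.0, 4.0]]
    b = [5.0, 6.0, 5.0]
    for delays in ([0, 0, 0], [0, 3, 0]):
        x = jacobi_with_staleness(A, b, delays, sweeps=10)
        print(delays, residual_norm(A, b, x))
```

Comparing the printed residuals for the two delay settings gives a rough feel for how reading out-of-date values can affect progress per sweep, even though no worker ever waits.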
Archive