BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:America/Chicago
X-LIC-LOCATION:America/Chicago
BEGIN:DAYLIGHT
TZOFFSETFROM:-0600
TZOFFSETTO:-0500
TZNAME:CDT
DTSTART:19700308T020000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=2SU
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0500
TZOFFSETTO:-0600
TZNAME:CST
DTSTART:19701101T020000
RRULE:FREQ=YEARLY;BYMONTH=11;BYDAY=1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20181221T160728Z
LOCATION:D167/174
DTSTART;TZID=America/Chicago:20181112T143000
DTEND;TZID=America/Chicago:20181112T150000
UID:submissions.supercomputing.org_SC18_sess151_ws_mlhpce129@linklings.com
SUMMARY:Aluminum: An Asynchronous, GPU-Aware Communication Library Optimiz
 ed for Large-Scale Training of Deep Neural Networks on HPC Systems
DESCRIPTION:Workshop\nDeep Learning, Machine Learning, Workshop Reg Pass\n
 \nAluminum: An Asynchronous, GPU-Aware Communication Library Optimized for
  Large-Scale Training of Deep Neural Networks on HPC Systems\n\nDryden, Ma
 ruyama, Moon, Benson, Yoo...\n\nWe identify communication as a major bottl
 eneck for training deep neural networks on large-scale GPU clusters, takin
 g over 10x as long as computation. To reduce this overhead, we discuss tec
 hniques to overlap communication and computation as much as possible. This
  leads to much of the communication being latency-bound instead of bandwid
 th-bound, and we find that using a combination of latency- and bandwidth-o
 ptimized allreduce algorithms significantly reduces communication costs. W
 e also discuss a semantic mismatch between MPI and CUDA that increases ove
 rheads and limits asynchrony, and propose a solution that enables communic
 ation to be aware of CUDA streams. We implement these optimizations in the
  open-source Aluminum communication library, enabling optimized, asynchron
 ous, GPU-aware communication. Aluminum demonstrates improved performance i
 n benchmarks and end-to-end training of deep networks, for both strong and
  weak scaling.
URL:https://sc18.supercomputing.org/presentation/?id=ws_mlhpce129&sess=ses
 s151
END:VEVENT
END:VCALENDAR

