BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:America/Chicago
X-LIC-LOCATION:America/Chicago
BEGIN:DAYLIGHT
TZOFFSETFROM:-0600
TZOFFSETTO:-0500
TZNAME:CDT
DTSTART:19700308T020000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=2SU
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0500
TZOFFSETTO:-0600
TZNAME:CST
DTSTART:19701101T020000
RRULE:FREQ=YEARLY;BYMONTH=11;BYDAY=1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20181221T160904Z
LOCATION:C2/3/4 Ballroom
DTSTART;TZID=America/Chicago:20181113T083000
DTEND;TZID=America/Chicago:20181113T170000
UID:submissions.supercomputing.org_SC18_sess322_post154@linklings.com
SUMMARY:Binarized ImageNet Inference in 29us
DESCRIPTION:Poster\nTech Program Reg Pass, Exhibits Reg Pass\n\nBinarized 
 ImageNet Inference in 29us\n\nGeng, Li, Wang, Song, Herbordt\n\nWe propose
  a single-FPGA-based accelerator for ultra-low-latency inference of ImageN
 et in this work. The design can complete the inference of Binarized AlexNe
 t within 29us with accuracy comparable to other BNN implementations. We a
 chieve this performance with the following contributions: 1. We completely
  remove floating-point from NL through layer fusion. 2. By using model par
 allelism rather than data parallelism, we can simultaneously configure all
  layers and the control flow graphs. Also, the design is flexible enough t
 o achieve nearly perfect load balancing, leading to extremely high resour
 ce utilization. 3. All convolution layers are fused and processed in paral
 lel through inter-layer pipelining. Therefore, when the pipeline is ful
 l, latency is just the delay of a single convolution layer plus the FC lay
 ers. Note that the dependency pattern of the FC layer prevents it from bei
 ng integrated into the current pipeline.
URL:https://sc18.supercomputing.org/presentation/?id=post154&sess=sess322
END:VEVENT
END:VCALENDAR

