BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:America/Chicago
X-LIC-LOCATION:America/Chicago
BEGIN:DAYLIGHT
TZOFFSETFROM:-0600
TZOFFSETTO:-0500
TZNAME:CDT
DTSTART:19700308T020000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=2SU
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0500
TZOFFSETTO:-0600
TZNAME:CST
DTSTART:19701101T020000
RRULE:FREQ=YEARLY;BYMONTH=11;BYDAY=1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20181221T160727Z
LOCATION:D163
DTSTART;TZID=America/Chicago:20181111T163000
DTEND;TZID=America/Chicago:20181111T170000
UID:submissions.supercomputing.org_SC18_sess159_ws_indis103@linklings.com
SUMMARY:Analysis of CPU Pinning and Storage Configuration in 100 Gbps Netw
 ork Data Transfer
DESCRIPTION:Workshop\nArchitectures\, Networks\, Security\, Workshop Reg
  Pass\n\nAnalysis of CPU Pinning and Storage Configuration in 100 Gbps
  Network Data Transfer\n\nYu\, Chen\, Mambretti\, Yeh\n\nA common
  bottleneck for high-speed network data transfers is a lack of CPU
  resources. A number of techniques and solutions have been proposed to
  reduce CPU load for data transfer. One can optimize the core affinity
  settings of a Non-Uniform Memory Access (NUMA) system and use NVMe
  over Fabrics to avoid CPU bottlenecks in high-speed network data
  transfers. Our assumption is that binding processes to the local
  processor improves the overall performance of high-speed network data
  transfers compared to binding the processes to actual cores or leaving
  them unbound. Furthermore\, using NVMe over Fabrics reduces the CPU
  utilization more with a lower number of processors. To evaluate these
  assumptions\, we performed a series of experiments with different core
  affinity and storage settings. We found evidence that binding
  processes to the local processor instead of the cores improves the
  file transfer performance for most of the use cases\, and that NVMe
  over Fabrics is more efficient in transferring files compared to
  traditional file transfers in Local Area Networks (LANs). We reached
  the maximum SSD performance threshold with 32 transfer processes using
  traditional file transfers\, whereas NVMe over Fabrics reached it with
  8 processes and reduced CPU utilization.
URL:https://sc18.supercomputing.org/presentation/?id=ws_indis103&sess=sess
 159
END:VEVENT
END:VCALENDAR

