BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:America/Chicago
X-LIC-LOCATION:America/Chicago
BEGIN:DAYLIGHT
TZOFFSETFROM:-0600
TZOFFSETTO:-0500
TZNAME:CDT
DTSTART:19700308T020000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=2SU
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0500
TZOFFSETTO:-0600
TZNAME:CST
DTSTART:19701101T020000
RRULE:FREQ=YEARLY;BYMONTH=11;BYDAY=1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20181221T160908Z
LOCATION:D175
DTSTART;TZID=America/Chicago:20181112T090000
DTEND;TZID=America/Chicago:20181112T173000
UID:submissions.supercomputing.org_SC18_sess176@linklings.com
SUMMARY:LLVM-HPC2018: The Fifth Workshop on the LLVM Compiler Infrastructu
 re in HPC
DESCRIPTION:Workshop\nProgram Transformation, Programming Systems, Worksho
 p Reg Pass\n\nCompiler Optimization for Heterogeneous Locality and Homogen
 eous Parallelism in OpenCL and LLVM\n\nNuzman, Zuckerman, Zaks\n\nHeteroge
 neous platforms may include accelerators such as Digital Signal Processors
  (DSPs) that employ SW-controlled scratch-pad memories instead of, or
  in addition to, standard HW-cached memory. Controlling scratch-pads
  efficientl
 y typically requires tiling and pipelining loops, thereby optimizing f...\
 n\n---------------------\nLLVM-HPC2018: Final Discussion\n\nFinkel\n\n----
 -----------------\nWorkshop Afternoon Break\n\nFinkel\n\n-----------------
 ----\nWorkshop Lunch (on your own)\n\nFinkel\n\n---------------------\nKey
 note: Glow: An Optimizing Compiler for High-Performance Machine Learning\n
 \nMaher\n\nMachine learning is an increasingly large fraction of datacente
 r workloads, making efficient execution of ML models a priority for indust
 ry. At the same time, the slowdown of Moore's Law has created space for a
  plethora of innovative hardware designs to wring maximum performance from
  each transisto...\n\n---------------------\nChallenges of C++ Heterogeneo
 us Programming Using SYCL Implementation Experience: the Four Horsemen of 
 the Apocalypse\n\nLomuller\n\nThe C++ Direction Group has set a future dir
 ection for C++ and includes guidance towards Heterogeneous C++. The intr
 oduction of the executors TS means for the first time in C++ there will be
  a standard platform for writing applications which can execute across a w
 ide range of architectures includi...\n\n---------------------\nA Study of
  OpenMP Device Offloading in LLVM: Correctness and Consistency\n\nYu\n\nTo
  leverage widely available accelerators, OpenMP has introduced device cons
 tructs. Device constructs simplify the development of heterogeneous parall
 el programs and improve the performance. Many compilers including Clang al
 ready have support for device constructs, but there exist few documentatio
 ns...\n\n---------------------\nIntroduction - LLVM-HPC2018: The Fifth Wor
 kshop on the LLVM Compiler Infrastructure in HPC\n\nFinkel\n\nLLVM, winner
  of the 2012 ACM Software System Award, has become an integral part of the
  software-development ecosystem for optimizing compilers, dynamic-language
  execution engines, source-code analysis and transformation tools, debugge
 rs and linkers, and a whole host of programming-language and toolc...\n\n-
 --------------------\nOpenMP GPU Offload in Flang and LLVM\n\nOzen, Atzeni
 , Wolfe, Southwell, Klimowicz\n\nGraphics Processing Units (GPUs) have bee
 n widely adopted to accelerate the execution of High Performance Computing
  (HPC) workloads due to their enormous computational throughput, ability t
 o execute a large number of threads inside SIMD groups in parallel, and th
 eir use of multithreaded hardware to ...\n\n---------------------\nLLVM an
 d the Automatic Vectorization of Loops Invoking Math Routines: -fsimdmath\
 n\nPetrogalli, Walker\n\nThe vectorization of loops invoking math function
 s is an important optimization that is available in most commercial
  compile
 rs. This paper describes a new command line option, -fsimdmath, available 
 in Arm Compiler for HPC, that enables auto-vectorization of math functions
  in C and C++ code, and that ...\n\n---------------------\nAIWC: OpenCL-Ba
 sed Architecture Independent Workload Characterization\n\nJohnston, Miltho
 rpe\n\nMeasuring performance-critical characteristics of application workl
 oads is important both for developers, who must understand and optimize th
 e performance of codes, as well as designers and integrators of HPC system
 s, who must ensure that compute architectures are suitable for the intende
 d workloads...\n\n---------------------\nClacc: Translating OpenACC to Ope
 nMP in Clang\n\nDenny, Lee, Vetter\n\nOpenACC was launched in 2010 as a po
 rtable programming model for heterogeneous accelerators.  Although various
  implementations already exist, no extensible, open-source, production-qua
 lity compiler support is available to the community.  This deficiency pose
 s a serious risk for HPC application devel...\n\n---------------------\nFu
 nction/Kernel Vectorization via Loop Vectorizer\n\nMasten, Tyurin, Mitropo
 ulou, Saito, Garcia\n\nCurrently, there are three vectorizers in the LLVM 
 trunk: Loop Vectorizer, SLP Vectorizer, and Load-Store Vectorizer. There i
 s a need for vectorizing functions/kernels: 1) Function calls are an integ
 ral part of programming real world application code and we cannot always r
 ely on fully inlining them....\n\n---------------------\nPointers Inside L
 ambda Closure Objects in OpenMP Target Offload Regions\n\nTruby, Wright\n\
 nWith the diversification of HPC architectures beyond traditional CPU-base
 d clusters, a number of new frameworks for performance portability across 
 architectures have arisen. One way of implementing such frameworks is to u
 se C++ templates and lambda expressions to design loop-like functions. How
 ever,...\n\n---------------------\nPInT: Pattern Instrumentation Tool for 
 Analyzing and Classifying HPC Applications\n\nSchlebusch, Müller, Wienke, 
 Miller, Müller\n\nThe relationship of application performance to its requi
 red development effort plays an important role in today’s budget-oriented 
 HPC environment. This effort-performance relationship is especially affect
 ed by the structure and characterization of an HPC application. We aim at 
 a classification of HP...\n\n---------------------\nOP2-Clang: A Source-to
 -Source Translator Using Clang/LLVM LibTooling\n\nBalogh, Mudalige, Reguly
 , Antao, Bertolli\n\nDomain Specific Languages or Active Library framework
 s have recently emerged as an important method for gaining performance por
 tability, where an application can be efficiently executed on a wide range
  of HPC architectures without significant manual modifications. Embedded D
 SLs such as OP2, provides...\n\n---------------------\nUser-Directed Loop-
 Transformations in Clang\n\nKruse, Finkel\n\nDirectives for the compiler s
 uch as pragmas can help programmers to separate an algorithm's semantics f
 rom its optimization. This keeps the code understandable and easier to opt
 imize for different platforms. Simple transformations such as loop unrolli
 ng are already implemented in most mainstream com...\n\n------------------
 ---\nWorkshop Morning Break\n\nFinkel\n
END:VEVENT
END:VCALENDAR