Poster: Research Posters
Event Type: Poster
Time: Wednesday, November 14th, 8:30am - 5pm
Description: SC18 Research Posters will be on display on Tuesday, Wednesday, and Thursday from 8:30am to 5pm in the C2/3/4 Ballroom.
Exploring Application Performance on Fat-Tree Networks in the Presence of Congestion
GPU-Accelerated Interpolation for 3D Image Registration
Energy Efficiency of Reconfigurable Caches on FPGAs
RGB (Redfish Green500 Benchmarker): A Green500 Benchmarking Tool Using Redfish
Optimization of Ultrasound Simulations on Multi-GPU Servers
GPGPU Performance Estimation with Core and Memory Frequency Scaling
Making Sense of Scientific Simulation Ensembles
Which Architecture Is Better Suited for Matrix-Free Finite-Element Algorithms: Intel Skylake or Nvidia Volta?
SpotSDC: an Information Visualization System to Analyze Silent Data Corruption
High-Accuracy Scalable Solutions to the Dynamic Facility Layout Problem
Large Scale MPI-Parallelization of LBM and DEM Systems: Accelerating Research by Using HPC
SciGaP: Apache Airavata Hosted Science Gateways
Reproducibility as Side Effect
Using Darshan and CODES to Evaluate Application I/O Performance
Multi-Client DeepIO for Large-Scale Deep Learning on HPC Systems
HIVE: A Cross-Platform, Modular Visualization Ecosystem for Heterogeneous Computational Environments
Improving the I/O Performance and Memory Usage of the Xolotl Cluster Dynamics Simulator
Performance Evaluation of the Shifted Cholesky QR Algorithm for Ill-Conditioned Matrices
HPC-as-a-Service for Life Sciences
Hermes: a Multi-Tiered Distributed I/O Buffering System for HDF5
Workflow for Parallel Processing of Sequential Mesh Databases
The NAStJA Framework: Non-Collective Scalable Global Communications
Hardware Acceleration of CNNs with Coherent FPGAs
Distributed Fast Boundary Element Methods
Development of Numerical Coupled Analysis Method by Air Flow Analysis and Snow Accretion Analysis
Portable Parallel Performance via Multi-Dimensional Homomorphisms
Performance Evaluation of the NVIDIA Tesla V100: Block Level Pipelining vs. Kernel Level Pipelining
Enabling Data Analytics Workflows Using Node-Local Storage
OpeNNdd: Open Neural Networks for Drug Discovery: Creating Free and Easy Methods for Designing Medicine
FeatherCNN: Fast Inference Computation with TensorGEMM on ARM Architectures
Boosting the Scalability of Car-Parrinello Molecular Dynamics Simulations for Multi- and Manycore Architectures
Characterizing Declustered Software RAID for Enhancing Storage Reliability and Performance
Parallel Implementation of Machine Learning-Based Many-Body Potentials on CPU and GPU
Implementing Efficient Data Compression and Encryption in a Persistent Key-Value Store for HPC
A Parallel-Efficient GPU Package for Multiphase Flow in Realistic Nano-Pore Networks
Processing-in-Storage Architecture for Machine Learning and Bioinformatics
Kernel-Based and Total Performance Analysis of CGYRO on 4 Leadership Systems
Redesigning The Absorbing Boundary Algorithm for Asynchronous High Performance Acoustic Wave Propagation
Capsule Networks for Protein Structure Classification
Cross-Layer Group Regularization for Deep Neural Network Pruning
Machine Learning for Adaptive Discretization in Massive Multiscale Biomedical Modeling
Multi-GPU Accelerated Non-Hydrostatic Numerical Ocean Model with GPUDirect RDMA Transfers
A Locality and Memory Congestion-Aware Thread Mapping Method for Modern NUMA Systems
Tuning CFD Applications for Intel Xeon Phi with TAU Commander and ParaTools ThreadSpotter
Massively Parallel Stress Chain Characterization for Billion Particle DEM Simulation of Accretionary Prism Formation
Toward Smoothing Data Movement Between RAM and Storage
Interactive HPC Deep Learning with Jupyter Notebooks
Fast and Accurate Training of an AI Radiologist
Full State Quantum Circuit Simulation by Using Lossy Data Compression
Enabling Reproducible Microbiome Science through Decentralized Provenance Tracking in QIIME 2
Optimizing Next Generation Hydrodynamics Code for Exascale Systems
MPI/OpenMP parallelization of the Fragment Molecular Orbitals Method in GAMESS
Automatic Generation of Mixed-Precision Programs
UPC++ and GASNet-EX: PGAS Support for Exascale Applications and Runtimes
An Efficient SIMD Implementation of Pseudo-Verlet Lists for Neighbor Interactions in Particle-Based Codes
Understanding Potential Performance Issues Using Resource-Based alongside Time Models
MGRIT Preconditioned Krylov Subspace Method
Enabling Neutrino and Antineutrino Appearance Observation Measurements with HPC Facilities
Large Scale Computation of Quantiles Using MELISSA
FlowOS-RM: Disaggregated Resource Management System
Programming the EMU Architecture: Algorithm Design Considerations for Migratory-Threads-Based Systems
OpenACC to FPGA: A Directive-Based High-Level Programming Framework for High-Performance Reconfigurable Computing
Tensor-Optimized Hardware Accelerates Fused Discontinuous Galerkin Simulations
AI Matrix – Synthetic Benchmarks for DNN
Applying the Execution-Cache-Memory Model: Current State of Practice
WarpX: Toward Exascale Modeling of Plasma Particle Accelerators
Job Simulation for Large-Scale PBS-Based Clusters with the Maui Scheduler
Script of Scripts Polyglot Notebook and Workflow System
Enabling High-Level Graph Processing via Dynamic Tasking
An Alternative Approach to Teaching Bigdata and Cloud Computing Topics at CS Undergraduate Level
Binarized ImageNet Inference in 29us
Refactoring and Optimizing Multiphysics Combustion Models for Data Parallelism
Tensorfolding: Improving Convolutional Neural Network Performance with Fused Microkernels
MATEDOR: MAtrix, TEnsor, and Deep-Learning Optimized Routines
Accelerating Wave-Propagation Algorithms with Adaptive Mesh Refinement Using the Graphics Processing Unit (GPU)
Distributed Adaptive Radix Tree for Efficient Metadata Search on HPC Systems
Improving Error-Bounded Lossy Compression for Cosmological N-Body Simulation
VeloC: Very Low Overhead Checkpointing System
Estimating Molecular Dynamics Chemical Shift with GPUs
Using Thrill to Process Scientific Data on HPC
GPU Acceleration at Scale with OpenPower Platforms in Code_Saturne
Large-Message Size Allreduce at Wire Speed for Distributed Deep Learning
Sol: Transparent Neural Network Acceleration Platform
Detection of Silent Data Corruptions in Smooth Particle Hydrodynamics Simulations
DeepSim-HiPAC: Deep Learning High Performance Approximate Calculation for Interactive Design and Prototyping
Top-Down Performance Analysis of Workflow Applications
Convolutional Neural Networks for Coronary Plaque Classification in Intravascular Optical Coherence Tomography (IVOCT) Images
Compiling SIMT Programs on Multi- and Many-Core Processors with Wide Vector Units: A Case Study with CUDA
A Massively Parallel Evolutionary Markov Chain Monte Carlo Algorithm for Sampling Complicated Multimodal State Spaces
MLModelScope: Evaluate and Measure Machine Learning Models within AI Pipelines
A Compiler Framework for Fixed-Topology Non-Deterministic Finite Automata on SIMD Platforms
A Low-Communication Method to Solve Poisson's Equation on Locally-Structured Grids
Floating-Point Autotuner for CPU-Based Mixed-Precision Applications
