SC18 Proceedings

The International Conference for High Performance Computing, Networking, Storage, and Analysis

Hardware Acceleration of CNNs with Coherent FPGAs

Authors: Md Syadus Sefat (Texas State University), Semih Aslan (Texas State University), Apan Qasem (Texas State University)

Abstract: This paper describes a new flexible approach to implementing energy-efficient CNNs on FPGAs. Our design leverages the Coherent Accelerator Processor Interface (CAPI), which provides a cache-coherent view of system memory to attached accelerators. Convolution layers are formulated as matrix multiplication kernels and then accelerated on a CAPI-supported Kintex FPGA board. Our implementation bypasses the need for device driver code and significantly reduces the communication and I/O transfer overhead. To improve the performance of the entire application, not just the convolution layers, we propose a collaborative model of execution in which the control of the data flow within the accelerator is kept independent, freeing up CPU cores to work on other parts of the application. For further performance enhancements, we propose a technique to exploit data locality in the cache situated in the CAPI Power Service Layer (PSL). Finally, we develop a resource-conscious implementation for more efficient utilization of resources and improved scalability.
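
Illustrative note: the abstract states that convolution layers are formulated as matrix multiplication kernels, but does not describe the lowering itself. The sketch below shows one common way to do this (often called im2col), assuming a single input channel, stride 1, and no padding. It is not the authors' FPGA implementation; the function names `im2col`, `matmul`, and `conv2d_as_matmul` are hypothetical and chosen only for this example. As is usual in CNNs, the kernel is applied without flipping (cross-correlation).

```python
# Sketch only: lowering a 2D convolution to one matrix multiplication.
# Assumptions (not from the poster): single channel, stride 1, no padding.

def im2col(image, kh, kw):
    """Unroll every kh x kw patch of `image` into one row of a matrix."""
    h, w = len(image), len(image[0])
    rows = []
    for i in range(h - kh + 1):
        for j in range(w - kw + 1):
            rows.append([image[i + di][j + dj]
                         for di in range(kh) for dj in range(kw)])
    return rows


def matmul(a, b):
    """Plain matrix product of a (m x k) and b (k x n)."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def conv2d_as_matmul(image, kernels):
    """Apply several equally sized kernels with a single matrix product.

    Returns one output feature map (a list of rows) per kernel.
    """
    kh, kw = len(kernels[0]), len(kernels[0][0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1

    # Patch matrix: (out_h * out_w) rows, (kh * kw) columns.
    patches = im2col(image, kh, kw)

    # Weight matrix: each flattened kernel becomes one column.
    flat = [[k[di][dj] for di in range(kh) for dj in range(kw)]
            for k in kernels]
    weights = [list(col) for col in zip(*flat)]

    # One product computes every output pixel of every feature map.
    product = matmul(patches, weights)

    # Reshape each column of the product back into an out_h x out_w map.
    maps = []
    for n in range(len(kernels)):
        column = [row[n] for row in product]
        maps.append([column[r * out_w:(r + 1) * out_w]
                     for r in range(out_h)])
    return maps


if __name__ == "__main__":
    image = [[1, 2, 3],
             [4, 5, 6],
             [7, 8, 9]]
    kernels = [[[1, 0], [0, 1]],
               [[1, 1], [1, 1]]]
    for feature_map in conv2d_as_matmul(image, kernels):
        print(feature_map)
```

Once the layer is in this form, the heavy work is a single dense matrix product, which is the kind of regular kernel that maps naturally onto an accelerator.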

Best Poster Finalist (BP): no

Poster: PDF
Poster summary: PDF
