<span class="var-sub_title">FeatherCNN: Fast Inference Computation with TensorGEMM on ARM Architectures</span> SC18 Proceedings

The International Conference for High Performance Computing, Networking, Storage, and Analysis

Authors: Haidong Lan (Shandong University), Jintao Meng (Tencent Holdings Ltd), Christian Hundt (Johannes Gutenberg University Mainz), Bertil Schmidt (Johannes Gutenberg University Mainz), Minwen Deng (Tencent Holdings Ltd), Weiguo Liu (Shandong University), Yanjie Wei (Shenzhen Institutes of Advanced Technology), Shengzhong Feng (Shenzhen Institutes of Advanced Technology)

Abstract: This poster presents FeatherCNN, a fast inference computation library for ARM architectures. FeatherCNN improves the efficiency of inference computation for convolutional neural networks on ARM-based multi-core and many-core architectures through both mathematical reformulation/simplification and in-depth NEON instruction optimization. Experimental results reveal that, for VGG-16 forward computation on a server with 64 ARM Cortex-A72 cores, FeatherCNN scales up to 32 cores with a parallel efficiency of 33%, and achieves 35.4x, 8.7x, and 10.6x speedups over Caffe+OpenBLAS, Caffe2+Eigen, and Caffe2+NNPACK, respectively.
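The key idea behind GEMM-based inference libraries of this kind is to lower convolution to one large matrix multiplication (the classic im2col transform), so that a highly tuned, NEON-vectorized GEMM kernel does all the heavy lifting. The following NumPy sketch illustrates that lowering and checks it against a direct convolution; all names and shapes are illustrative, not taken from the poster itself.

```python
import numpy as np

def conv2d_direct(x, w):
    """Reference convolution. x: (C_in, H, W); w: (C_out, C_in, K, K); stride 1, no padding."""
    c_out, c_in, k, _ = w.shape
    h_out, w_out = x.shape[1] - k + 1, x.shape[2] - k + 1
    y = np.zeros((c_out, h_out, w_out))
    for co in range(c_out):
        for i in range(h_out):
            for j in range(w_out):
                y[co, i, j] = np.sum(x[:, i:i + k, j:j + k] * w[co])
    return y

def conv2d_gemm(x, w):
    """im2col lowering: unfold input patches into columns, then a single
    GEMM computes every output channel and spatial position at once."""
    c_out, c_in, k, _ = w.shape
    h_out, w_out = x.shape[1] - k + 1, x.shape[2] - k + 1
    cols = np.empty((c_in * k * k, h_out * w_out))
    for i in range(h_out):
        for j in range(w_out):
            # Flattened patch order (C_in, K, K) matches w[co].ravel() below.
            cols[:, i * w_out + j] = x[:, i:i + k, j:j + k].ravel()
    # (C_out, C_in*K*K) @ (C_in*K*K, H_out*W_out) -> (C_out, H_out*W_out)
    return (w.reshape(c_out, -1) @ cols).reshape(c_out, h_out, w_out)

rng = np.random.default_rng(0)
x = rng.standard_normal((3, 8, 8))
w = rng.standard_normal((4, 3, 3, 3))
assert np.allclose(conv2d_direct(x, w), conv2d_gemm(x, w))
```

On ARM, the single `@` (GEMM) call is where register-blocked NEON micro-kernels pay off, which is why reformulating the convolution this way, as the abstract describes, dominates the achievable speedup.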

Best Poster Finalist (BP): no

