Presentation
FeatherCNN: Fast Inference Computation with TensorGEMM on ARM Architectures
Session
Research Posters
Authors
Event Type
Poster
TP
EX
Time
Tuesday, November 13th, 8:30am - 5pm
Location
C2/3/4 Ballroom
Description
This poster presents CNNForward, a fast inference computation library for ARM architectures. CNNForward aims to improve the efficiency of convolutional neural network inference on ARM-based multi-core and many-core architectures through both mathematical formula reconstruction/simplification and in-depth NEON instruction optimization. Experimental results show that, for VGG-16 forward computation on a server with 64 ARM A72 cores, CNNForward scales up to 32 cores with a parallel efficiency of 33% and achieves 35.4x, 8.7x, and 10.6x speedups over Caffe+OpenBLAS, Caffe2+Eigen, and Caffe2+NNPACK, respectively.
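To put the scaling figure in context, the short sketch below converts a parallel efficiency into the speedup it would imply over a single core. It assumes the common definition, efficiency = (single-core speedup) / (number of cores); the abstract does not state which baseline or definition was used, so the function name and the result are illustrative only. This figure is separate from the reported speedups over other frameworks.

```python
def implied_speedup(efficiency: float, cores: int) -> float:
    """Speedup over one core implied by a parallel efficiency.

    Assumes efficiency = speedup / cores (a common definition,
    not confirmed by the abstract).
    """
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return efficiency * cores


if __name__ == "__main__":
    # Figures quoted in the abstract: 33% efficiency at 32 cores.
    speedup = implied_speedup(0.33, 32)
    print(f"Implied speedup over one core: {speedup:.2f}x")
```

Under that assumption, 33% efficiency at 32 cores corresponds to roughly a 10.6x speedup over single-core execution.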