SC18 Proceedings

The International Conference for High Performance Computing, Networking, Storage, and Analysis

Precomputing Outputs of Hidden Layers to Speed Up Deep Neural Network Training

Student: Sohil Lal Shrestha (University of Texas, Arlington)
Supervisor: Christoph Csallner (University of Texas, Arlington)

Abstract: Deep learning has recently emerged as a powerful technique for many tasks, including image classification. A key bottleneck of deep learning is that the training phase is time-consuming, since state-of-the-art deep neural networks have millions of parameters and hundreds of hidden layers. The early layers of these deep neural networks have the fewest parameters but account for most of the computation.

In this work, we reduce training time by progressively freezing hidden layers, precomputing their outputs, and excluding them from both the forward and backward passes in subsequent training iterations. We compare this technique to the most closely related approach for speeding up the training of neural networks.
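The core idea can be illustrated with a minimal NumPy sketch (an assumption-laden toy, not the authors' implementation): in a two-layer network, once layer 1 is frozen, its output on the fixed training set is computed a single time and cached, so every subsequent iteration skips layer 1's forward and backward passes entirely.

```python
import numpy as np

# Toy setup: a 2-layer regression network on a fixed training set.
# All names and sizes here are illustrative assumptions.
rng = np.random.default_rng(0)
X = rng.standard_normal((64, 8))         # training inputs (fixed)
y = rng.standard_normal((64, 1))         # regression targets
W1 = rng.standard_normal((8, 16)) * 0.1  # layer 1 (to be frozen)
W2 = rng.standard_normal((16, 1)) * 0.1  # layer 2 (still training)

relu = lambda z: np.maximum(z, 0.0)

# Freeze layer 1: precompute its output once for the whole training set.
H = relu(X @ W1)                         # cached hidden activations

lr = 0.01
loss_before = float(np.mean((H @ W2 - y) ** 2))
for _ in range(100):                     # later iterations touch only W2
    pred = H @ W2                        # forward pass starts from the cache
    grad_W2 = H.T @ (2.0 * (pred - y)) / len(X)
    W2 -= lr * grad_W2                   # no gradient ever flows into W1
loss_after = float(np.mean((H @ W2 - y) ** 2))

# Sanity check: the cached forward pass matches the full forward pass.
assert np.allclose(relu(X @ W1) @ W2, H @ W2)
```

In a real deep network the same pattern applies layer by layer: as each early layer is frozen, the cached activations become the new "inputs" for the layers still being trained, removing the frozen layers' computation from every remaining epoch.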

Through experiments on two widely used datasets for image classification, we empirically demonstrate that our approach can yield savings of up to 25% wall-clock time during training with no loss in accuracy.

ACM-SRC Semi-Finalist: no

Poster: PDF
Poster Summary: PDF
Reproducibility Description Appendix: PDF
