SC18 Proceedings

The International Conference for High Performance Computing, Networking, Storage, and Analysis

Machine Learning in HPC Environments

Training Speech Recognition Models on HPC Infrastructure

Authors: Deepthi Karkada (Intel Corporation)

Abstract: Automatic speech recognition is used extensively in speech interfaces and spoken dialogue systems. To accelerate the development of new speech recognition models and techniques, developers at Mozilla have open-sourced a deep-learning-based speech-to-text engine, project DeepSpeech, based on Baidu’s DeepSpeech research. To reduce model training time on CPUs for distributed DeepSpeech training, we have developed optimizations on the Mozilla DeepSpeech code that scale model training to a large number of Intel® CPU systems, including integrating Horovod into DeepSpeech. We have also implemented a novel dataset partitioning scheme to mitigate compute imbalance across multiple nodes of an HPC cluster. We demonstrate that we can train the DeepSpeech model on the LibriSpeech clean dataset to its state-of-the-art accuracy in 6.45 hours on a 16-node Intel® Xeon® based HPC cluster.
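The abstract's "dataset partitioning scheme to mitigate compute imbalance" is not detailed here, but the underlying problem is concrete: speech utterances vary widely in duration, so a naive round-robin split can give some nodes far more audio (and thus compute) than others. A minimal sketch of one plausible approach, assuming per-utterance durations are known, is a longest-first greedy assignment to the least-loaded worker (the LPT scheduling heuristic); the function name and details are illustrative, not the authors' actual implementation.

```python
# Hypothetical sketch: balance total audio duration across workers so that
# each node in a data-parallel training run does roughly equal compute.
# This is NOT the DeepSpeech/Horovod code, just an illustration of the idea.

def partition_by_duration(durations, num_workers):
    """Return one list of utterance indices per worker, balancing total duration.

    durations   -- list of per-utterance lengths in seconds (assumed known)
    num_workers -- number of data-parallel workers (e.g. Horovod ranks)
    """
    loads = [0.0] * num_workers              # accumulated seconds per worker
    shards = [[] for _ in range(num_workers)]
    # Assign utterances longest-first to the currently least-loaded worker.
    for idx in sorted(range(len(durations)), key=lambda i: -durations[i]):
        w = min(range(num_workers), key=lambda k: loads[k])
        shards[w].append(idx)
        loads[w] += durations[idx]
    return shards

if __name__ == "__main__":
    durations = [12.0, 3.5, 7.2, 9.9, 1.1, 5.0, 6.3, 2.4]
    shards = partition_by_duration(durations, 2)
    per_worker = [sum(durations[i] for i in s) for s in shards]
    print(shards, per_worker)   # two shards with nearly equal total seconds
```

With the sample durations above, the two workers end up within about one second of audio of each other, whereas a round-robin split of the same list would be noticeably more skewed.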
