Speed-Up of Machine Learning for Sound Localization via High-Performance Computing

Eric Michael Sumner
Marcel Aach
Andreas Lintermann
Runar Unnthorsson
Morris Riedel

26th International Conference on Information Technology (IT), Žabljak, February 2022

doi: 10.1109/it54280.2022.9743519
Sound localization is the ability of humans to determine the source direction of sounds that they hear. Emulating this capability in virtual environments can have various societally relevant applications enabling more realistic virtual acoustics. We use a variety of artificial intelligence methods, such as machine learning via an Artificial Neural Network (ANN) model, to emulate human sound localization abilities. This paper addresses the particular challenge that the training and optimization of these models are computationally intensive when working with audio signal datasets. It describes the successful porting of our novel ANN model code for sound localization from limiting serial CPU-based systems to powerful, cutting-edge High-Performance Computing (HPC) resources to obtain significant speed-ups of the training and optimization process. Selected details of the code refactoring and HPC porting are described, such as adapting hyperparameter optimization algorithms to efficiently use the available HPC resources and replacing third-party libraries responsible for audio signal analysis and linear algebra. This study demonstrates that using innovative HPC systems at the Jülich Supercomputing Centre, equipped with high-tech GPU resources and based on the Modular Supercomputing Architecture, enables significant speed-ups and reduces the time-to-solution for sound localization from three days to three hours per ANN model.
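
The abstract mentions adapting hyperparameter optimization to use the available HPC resources but does not name the algorithm here. As a rough illustration only, the sketch below assumes plain random search and evaluates independent trials concurrently. The objective `toy_validation_loss`, the search ranges, and all function names are hypothetical stand-ins, not the authors' code; on a real system each trial would train one ANN on its own GPU or node instead of running in a thread.

```python
import random
from concurrent.futures import ThreadPoolExecutor

HIDDEN_UNIT_CHOICES = [16, 32, 64, 128, 256]  # hypothetical search space


def toy_validation_loss(config):
    """Stand-in for training one ANN and returning its validation loss."""
    lr = config["learning_rate"]
    width = config["hidden_units"]
    return (lr - 0.01) ** 2 + ((width - 64) / 1000) ** 2


def sample_config(rng):
    """Draw one configuration: log-uniform learning rate, categorical width."""
    return {
        "learning_rate": 10 ** rng.uniform(-4, -1),
        "hidden_units": rng.choice(HIDDEN_UNIT_CHOICES),
    }


def parallel_random_search(n_trials, n_workers, seed=0):
    """Evaluate n_trials independent configurations with n_workers in parallel."""
    rng = random.Random(seed)
    # Sample everything up front so the result does not depend on scheduling.
    configs = [sample_config(rng) for _ in range(n_trials)]
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        losses = list(pool.map(toy_validation_loss, configs))
    best = min(range(n_trials), key=losses.__getitem__)
    return configs[best], losses[best]


best_config, best_loss = parallel_random_search(n_trials=32, n_workers=4)
print(best_config, best_loss)
```

Because the trials are independent, wall-clock time in this scheme scales roughly with the number of trials divided by the number of workers, which is the kind of speed-up that parallel resources make available.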

Master's Thesis

Eric Michael Sumner

Háskóli Íslands


The choice of how to organize data in memory has a significant effect on the overall efficiency of computer programs. Changing requirements over time may require this choice to be revisited after a program’s initial development has been completed. Historically, data retrieval routines and program logic have been strongly coupled, which hinders the maintenance programmer’s ability to reorganize a program’s data without introducing new logic faults.

This paper proposes a new framework, Memquery, which separates these two concerns, allowing a program’s internal data to be reorganized without major changes to its logic routines. The theoretical basis for this framework is relational algebra, which has served a similar role in database systems for the past 50 years. A prototype of the Memquery framework is presented. A comparison study with traditional development techniques demonstrates that this prototype is capable of producing more maintainable programs with similar performance characteristics.
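
The separation of concerns described above can be illustrated with a small sketch. It is not the Memquery API, which this abstract does not show; `RowStore`, `ColumnStore`, `select`, `project`, and `names_of_adults` are hypothetical names chosen for illustration. The program logic is written only against two relational-algebra-style operators, so the in-memory layout can be swapped without touching it.

```python
from typing import Any, Callable, Dict, Iterable, Iterator, List, Tuple

Row = Dict[str, Any]


class RowStore:
    """Array-of-records layout: one dict per row."""

    def __init__(self, rows: List[Row]) -> None:
        self._rows = [dict(r) for r in rows]

    def scan(self) -> Iterator[Row]:
        for r in self._rows:
            yield dict(r)


class ColumnStore:
    """Record-of-arrays layout: one list per column."""

    def __init__(self, rows: List[Row]) -> None:
        names = list(rows[0]) if rows else []
        self._columns = {n: [r[n] for r in rows] for n in names}
        self._length = len(rows)

    def scan(self) -> Iterator[Row]:
        for i in range(self._length):
            yield {n: col[i] for n, col in self._columns.items()}


def select(store, predicate: Callable[[Row], bool]) -> Iterator[Row]:
    """Relational selection: keep the rows that satisfy the predicate."""
    return (row for row in store.scan() if predicate(row))


def project(rows: Iterable[Row], columns: List[str]) -> List[Tuple[Any, ...]]:
    """Relational projection: keep only the named columns, in order."""
    return [tuple(row[c] for c in columns) for row in rows]


def names_of_adults(store) -> List[Tuple[Any, ...]]:
    """Program logic: depends on select/project, not on the storage layout."""
    return project(select(store, lambda r: r["age"] >= 18), ["name"])


people = [
    {"name": "Ada", "age": 36},
    {"name": "Bo", "age": 12},
    {"name": "Cy", "age": 18},
]
print(names_of_adults(RowStore(people)))     # [('Ada',), ('Cy',)]
print(names_of_adults(ColumnStore(people)))  # [('Ada',), ('Cy',)]
```

Switching from `RowStore` to `ColumnStore` changes how the data is laid out in memory while `names_of_adults` stays untouched, which is the kind of reorganization without logic changes that the abstract describes.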