Improving Generalization of Deep Convolutional Neural Networks for Acoustic Scene Classification

Paischer, Fabian

View/Open

Full text of the thesis (5.168Mb)

Supervisor's review (112.8Kb)

Opponent's review (112.8Kb)

Record of the thesis defense (181.7Kb)

Date

2018

Author

Paischer, Fabian

Abstract

In recent years, deep learning has become one of the most popular machine learning techniques for a wide variety of complex problems. One such problem is to mimic the human auditory system by classifying audio recordings according to the location in which they were recorded. This work focuses mainly on the Acoustic Scene Classification task proposed by the IEEE DCASE Challenge. The dataset for Acoustic Scene Classification consists of recordings from distinct recording locations, and the aim of the challenge is to classify an unseen test set of recordings. In the 2016 challenge, the training and test sets did not differ significantly. In the 2017 challenge, however, the test set originated from a different distribution, implying a strong need for generalization. In the course of this work, the initial Deep Convolutional Neural Network submitted to the DCASE 2016 challenge (written in Lasagne) was re-implemented in Keras. An extension of the Adam optimizer (AMSGrad) was investigated as a way to improve generalization. Other submissions to the DCASE 2017 challenge suggest that different types of spectrograms might be key to better generalization; therefore, experiments with different kinds of spectrograms were conducted. Furthermore, different interpolation algorithms were used for data augmentation, some of which yielded significant improvements in classification accuracy and generalization. For different spectrogram dimensions, slight adjustments to the network architecture also resulted in a performance gain. To better understand what different models "see" and what they focus on, their filters and activations were visualized and compared.
Finally, the adjustments that led to better generalization on the DCASE 2016 dataset were tested on the DCASE 2017 dataset, leading to an improvement over all submissions to the DCASE 2017 challenge from the Institute of Computational Perception.
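
For readers unfamiliar with AMSGrad, the optimizer variant named in the abstract, the following is a minimal sketch of its update rule and not the code used in the thesis. It follows the original formulation without bias correction (library implementations may differ in that detail), and it minimizes a hypothetical one-dimensional quadratic instead of training a network. The function name `amsgrad_minimize`, the test objective, and all hyperparameter values are assumptions made for illustration.

```python
import math


def amsgrad_minimize(grad_fn, x0, lr=0.1, beta1=0.9, beta2=0.999,
                     eps=1e-8, steps=1000):
    """Minimize a scalar function with AMSGrad, given its gradient."""
    x = float(x0)
    m = 0.0      # running average of gradients (first moment)
    v = 0.0      # running average of squared gradients (second moment)
    v_hat = 0.0  # largest second-moment estimate seen so far
    for _ in range(steps):
        g = grad_fn(x)
        m = beta1 * m + (1.0 - beta1) * g
        v = beta2 * v + (1.0 - beta2) * g * g
        # The difference from plain Adam: the denominator uses the running
        # maximum of v, so the effective step size can never grow again.
        v_hat = max(v_hat, v)
        x -= lr * m / (math.sqrt(v_hat) + eps)
    return x


# Illustrative objective (an assumption): f(x) = (x - 3)^2, gradient 2(x - 3).
x_min = amsgrad_minimize(lambda x: 2.0 * (x - 3.0), x0=0.0)
print(x_min)
```

In Keras, this variant is usually selected by passing `amsgrad=True` to the `Adam` optimizer instead of writing the update by hand.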

URI

https://dspace.jcu.cz/handle/123456789/38568

Collections

Faculty of Science