Neural Audio Codec-Based Audio Fingerprinting

Affiliation
MAC
Presenter
김종수
Subject
Audio Fingerprinting
Site
B9
Time
Poster Session II - 13:30~15:00

Abstract

Although audio fingerprinting systems have continued to develop, they still struggle to identify songs accurately when the query audio is distorted. Distortions take various forms depending on how the system is used, so a fingerprinting system must be robust enough to suit all applications. To address this limitation, recent studies have proposed deep learning models, such as separable CNNs and attention-based models, that take Mel spectrograms as input. Among these approaches, contrastive learning is particularly well suited to audio fingerprinting. However, these studies use audio signals with low sampling rates, which excludes certain frequency components from the analysis. We therefore propose using the quantized features of a neural audio codec, instead of Mel spectrograms, as the model input in order to minimize information loss in the input audio.
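
As a rough illustration of the contrastive learning idea mentioned in the abstract, the sketch below computes an InfoNCE-style loss over toy fingerprint embeddings in pure Python. It is not the presenter's implementation: the function name `contrastive_loss`, the `temperature` value, and the two-dimensional example vectors are all assumptions made for illustration. In a real system the embeddings would come from a model fed with codec features or Mel spectrograms.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def contrastive_loss(clean, distorted, temperature=0.1):
    """InfoNCE-style loss (hypothetical helper, one direction only).

    clean[i] and distorted[i] are embeddings of the same audio segment;
    every other row in `distorted` acts as a negative for clean[i].
    """
    clean = [l2_normalize(v) for v in clean]
    distorted = [l2_normalize(v) for v in distorted]
    total = 0.0
    for i, anchor in enumerate(clean):
        # Similarity of this anchor to every distorted embedding in the batch.
        logits = [dot(anchor, d) / temperature for d in distorted]
        # Numerically stable log-sum-exp for the softmax denominator.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(z - m) for z in logits))
        # Cross-entropy with the matching pair as the correct class.
        total += log_denom - logits[i]
    return total / len(clean)


if __name__ == "__main__":
    clean = [[1.0, 0.0], [0.0, 1.0]]
    matched = [[0.9, 0.1], [0.1, 0.9]]      # distorted copies stay close
    mismatched = [matched[1], matched[0]]   # pairs deliberately swapped
    print(contrastive_loss(clean, matched))
    print(contrastive_loss(clean, mismatched))
```

The loss is small when each distorted segment stays closest to its own clean counterpart and grows when the pairs are mismatched, which is the property that makes such an objective attractive for distortion-robust fingerprints.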