
Reconstruction of Musical Features from EEG Recordings during Music Listening

Tags
Neuroscience
Motivation
Investigating how we perceive music can broaden our understanding of auditory perception in general. Our study aims to reconstruct music from brain signals recorded while people listen to and perceive music. We choose electroencephalography (EEG) because it captures the temporal dynamics of perception, which are essential for music processing. This problem lies at the intersection of music, cognitive science, and machine learning, and we present models and approaches to it in light of all three disciplines.
Work in progress
Few studies have attempted to decode music from EEG signals, especially full-length naturalistic music. The difficulty of reconstruction stems from the diversity of the components that make up music and the complex interactions among them. We therefore plan to start by investigating relatively simple stimuli that isolate a single element of music, and to gradually expand to more complex and naturalistic stimuli.
During this process, it is crucial to select appropriate stimuli. A growing number of papers introduce public EEG datasets for music perception. Most studies focus on specific components of music such as tempo [1, 2] and rhythm [3, 4, 5], while some researchers [6] have also examined the influence of several conditions at once, such as genre, melody, and the number of instruments.
To our knowledge, the most recent and only paper on this topic is Ofner and Stober [7], in which mel-spectrograms of the music are reconstructed from EEG. Since our ultimate goal is to reconstruct the full audio signal, we first look at a sub-class of problems within the idea of reconstruction, namely feature extraction.
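As a concrete illustration of such a feature target, the mel-spectrogram of a stimulus can be computed with a standard audio library. The sketch below uses librosa; the file name and parameter values are placeholders, not the settings used in our experiments.

```python
import numpy as np
import librosa

# Load the music stimulus (file name is a placeholder).
y, sr = librosa.load("stimulus.wav", sr=22050)

# 128-band mel-spectrogram; n_fft and hop_length are illustrative values.
mel = librosa.feature.melspectrogram(y=y, sr=sr, n_fft=2048, hop_length=512, n_mels=128)

# Log compression, as commonly used for reconstruction targets.
log_mel = librosa.power_to_db(mel, ref=np.max)
print(log_mel.shape)  # (n_mels, n_frames)
```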
In this study, we want to show that it is possible to reconstruct musical features, and ultimately natural music, using only the EEG signal. To do so, we first model the EEG signal with music as the input (a forward transform). We will then attempt the backward transform, in which musical features or natural music are generated from the EEG signal.
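To make the forward/backward framing concrete, a minimal baseline is a pair of linear (ridge) regressions between time-aligned stimulus features and EEG frames. The sketch below uses random placeholder arrays and assumed shapes; it only illustrates the two directions of mapping, not the model we ultimately propose.

```python
import numpy as np
from sklearn.linear_model import Ridge

# Placeholder, time-aligned data: stimulus features (e.g., log-mel frames) and EEG frames.
n_frames, n_mels, n_channels = 1000, 128, 64
stim = np.random.randn(n_frames, n_mels)
eeg = np.random.randn(n_frames, n_channels)

# Forward transform: predict the EEG response from the music features.
forward = Ridge(alpha=1.0).fit(stim, eeg)
eeg_pred = forward.predict(stim)

# Backward transform: reconstruct the music features from the EEG.
backward = Ridge(alpha=1.0).fit(eeg, stim)
stim_pred = backward.predict(eeg)
```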
Just as the human ear processes sound efficiently through a log-scale transform within the cochlea, the basis of mel-frequency encoding, we propose to train filters that transform the EEG signal into features that are more robust for the downstream tasks the model is meant to perform.
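One way to realize such trainable filters is a bank of 1-D convolutions applied to the raw EEG, followed by log compression in loose analogy to cochlear mel-scale encoding. The PyTorch layer below is a minimal sketch; the channel counts and kernel size are assumptions, not our final design.

```python
import torch
import torch.nn as nn

class LearnableFilterBank(nn.Module):
    """Trainable filter bank over raw EEG (sketch; sizes are assumptions)."""
    def __init__(self, n_channels=64, n_filters=40, kernel_size=65):
        super().__init__()
        # Each filter sees all EEG channels over a short time window.
        self.filters = nn.Conv1d(n_channels, n_filters, kernel_size, padding=kernel_size // 2)

    def forward(self, x):            # x: (batch, n_channels, n_samples)
        h = self.filters(x)
        return torch.log1p(h.abs())  # log compression, analogous to mel-scale encoding

eeg = torch.randn(8, 64, 1024)       # placeholder batch of EEG segments
features = LearnableFilterBank()(eeg)
print(features.shape)                # torch.Size([8, 40, 1024])
```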
Using EEGNet [8] as a backbone, we propose a deep-learning architecture that accounts for the multiple dimensions of EEG data, including time, space (electrodes), and frequency.
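A simplified sketch of such an EEGNet-style backbone is given below: a temporal convolution captures frequency content, a depthwise convolution across electrodes captures spatial structure, and average pooling summarizes over time. It is only a rough paraphrase of the published EEGNet [8], not its exact implementation, and all hyperparameters are placeholders.

```python
import torch
import torch.nn as nn

class EEGNetStyleEncoder(nn.Module):
    """Rough EEGNet-style encoder (sketch; hyperparameters are assumptions)."""
    def __init__(self, n_channels=64, F1=8, D=2, kern_len=64):
        super().__init__()
        self.temporal = nn.Conv2d(1, F1, (1, kern_len), padding=(0, kern_len // 2), bias=False)
        self.bn1 = nn.BatchNorm2d(F1)
        # Depthwise convolution across electrodes (the spatial dimension).
        self.spatial = nn.Conv2d(F1, F1 * D, (n_channels, 1), groups=F1, bias=False)
        self.bn2 = nn.BatchNorm2d(F1 * D)
        self.act = nn.ELU()
        self.pool = nn.AvgPool2d((1, 4))
        self.drop = nn.Dropout(0.25)

    def forward(self, x):              # x: (batch, 1, n_channels, n_samples)
        x = self.bn1(self.temporal(x))
        x = self.drop(self.pool(self.act(self.bn2(self.spatial(x)))))
        return x                        # (batch, F1 * D, 1, ~n_samples // 4)

eeg = torch.randn(8, 1, 64, 1024)       # placeholder EEG batch
print(EEGNetStyleEncoder()(eeg).shape)
```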
References
[1] Losorelli, S., Nguyen, D. T., Dmochowski, J. P., & Kaneshiro, B. (2017). NMED-T: A tempo-focused dataset of cortical and behavioral responses to naturalistic music. ISMIR 2017. https://ccrma.stanford.edu/~blairbo/assets/pdf/losorelli2017ISMIR.pdf
[2] Stober, S., Sternin, A., Owen, A. M., & Grahn, J. A. (2015). Towards music imagery information retrieval: Introducing the OpenMIIR dataset of EEG recordings from music perception and imagination. ISMIR 2015.
[3] Stober, S., Cameron, D. J., & Grahn, J. A. (2014). Using convolutional neural networks to recognize rhythm stimuli from electroencephalography recordings. Advances in Neural Information Processing Systems 27, 1449–1457.
[4] Stober, S., Prätzlich, T., & Müller, M. (2016). Brain beats: Tempo extraction from EEG data. Proceedings of the 17th International Society for Music Information Retrieval Conference (ISMIR), New York, NY.
[5] Appaji, J., & Kaneshiro, B. (2018). Neural tracking of simple and complex rhythms: Pilot study and dataset. ISMIR 2018 Late-Breaking/Demo. https://ccrma.stanford.edu/~blairbo/assets/pdf/appaji2018ISMIR_LBD.pdf
[6] Cantisani, G., Trégoat, G., Essid, S., & Richard, G. (2019). MAD-EEG: An EEG dataset for decoding auditory attention to a target instrument in polyphonic music. Speech, Music and Mind (SMM), Satellite Workshop of Interspeech 2019.
[7] Ofner, A., & Stober, S. (2018). Shared generative representation of auditory concepts and EEG to reconstruct perceived and imagined music. Proceedings of the 19th International Society for Music Information Retrieval Conference (ISMIR), 392–399.
[8] Lawhern, V. J., Solon, A. J., Waytowich, N. R., Gordon, S. M., Hung, C. P., & Lance, B. J. (2018). EEGNet: A compact convolutional neural network for EEG-based brain–computer interfaces. Journal of Neural Engineering, 15(5), 056013.

If you have any questions, please contact the first author.

myeonghoon.ryu@snu.ac.kr (Myeonghoon Ryu)
Authors
Myeonghoon Ryu and Kyogu Lee