The problem of note onset detection in musical signals is considered. The proposed solution is based on known approaches in which an onset detection function is defined on the basis of spectral characteristics of audio data. In our approach, several onset detection functions are used simultaneously to form an input vector for a multi-layer non-linear perceptron, which learns to detect onsets in the training data. This is in contrast to standard methods based on thresholding the onset detection functions with a moving average or a moving median. Our approach also differs from most current machine-learning-based solutions in that we explicitly use the onset detection functions as an intermediate representation, which may therefore be easily replaced with a different one, e.g., to match the characteristics of a particular audio data source. The results obtained for a database containing annotated onsets for 17 different instruments and ensembles are compared with state-of-the-art solutions.
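The contrast drawn in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the choice of the two onset detection functions (spectral flux and high-frequency content), the frame sizes, the context width, the threshold offset `delta`, and the network dimensions are all assumptions made for illustration, and the perceptron weights are random because training is omitted. The sketch shows a moving-median baseline applied to a single ODF, and the stacking of several ODFs into per-frame input vectors for a small perceptron.

```python
import numpy as np

def stft_mag(x, n_fft=512, hop=256):
    """Magnitude spectrogram, shape (frames, bins). Frame sizes are assumed."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack([x[i * hop : i * hop + n_fft] * window
                       for i in range(n_frames)])
    return np.abs(np.fft.rfft(frames, axis=1))

def compute_odfs(x):
    """Two illustrative ODFs per frame, each scaled to a maximum of 1."""
    mag = stft_mag(x)
    # Spectral flux: summed positive magnitude change between frames.
    diff = np.diff(mag, axis=0, prepend=mag[:1])
    flux = np.maximum(diff, 0.0).sum(axis=1)
    # High-frequency content: energy weighted by bin index.
    hfc = (mag ** 2 * np.arange(mag.shape[1])).sum(axis=1)
    odfs = np.stack([flux, hfc], axis=1)
    return odfs / (odfs.max(axis=0) + 1e-12)

def median_threshold_peaks(odf, half_width=5, delta=0.1):
    """Baseline: local maxima exceeding a moving median plus an offset."""
    onsets = []
    for t in range(len(odf)):
        lo, hi = max(0, t - half_width), min(len(odf), t + half_width + 1)
        local = odf[lo:hi]
        if odf[t] == local.max() and odf[t] > np.median(local) + delta:
            onsets.append(t)
    return onsets

def stack_context(odfs, context=2):
    """Input vectors: all ODF values in a window of 2*context+1 frames."""
    padded = np.pad(odfs, ((context, context), (0, 0)), mode="edge")
    width = 2 * context + 1
    return np.stack([padded[t : t + width].ravel()
                     for t in range(len(odfs))])

def mlp_forward(features, w1, b1, w2, b2):
    """One hidden tanh layer and a sigmoid output: onset score per frame."""
    hidden = np.tanh(features @ w1 + b1)
    return 1.0 / (1.0 + np.exp(-(hidden @ w2 + b2)))

# Synthetic test signal: silence followed by a 440 Hz tone at sample 4096.
sr = 8000
n = np.arange(8192)
x = np.where(n >= 4096, np.sin(2 * np.pi * 440 * n / sr), 0.0)

odfs = compute_odfs(x)
onsets = median_threshold_peaks(odfs[:, 0])   # baseline on spectral flux only

features = stack_context(odfs)                # fused multi-ODF input vectors
rng = np.random.default_rng(0)
w1, b1 = rng.normal(size=(features.shape[1], 8)), np.zeros(8)
w2, b2 = rng.normal(size=8), 0.0
probs = mlp_forward(features, w1, b1, w2, b2)  # untrained: scores are arbitrary
```

In a trained system the weights would be fitted to annotated onsets, and the per-frame scores in `probs` would replace the hand-tuned median threshold. Because the network only sees the ODF values, swapping in a different set of ODFs changes `compute_odfs` alone.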
@article{bwmeta1.element.bwnjournal-article-amcv26i1p203bwm,
  author = {Bart\l omiej Stasiak and J\k edrzej Mo\'nko and Adam Niewiadomski},
  title = {Note onset detection in musical signals via neural-network-based multi-ODF fusion},
  journal = {International Journal of Applied Mathematics and Computer Science},
  volume = {26},
  year = {2016},
  pages = {203-213},
  language = {en},
  url = {http://dml.mathdoc.fr/item/bwmeta1.element.bwnjournal-article-amcv26i1p203bwm}
}
Bartłomiej Stasiak; Jędrzej Mońko; Adam Niewiadomski. Note onset detection in musical signals via neural-network-based multi-ODF fusion. International Journal of Applied Mathematics and Computer Science, Vol. 26 (2016), pp. 203-213. http://gdmltest.u-ga.fr/item/bwmeta1.element.bwnjournal-article-amcv26i1p203bwm/