Rule-Based Pitch Inference in Optical Music Recognition on Polyphonic Scores using YOLOv12

Authors

DOI:

https://doi.org/10.26905/jtmi.v11i2.16291

Keywords:

Optical Music Recognition, YOLOv12, Convolutional Neural Network, Polyphonic Scores, Rule-based Pitch Inference

Abstract

Optical Music Recognition (OMR) is difficult to apply to polyphonic music scores because of their high symbol density and overlapping notes. This study proposes a hybrid method that combines YOLOv12 notehead detection with rule-based pitch inference, which converts the spatial position of each detected notehead into pitch information. The dataset is DeepScoresV2-Dense, processed through annotation conversion, image normalization, and staff extraction; the extracted staff serves as the reference for inferring each note's pitch. The YOLOv12 model was trained for 30 epochs with transfer learning and reached an mAP50 of 0.75, a precision of 0.85, and a recall of 0.58 on the validation data. Rule-based pitch inference achieved a pitch accuracy of 0.87 and an F1 score of 0.87, indicating a balance between the accuracy and the completeness of the predictions. These results suggest that integrating YOLOv12 with rule-based pitch inference can be an effective approach to pitch extraction from polyphonic music scores, with potential applications in music information retrieval, digital score conversion, and AI-based music learning systems.
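
As an illustration of how a notehead's spatial position can be converted into a pitch, the following is a minimal sketch and not the authors' implementation. It assumes a treble clef, no key signature or accidentals, a bounding box given as (x_min, y_min, x_max, y_max) in pixels, and five known staff-line y-coordinates; the function name `infer_pitch` and its parameters are hypothetical.

```python
# Diatonic letter names, starting from C.
LETTERS = ["C", "D", "E", "F", "G", "A", "B"]


def infer_pitch(notehead_box, staff_line_ys, bottom_line=("E", 4)):
    """Map a notehead bounding box to a pitch name such as 'G4'.

    notehead_box: (x_min, y_min, x_max, y_max) in pixels (assumed format).
    staff_line_ys: y-coordinates of the five staff lines, in any order.
    bottom_line: pitch of the lowest staff line; ("E", 4) assumes treble clef.
    """
    lines = sorted(staff_line_ys)  # image y grows downward, so lines[-1] is the bottom line
    # Four gaps between five lines, and two diatonic steps per gap.
    half_space = (lines[-1] - lines[0]) / 8
    y_center = (notehead_box[1] + notehead_box[3]) / 2
    # Number of diatonic steps above (positive) or below (negative) the bottom line.
    steps = round((lines[-1] - y_center) / half_space)
    letter, octave = bottom_line
    index = octave * 7 + LETTERS.index(letter) + steps
    return f"{LETTERS[index % 7]}{index // 7}"


staff = [100, 110, 120, 130, 140]
print(infer_pitch((0, 125, 10, 135), staff))  # notehead centered on the second line from the bottom
```

Snapping the notehead center to the nearest half staff space keeps the rule tolerant of small localization errors in the detected box; clefs other than treble would only change the `bottom_line` argument under these assumptions.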

Published

17-12-2025