Detection of Militia Object in Libya by Using YOLO Transfer Learning

Yosi Kristian, Hatem Alsadeg Ali Salim, Endang Setyati


Humans can recognize, classify, and name the objects they see, and respond to them, quickly and accurately. An automated system is expected to help provide the same kind of response in any task and at any time, for example when driving, walking in a crowd, or patrolling dangerous terrain as a member of the military. This remains a problem for systems used on the battlefield. In the proposed system, the object detection model must be able to distinguish armed humans (militia) from unarmed humans. To address this problem, the authors apply transfer learning with the YOLO algorithm, which is currently in its third version. YOLOv3 is reported to be both very fast and accurate: measured by mean Average Precision (mAP) at 0.5 IOU, it is on par with Focal Loss while running about 4x faster. Moreover, YOLOv3 offers a trade-off between speed and accuracy simply by changing the size of the model, without the need for retraining.
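To make the "mAP at 0.5 IOU" criterion concrete, the following is a minimal sketch of how a single detection could be scored against a ground-truth box. It is an illustration only, not the authors' evaluation code: the function names, the `(x1, y1, x2, y2)` box format, and the class labels `"militia"` and `"unarmed"` are assumptions made for this example.

```python
def iou(box_a, box_b):
    """Intersection over Union of two boxes given as (x1, y1, x2, y2)."""
    # Corners of the overlapping rectangle (if the boxes overlap at all).
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])

    # Clamp at zero so disjoint boxes yield no intersection area.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_true_positive(pred_box, pred_label, gt_box, gt_label, threshold=0.5):
    """A detection counts as correct when the class matches and IoU >= threshold."""
    return pred_label == gt_label and iou(pred_box, gt_box) >= threshold


# Hypothetical labels and boxes, chosen only for illustration.
ground_truth = ((0, 0, 10, 10), "militia")
prediction = ((0, 2, 10, 12), "militia")
print(is_true_positive(prediction[0], prediction[1],
                       ground_truth[0], ground_truth[1]))
```

Under this criterion, a box that overlaps the ground truth well but is labelled `"unarmed"` instead of `"militia"` is still counted as a miss, which is the distinction the proposed system has to get right.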




Keywords: YOLO; Darknet; TensorFlow; Militia












Jurnal Teknologi dan Manajemen Informatika 

Fakultas Teknologi Informasi
University of Merdeka Malang


Jl. Terusan Raya Dieng No. 62-64, Malang, Indonesia, 65146
(0341) 566462

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.