
SCIENTIA SINICA Informationis, Volume 50, Issue 5: 692--703 (2020) https://doi.org/10.1360/N112019-00034

Obstacle visual sensing based on deep learning for low-altitude small unmanned aerial vehicles

  • Received Feb 15, 2019
  • Accepted May 5, 2019
  • Published Apr 16, 2020

Abstract


Funded by

National Natural Science Foundation of China (61175084, 61673042)


References

[1] Fasano G, Accardo D, Moccia A. Multi-sensor-based fully autonomous non-cooperative collision avoidance system for unmanned air vehicles. J Aerosp Comput Inf Commun, 2008, 5: 338--360

[2] Fränken D, Hupper A. Unified tracking and fusion for airborne collision avoidance using log-polar coordinates. In: Proceedings of the 15th IEEE International Conference on Information Fusion, Singapore, 2012. 1246--1253

[3] Mueller M W, D'Andrea R. Critical subsystem failure mitigation in an indoor UAV testbed. In: Proceedings of the 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems, Vilamoura, 2012. 780--785

[4] Ross S, Melik-Barkhudarov N, Shankar K S, et al. Learning monocular reactive UAV control in cluttered natural environments. In: Proceedings of the 2013 IEEE International Conference on Robotics and Automation, Karlsruhe, 2013. 1765--1772

[5] Shuai C, Wang H, Zhang W, et al. Binocular vision perception and obstacle avoidance of visual simulation system for power lines inspection with UAV. In: Proceedings of the 36th Chinese Control Conference, Dalian, 2017. 10480--10485

[6] Wang H L, Wu J F, Yao P. UAV three-dimensional path planning based on interfered fluid dynamical system: methodology and application. Unmanned Syst Technol, 2018, 1: 72--82

[7] Girshick R, Donahue J, Darrell T, et al. Rich feature hierarchies for accurate object detection and semantic segmentation. In: Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition, Columbus, 2014. 580--587

[8] Ren S, He K, Girshick R, et al. Faster R-CNN: towards real-time object detection with region proposal networks. IEEE Trans Pattern Anal Mach Intell, 2017, 39: 1137--1149

[9] Redmon J, Divvala S, Girshick R, et al. You only look once: unified, real-time object detection. In: Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, 2016. 779--788

[10] Liu W, Anguelov D, Erhan D, et al. SSD: single shot multibox detector. In: Proceedings of the 2016 European Conference on Computer Vision, Springer, 2016. 21--37

[11] Redmon J, Farhadi A. YOLO9000: better, faster, stronger. In: Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, 2017. 7263--7271

[12] Henriques J F, Caseiro R, Martins P, et al. High-speed tracking with kernelized correlation filters. IEEE Trans Pattern Anal Mach Intell, 2015, 37: 583--596

[13] Tao C B, Qiao L, Sun Y F, et al. Stereo matching algorithm of six-rotor UAV based on binocular vision. Laser and Infrared, 2018, 48: 1181--1187

[14] Gao H W. Computer Binocular Stereo Vision. Beijing: Electronic Industry Press, 2012. 132--135

[15] Wang Z R, Guo X K, Zhao G. Depth image information fusion and three-dimensional reconstruction method of double binocular stereo vision. Laser and Infrared, 2019, 49: 246--250

[16] Jacob B, Kligys S, Chen B, et al. Quantization and training of neural networks for efficient integer-arithmetic-only inference. In: Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, 2018. 2704--2713

[17] Zhang X, Zhou X, Lin M, et al. ShuffleNet: an extremely efficient convolutional neural network for mobile devices. In: Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, 2018. 6848--6856

  • Figure 1

    Block diagram of the vision-based real-time obstacle perception method

  • Figure 2

    (Color online) Samples in the data set

  • Figure 3

    (Color online) Target detection and recognition results using YOLOv2

  • Figure 4

    (Color online) 3D reconstruction based on binocular stereo vision

  • Figure 5

    Obstacle extraction using information fusion method

  • Figure 6

    Perception of the physical prototype. (a) Real-time detection result using YOLOv2 and KCF tracking algorithm; (b) point cloud for 3D environment reconstruction using binocular stereo vision

  • Figure 7

    (Color online) Detection and tracking results for the physical object using the proposed algorithm. (a) Image at the 1st frame; (b) image at the 60th frame; (c) image at the 120th frame
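As Figures 4 and 6 suggest, the obstacle positions in the tables below come from binocular stereo triangulation. The following is a minimal sketch of rectified-stereo back-projection; the function name and all camera parameters (focal length, baseline, principal point) are illustrative assumptions, not values taken from the paper.

```python
def stereo_to_3d(u, v, disparity, f, baseline, cx, cy):
    """Back-project pixel (u, v) with a given disparity into camera coordinates.

    Assumes a rectified stereo pair: f is the focal length in pixels,
    baseline is the inter-camera distance in meters, and (cx, cy) is the
    principal point. Returns (X, Y, Z) in meters.
    """
    Z = f * baseline / disparity   # depth: larger disparity means a closer obstacle
    X = (u - cx) * Z / f           # lateral offset from the optical axis
    Y = (v - cy) * Z / f           # vertical offset from the optical axis
    return X, Y, Z

# Illustrative numbers: a 700 px focal length, 0.12 m baseline, and 56 px
# disparity place a pixel at the image center 1.5 m in front of the camera.
X, Y, Z = stereo_to_3d(320, 240, 56.0, 700.0, 0.12, 320, 240)
```

In practice the disparity map would come from a stereo matcher (e.g. the one in Ref. [13]), and the intrinsic parameters from calibration.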

  • Table 1   Perception results of the physical prototype in indoor environments

    |         |                | Center $X$ (m) | Center $Y$ (m) | Center $Z$ (m) | Width (m) | Height (m) |
    |---------|----------------|----------------|----------------|----------------|-----------|------------|
    | Group 1 | Real value     | 0.000          | 0.000          | 1.500          | 0.370     | 0.370      |
    |         | Measured value | $-$0.012       | 0.021          | 1.493          | 0.374     | 0.374      |
    |         | Error          | $-$0.012       | 0.021          | $-$0.007       | 0.004     | 0.004      |
    | Group 2 | Real value     | 0.000          | 0.400          | 1.980          | 0.370     | 0.370      |
    |         | Measured value | 0.009          | 0.443          | 1.981          | 0.374     | 0.374      |
    |         | Error          | 0.009          | 0.043          | 0.001          | 0.004     | 0.004      |
    | Group 3 | Real value     | $-$0.400       | 0.400          | 2.500          | 0.370     | 0.370      |
    |         | Measured value | $-$0.378       | 0.433          | 2.550          | 0.374     | 0.382      |
    |         | Error          | 0.022          | 0.033          | 0.050          | 0.004     | 0.012      |
    | Group 4 | Real value     | $-$0.400       | 0.400          | 3.000          | 0.370     | 0.370      |
    |         | Measured value | $-$0.447       | 0.447          | 2.987          | 0.402     | 0.382      |
    |         | Error          | $-$0.047       | 0.047          | 0.013          | 0.032     | 0.012      |
    | Group 5 | Real value     | 0.000          | 0.400          | 4.000          | 0.370     | 0.370      |
    |         | Measured value | 0.037          | 0.442          | 3.951          | 0.397     | 0.420      |
    |         | Error          | 0.037          | 0.042          | 0.049          | 0.027     | 0.050      |
  • Table 2   Statistical results of position error mean values and standard deviations$^{\rm~a)}$

    |         | $X$-EM (cm) | $X$-SD (cm) | $Y$-EM (cm) | $Y$-SD (cm) | $Z$-EM (cm) | $Z$-SD (cm) |
    |---------|-------------|-------------|-------------|-------------|-------------|-------------|
    | Group 1 | 1.4         | 0.5         | 1.5         | 0.4         | 1.1         | 0.3         |
    | Group 2 | 1.4         | 0.4         | 1.3         | 0.4         | 1.1         | 0.4         |
    | Group 3 | 1.8         | 0.6         | 2.2         | 0.4         | 2.2         | 0.1         |
    | Group 4 | 2.3         | 0.6         | 3.2         | 1.0         | 3.8         | 0.5         |
    | Group 5 | 2.6         | 0.6         | 2.8         | 0.6         | 5.4         | 0.9         |

    a) $X/Y/Z$-EM denotes the $X/Y/Z$-axis position error mean value; $X/Y/Z$-SD denotes the $X/Y/Z$-axis standard deviation.
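The per-axis error means and standard deviations reported in Table 2 can be reproduced from repeated measurements in the straightforward way. A minimal sketch follows; the sample values are invented, and since the paper does not state whether the population or sample standard deviation was used, the population form is assumed here.

```python
import statistics

def error_stats(measured, true_value):
    """Return (mean, standard deviation) of absolute position errors along
    one axis, in the same unit as the inputs. Population SD is assumed;
    substitute statistics.stdev for the sample form."""
    errors = [abs(m - true_value) for m in measured]
    return statistics.mean(errors), statistics.pstdev(errors)

# Invented example: three depth measurements of an obstacle placed at Z = 1.500 m.
mean_err, sd_err = error_stats([1.493, 1.512, 1.505], 1.500)
```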