
SCIENTIA SINICA Informationis, Volume 50, Issue 7: 1110-1120 (2020) https://doi.org/10.1360/SSI-2020-0046

Mask-wearing recognition in the wild

  • Received Mar 6, 2020
  • Accepted Apr 16, 2020
  • Published Jun 23, 2020

Abstract


References

[1] Zou Z X, Shi Z W, Guo Y H, et al. Object detection in 20 years: a survey. arXiv preprint, 2019.

[2] Ren S, He K, Girshick R. Faster R-CNN: towards real-time object detection with region proposal networks. IEEE Trans Pattern Anal Mach Intell, 2017, 39: 1137-1149.

[3] Dai J F, Li Y, He K M, et al. R-FCN: object detection via region-based fully convolutional networks. In: Proceedings of Conference on Advances in Neural Information Processing Systems, Barcelona, 2016. 379-387.

[4] Lin T, Dollar P, Girshick R B, et al. Feature pyramid networks for object detection. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, 2017. 936-944.

[5] Redmon J, Farhadi A. YOLOv3: an incremental improvement. arXiv preprint, 2018.

[6] Liu W, Anguelov D, Erhan D, et al. SSD: single shot multibox detector. In: Proceedings of European Conference on Computer Vision, Amsterdam, 2016. 21-37.

[7] Lin T Y, Goyal P, Girshick R. Focal loss for dense object detection. IEEE Trans Pattern Anal Mach Intell, 2020, 42: 318-327.

[8] Zhang K, Zhang Z, Li Z. Joint face detection and alignment using multitask cascaded convolutional networks. IEEE Signal Process Lett, 2016, 23: 1499-1503.

[9] Wang H, Li Z F, Ji X, et al. Face R-CNN. arXiv preprint, 2017.

[10] Najibi M, Samangouei P, Chellappa R, et al. SSH: single stage headless face detector. In: Proceedings of IEEE International Conference on Computer Vision (ICCV), Venice, 2017. 4885-4894.

[11] Wang J F, Yuan Y, Yu G. Face attention network: an effective face detector for occluded faces. arXiv preprint, 2017.

[12] Tang X, Du D K, He Z, et al. PyramidBox: a context-assisted single shot face detector. In: Proceedings of European Conference on Computer Vision (ECCV), Munich, 2018. 797-813.

[13] Pang Y W, Xie J, Khan M H, et al. Mask-guided attention network for occluded pedestrian detection. In: Proceedings of IEEE/CVF International Conference on Computer Vision (ICCV), Seoul, 2019. 4966-4974.

[14] Xie J, Pang Y W, Cholakkal H, et al. PSC-Net: learning part spatial co-occurrence for occluded pedestrian detection. arXiv preprint, 2020.

[15] Krizhevsky A, Sutskever I, Hinton G. ImageNet classification with deep convolutional neural networks. In: Proceedings of International Conference on Neural Information Processing Systems, Lake Tahoe, 2012. 1097-1105.

[16] Simonyan K, Zisserman A. Very deep convolutional networks for large-scale image recognition. In: Proceedings of International Conference on Learning Representations, San Diego, 2015.

[17] Szegedy C, Liu W, Jia Y Q, et al. Going deeper with convolutions. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, 2015.

[18] He K M, Zhang X Y, Ren S Q, et al. Deep residual learning for image recognition. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, 2016. 770-778.

[19] Tian W X, Wang Z X, Shen H F, et al. Learning better features for face detection with feature fusion and segmentation supervision. arXiv preprint, 2018.

[20] Qian Y, Ding X, Liu T. Identification method of user's travel consumption intention in chatting robot. Sci Sin-Inf, 2017, 47: 997-1007.

  • Figure 1

    (Color online) Block diagram of face mask recognition

  • Figure 2

    (Color online) Architecture of DFS face detection algorithm

  • Figure 3

    (Color online) Structure diagram of mask recognition model based on ResNet50

  • Table 1   Comparison of DFS and other algorithms on the WIDER FACE validation set
    Algorithm      Easy    Medium    Hard
    MTCNN          84.8    82.5      59.8
    Face R-CNN     93.7    92.1      83.1
    SSH            93.1    92.1      84.5
    FAN            95.3    94.2      88.8
    PyramidBox     96.1    95.0      88.9
    DFS            96.9    95.9      91.2
  • Table 2   Experimental data sources and quantity distributions (all counts in thousands)
    Data source         Mask    NoMask    Total
    Collected online    5       30        35
    In-car              40      65        105
    Phone               40      10        50
    Recorded            2       3         5
    Total               87      108       195
  • Table 3   Face mask recognition results on mobile phone images
    Test set / Metric                 NoMask      Mask
    Designated drive (100 Mask + 105 NoMask)
      Precision (%)                   100.00      98.99
      Recall (%)                      99.06       100.00
      Accuracy (%), both classes      99.51
    Designated drive (3247 Mask + 1111 NoMask)
      Precision (%)                   93.33       98.63
      Recall (%)                      96.03       97.66
      Accuracy (%), both classes      97.24
    Car hailing driver (1455 Mask + 25 NoMask)
      Precision (%)                   96.15       100.00
      Recall (%)                      100.00      99.73
      Accuracy (%), both classes      99.73
  • Table 4   Face mask recognition results on vehicle monitoring images
    Test set / Metric                 NoMask      Mask
    Test 2k (1k Mask + 1k NoMask)
      Precision (%)                   99.10       100.00
      Recall (%)                      100.00      99.11
      Accuracy (%), both classes      99.55
    Test 1.5k (few Mask)
      Precision (%)                   99.71       89.83
      Recall (%)                      98.86       97.25
      Accuracy (%), both classes      98.71
    Night bad case (263 Mask)
      Precision (%)                   NaN         100.00
      Recall (%)                      NaN         98.46
      Accuracy (%), both classes      98.46
    Day 1.1k (117 Mask + 1037 NoMask)
      Precision (%)                   96.72       70.34
      Recall (%)                      96.62       70.94
      Accuracy (%), both classes      94.02
    Night 3.6k (419 Mask + 3246 NoMask)
      Precision (%)                   96.27       76.15
      Recall (%)                      97.13       70.88
      Accuracy (%), both classes      94.13
  • Table 5   Comparative experiment results on mobile phone images
    Test set / Metric                 Data collected online   In-car data             Attention mechanism
                                      NoMask      Mask        NoMask      Mask        NoMask      Mask
    Designated drive (100 Mask + 105 NoMask)
      Precision (%)                   91.70       100.00      100.00      98.99       100.00      98.99
      Recall (%)                      73.30       65.00       99.06       100.00      99.06       100.00
      Accuracy (%), both classes      96.60                   99.51                   99.51
    Designated drive (3247 Mask + 1111 NoMask)
      Precision (%)                   77.49       99.46       91.26       99.02       93.33       98.63
      Recall (%)                      98.56       90.21       97.20       96.83       96.03       97.66
      Accuracy (%), both classes      92.34                   96.92                   97.24
  • Table 6   Comparative experiment results on vehicle monitoring images
    Test set / Metric                 Data collected online   In-car data             Attention mechanism
                                      NoMask      Mask        NoMask      Mask        NoMask      Mask
    Test 2k (1k Mask + 1k NoMask)
      Precision (%)                   98.31       100.00      99.10       100.00      99.10       100.00
      Recall (%)                      100.00      98.31       100.00      99.11       100.00      99.11
      Accuracy (%), both classes      99.15                   99.55                   99.55
    Test 1.5k (few Mask)
      Precision (%)                   99.14       84.03       99.62       88.98       99.71       89.83
      Recall (%)                      98.20       91.74       98.77       96.33       98.86       97.25
      Accuracy (%), both classes      97.59                   98.54                   98.71
    Night bad case (263 Mask)
      Precision (%)                   NaN         100.00      NaN         100.00      NaN         100.00
      Recall (%)                      NaN         61.24       NaN         98.06       NaN         98.46
      Accuracy (%), both classes      61.24                   98.06                   98.46
  • Table 7   Comparative experiment results on difficult vehicle monitoring samples
    Test set / Metric                 In-car data             Attention mechanism
                                      NoMask      Mask        NoMask      Mask
    Day 1.1k (117 Mask + 1037 NoMask)
      Precision (%)                   96.76       62.69       96.72       70.34
      Recall (%)                      95.18       71.79       96.62       70.94
      Accuracy (%), both classes      92.18                   94.02
    Night 3.6k (419 Mask + 3246 NoMask)
      Precision (%)                   96.63       63.93       96.27       76.15
      Recall (%)                      94.58       74.46       97.13       70.88
      Accuracy (%), both classes      92.28                   94.13
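
Tables 3 to 7 report precision and recall separately for the NoMask and Mask classes, plus a single accuracy figure per test set. As a reading aid, the sketch below computes these quantities from a two-class confusion matrix using the standard definitions. This page does not state how the authors computed their figures, so the definitions are an assumption here. The Counts container and the pct and metrics helpers are hypothetical names, and the example counts are illustrative only, not taken from the paper.

```python
import math
from dataclasses import dataclass


@dataclass
class Counts:
    """Confusion counts for a Mask / NoMask test set (hypothetical container)."""
    mask_as_mask: int      # Mask faces predicted as Mask
    mask_as_nomask: int    # Mask faces predicted as NoMask
    nomask_as_nomask: int  # NoMask faces predicted as NoMask
    nomask_as_mask: int    # NoMask faces predicted as Mask


def pct(numerator: int, denominator: int) -> float:
    """Return a percentage, or NaN when the denominator is zero."""
    return math.nan if denominator == 0 else 100.0 * numerator / denominator


def metrics(c: Counts) -> dict:
    """Per-class precision and recall plus overall accuracy, all in percent."""
    total = (c.mask_as_mask + c.mask_as_nomask
             + c.nomask_as_nomask + c.nomask_as_mask)
    return {
        "Mask": {
            # Of everything predicted Mask, how much really was Mask.
            "precision": pct(c.mask_as_mask, c.mask_as_mask + c.nomask_as_mask),
            # Of all true Mask faces, how many were found.
            "recall": pct(c.mask_as_mask, c.mask_as_mask + c.mask_as_nomask),
        },
        "NoMask": {
            "precision": pct(c.nomask_as_nomask,
                             c.nomask_as_nomask + c.mask_as_nomask),
            "recall": pct(c.nomask_as_nomask,
                          c.nomask_as_nomask + c.nomask_as_mask),
        },
        # Accuracy is shared by both classes: correct predictions over all samples.
        "accuracy": pct(c.mask_as_mask + c.nomask_as_nomask, total),
    }


# Illustrative counts only; they are not taken from the paper.
example = metrics(Counts(mask_as_mask=90, mask_as_nomask=10,
                         nomask_as_nomask=95, nomask_as_mask=5))
print(example["accuracy"])  # 92.5
```

In this sketch a class with no true samples has an undefined recall, returned as NaN. That is one plausible reading of the NaN entries for the Mask-only night test set, though the source does not say how those entries were produced.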