This is published under the terms of CC BY-NC.
This paper proposes a new personal protective equipment (PPE) detection method based on an attention-based YOLOv7 and human pose estimation. The proposed attention module consists of the concatenation of the Convolutional Block Attention Module (CBAM) and the Squeeze-and-Excitation (SE) block, and is placed immediately before the detection layer of the YOLOv7 architecture. CBAM derives spatial and channel attention from the features extracted by the YOLOv7 backbone. The resulting attention weights prioritize the features relevant to PPE detection in both the spatial and channel domains. The SE block refines the attention weights obtained from CBAM before the weighted features are fed to the detection layer. Human pose estimation based on YOLO-pose is employed to remove some false positives of PPE detection. The method detects human body parts and assigns key points to them. Reference points are then computed from these key points, and detection targets that lie far from the reference points are regarded as false positives and removed. Experimental results show that, compared with the original YOLOv7 model, the proposed PPE detection increases mAP by up to 8.5% at an IoU threshold of 0.5 and by up to 8.8% over IoU thresholds from 0.5 to 0.95, and reduces false positive detections by 22% in deployment.
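The pose-based false positive removal described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify which key points define each reference point or how "far" is measured, so the class-to-key-point mapping `REFERENCE_KEYPOINTS`, the COCO-style key point names, the `Detection` type, and the distance limit `max_dist_ratio` (scaled by the box size) are all assumptions made for illustration.

```python
from dataclasses import dataclass
from math import hypot
from typing import Dict, List, Optional, Sequence, Tuple

Point = Tuple[float, float]


@dataclass
class Detection:
    label: str                               # PPE class name, e.g. "helmet"
    box: Tuple[float, float, float, float]   # (x1, y1, x2, y2) in pixels
    score: float


# ASSUMPTION: which key points anchor each PPE class is not given in the
# abstract; this mapping and the COCO-style names are hypothetical.
REFERENCE_KEYPOINTS: Dict[str, Tuple[str, ...]] = {
    "helmet": ("nose", "left_ear", "right_ear"),
    "vest": ("left_shoulder", "right_shoulder", "left_hip", "right_hip"),
}


def reference_point(keypoints: Dict[str, Point],
                    names: Sequence[str]) -> Optional[Point]:
    """Average the named key points that were detected; None if none were."""
    pts = [keypoints[n] for n in names if n in keypoints]
    if not pts:
        return None
    return (sum(x for x, _ in pts) / len(pts),
            sum(y for _, y in pts) / len(pts))


def filter_detections(detections: List[Detection],
                      people: List[Dict[str, Point]],
                      max_dist_ratio: float = 1.0) -> List[Detection]:
    """Keep a detection only if its box center lies near the matching
    reference point of at least one person.

    ASSUMPTION: "near" means within max_dist_ratio times the longer box
    side; the abstract does not state the actual criterion.
    """
    kept = []
    for det in detections:
        names = REFERENCE_KEYPOINTS.get(det.label)
        if names is None:
            kept.append(det)  # no rule for this class: leave it untouched
            continue
        x1, y1, x2, y2 = det.box
        cx, cy = (x1 + x2) / 2, (y1 + y2) / 2
        limit = max_dist_ratio * max(x2 - x1, y2 - y1)
        for keypoints in people:
            ref = reference_point(keypoints, names)
            if ref is not None and hypot(cx - ref[0], cy - ref[1]) <= limit:
                kept.append(det)
                break
    return kept


# One person with head key points, one helmet on the head, one far away.
people = [{"nose": (100.0, 50.0),
           "left_ear": (110.0, 48.0),
           "right_ear": (90.0, 48.0)}]
detections = [
    Detection("helmet", (85.0, 20.0, 115.0, 50.0), 0.9),
    Detection("helmet", (300.0, 300.0, 330.0, 330.0), 0.8),
]
kept = filter_detections(detections, people)  # keeps only the first helmet
```

Scaling the distance limit by the box size is one way to keep the rule roughly independent of how large a person appears in the image; a fixed pixel threshold or a limit based on the person's key point spread would be equally plausible readings of the abstract.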