Convolutional neural networks (CNNs) are widely used for the recognition and classification of scene images due to their effectiveness in this task. However, their performance degrades when the input data undergo geometric variations such as rotation, scaling, and translation. To overcome this drawback, this study presents a feature fusion technique that combines Hu moments with deep features derived from a CNN model. Hu moments are statistical values computed from image pixel intensities that are invariant to geometric transformations. These moments are combined with the features of the fully connected layer of the CNN model, making the proposed method more accurate and robust. The study also utilizes data augmentation, specifically geometric transformations such as rotation, scaling, flipping, and translation, to balance the class distribution in the training datasets and reduce the inter-class bias resulting from the unequal number of images across classes. The fused feature representation was evaluated on three benchmark datasets: MIT67, AID, and Scene15. Detailed experiments with different CNN models were conducted, and Inception-ResNetV2 as the deep feature extractor combined with Hu moments demonstrated the effectiveness of the proposed approach, delivering significant improvements in accuracy: 96.4% on Scene15, 94.1% on AID, and 87.1% on MIT67. These results present a novel avenue for enhancing the robustness and accuracy of scene understanding.
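The core of the described pipeline can be illustrated with a minimal sketch: the seven Hu invariant moments are computed from normalized central moments of the image, and the resulting vector is concatenated with the CNN's fully connected features. The sketch below is an assumption-based illustration, not the authors' implementation; the `cnn_features` vector and its dimensionality are hypothetical placeholders for the output of a deep feature extractor such as Inception-ResNetV2.

```python
import numpy as np

def hu_moments(img):
    """Compute the seven Hu invariant moments of a 2-D grayscale image."""
    img = np.asarray(img, dtype=np.float64)
    h, w = img.shape
    y, x = np.mgrid[:h, :w].astype(np.float64)
    m00 = img.sum()
    xc, yc = (x * img).sum() / m00, (y * img).sum() / m00

    # Central moment mu_pq and normalized central moment eta_pq
    def mu(p, q):
        return ((x - xc) ** p * (y - yc) ** q * img).sum()

    def eta(p, q):
        return mu(p, q) / m00 ** (1 + (p + q) / 2)

    e20, e02, e11 = eta(2, 0), eta(0, 2), eta(1, 1)
    e30, e03, e21, e12 = eta(3, 0), eta(0, 3), eta(2, 1), eta(1, 2)
    return np.array([
        e20 + e02,
        (e20 - e02) ** 2 + 4 * e11 ** 2,
        (e30 - 3 * e12) ** 2 + (3 * e21 - e03) ** 2,
        (e30 + e12) ** 2 + (e21 + e03) ** 2,
        (e30 - 3 * e12) * (e30 + e12) * ((e30 + e12) ** 2 - 3 * (e21 + e03) ** 2)
        + (3 * e21 - e03) * (e21 + e03) * (3 * (e30 + e12) ** 2 - (e21 + e03) ** 2),
        (e20 - e02) * ((e30 + e12) ** 2 - (e21 + e03) ** 2)
        + 4 * e11 * (e30 + e12) * (e21 + e03),
        (3 * e21 - e03) * (e30 + e12) * ((e30 + e12) ** 2 - 3 * (e21 + e03) ** 2)
        - (e30 - 3 * e12) * (e21 + e03) * (3 * (e30 + e12) ** 2 - (e21 + e03) ** 2),
    ])

# Toy grayscale image: a bright rectangle on a dark background.
img = np.zeros((64, 64))
img[20:40, 10:50] = 1.0

# Hypothetical CNN fully connected features (e.g. a 1536-d pooled vector).
cnn_features = np.random.rand(1536)

# Feature fusion by simple concatenation, as described in the abstract.
fused = np.concatenate([cnn_features, hu_moments(img)])
```

Because the Hu moments are rotation-, scale-, and translation-invariant, the fused vector retains geometric robustness that the raw CNN features alone may lack; a 90-degree rotation of the image, for instance, leaves all seven invariants unchanged up to floating-point rounding.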