PPMamba: A Pyramid Pooling Local Auxiliary SSM-based Model for Remote Sensing Image Semantic Segmentation

Yin Hu, The Chinese University of Hong Kong, China; Xianping Ma, The Chinese University of Hong Kong, China; Jialu Sui, The Chinese University of Hong Kong, China; Man-On Pun, The Chinese University of Hong Kong, China, simonpun@cuhk.edu.cn
 
Suggested Citation
Yin Hu, Xianping Ma, Jialu Sui and Man-On Pun (2025), "PPMamba: A Pyramid Pooling Local Auxiliary SSM-based Model for Remote Sensing Image Semantic Segmentation", APSIPA Transactions on Signal and Information Processing: Vol. 14: No. 1, e22. http://dx.doi.org/10.1561/116.20250012

Publication Date: 13 Aug 2025
© 2025 Y. Hu, X. Ma, J. Sui and M.-O. Pun
 
Subjects
Feature detection and selection, Image and video retrieval, Sensors and sensing, Image and video processing, Classification and prediction, Deep learning
 

Open Access

This article is published under the terms of the Creative Commons Attribution-NonCommercial (CC BY-NC) license.

In this article:
Introduction 
Related Work 
Methodology 
Experiments 
Conclusion 
References 

Abstract

Semantic segmentation is a vital task in the field of remote sensing (RS). However, conventional convolutional neural network (CNN)-based models struggle to capture long-range dependencies, while Transformer-based models are often computationally intensive. Recently, an advanced state space model (SSM), namely Mamba, was introduced, offering linear computational complexity while effectively establishing long-distance dependencies. Despite these advantages, Mamba-based methods encounter challenges in preserving local semantic information. To address these challenges, this paper proposes a novel network called Pyramid Pooling Mamba (PPMamba), which integrates CNN and Mamba for RS semantic segmentation tasks. The core structure of PPMamba, the Pyramid Pooling-State Space Model (PP-SSM) block, combines a local auxiliary mechanism with an omnidirectional state space model (OSS) that selectively scans feature maps from eight directions, capturing comprehensive feature information. Additionally, the auxiliary mechanism includes pyramid-shaped convolutional branches designed to extract features at multiple scales. Extensive experiments on three widely used datasets, ISPRS Vaihingen, LoveDA Urban, and WHU Buildings, demonstrate that PPMamba achieves competitive performance compared to state-of-the-art models. The source code will be made available at https://github.com/KyotoSakura/PPMamba/.
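
To make the PP-SSM block's two ingredients concrete, below is a minimal PyTorch sketch, not the authors' released implementation. The function build_scan_orders enumerates eight traversal orders (row-major, column-major, diagonal, and anti-diagonal, each forward and reversed) as one plausible reading of the eight-direction omnidirectional scan, and PyramidPoolingAux stands in for the local auxiliary mechanism with PSPNet-style pyramid pooling branches. All names, the pooling scales (1, 2, 4, 8), and the fusion step are assumptions; consult the linked repository for the actual PP-SSM block.

# Illustrative sketch only -- not the authors' released code. Module and
# function names are hypothetical, reconstructed from the abstract.
import torch
import torch.nn as nn
import torch.nn.functional as F


def build_scan_orders(h, w):
    """Eight traversal orders over an h x w grid: row-major, column-major,
    diagonal, and anti-diagonal, each forward and reversed. Each entry is
    a LongTensor of flat indices into the H*W positions."""
    grid = torch.arange(h * w).view(h, w)
    row = grid.flatten()       # left-to-right, top-to-bottom
    col = grid.t().flatten()   # top-to-bottom, left-to-right

    diag, anti = [], []
    for s in range(h + w - 1):  # walk the (anti-)diagonals i + j = s
        for i in range(max(0, s - w + 1), min(h, s + 1)):
            diag.append(i * w + (s - i))
            anti.append(i * w + (w - 1 - (s - i)))
    diag, anti = torch.tensor(diag), torch.tensor(anti)

    forward = [row, col, diag, anti]
    return forward + [o.flip(0) for o in forward]  # add reversed directions


class PyramidPoolingAux(nn.Module):
    """Local auxiliary branch: PSPNet-style pyramid pooling at several
    scales, upsampled and fused back at the input resolution. `channels`
    must be divisible by len(scales)."""

    def __init__(self, channels, scales=(1, 2, 4, 8)):
        super().__init__()
        mid = channels // len(scales)
        self.branches = nn.ModuleList([
            nn.Sequential(
                nn.AdaptiveAvgPool2d(s),
                nn.Conv2d(channels, mid, kernel_size=1, bias=False),
                nn.BatchNorm2d(mid),
                nn.ReLU(inplace=True),
            ) for s in scales
        ])
        self.fuse = nn.Conv2d(2 * channels, channels, kernel_size=3, padding=1)

    def forward(self, x):
        h, w = x.shape[-2:]
        feats = [x] + [
            F.interpolate(b(x), size=(h, w), mode="bilinear",
                          align_corners=False)
            for b in self.branches
        ]
        return self.fuse(torch.cat(feats, dim=1))


if __name__ == "__main__":
    x = torch.randn(2, 64, 16, 16)        # (B, C, H, W) feature map
    local = PyramidPoolingAux(64)(x)      # multi-scale local features
    seqs = [x.flatten(2)[:, :, o]         # eight (B, C, L) sequences
            for o in build_scan_orders(16, 16)]
    # In the real PP-SSM block, each sequence would pass through a selective
    # scan (Mamba) layer, be inversely permuted back onto the grid, and be
    # merged with the local auxiliary features.
    print(local.shape, len(seqs), seqs[0].shape)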

DOI: 10.1561/116.20250012