Feature Pyramid Full Granularity Attention Network for Object Detection in Remote Sensing Imagery

Liu, Chang; Qi, Xiao; Yin, Hang; Song, Bowei; Li, Ke; Shen, Fei

doi:10.1007/978-981-97-5609-4_26

Chang Liu (ORCID: orcid.org/0009-0001-8819-0229),
Xiao Qi (ORCID: orcid.org/0009-0001-7099-1685),
Hang Yin (ORCID: orcid.org/0009-0009-2090-7674),
Bowei Song,
Ke Li (ORCID: orcid.org/0009-0000-9535-0545) &
Fei Shen

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14871)

Included in the following conference series:

International Conference on Intelligent Computing

Abstract

With the rapid advancement of deep learning, and in particular the application of attention mechanisms to convolutional neural networks (CNNs), object detection in high-resolution remote sensing images has made significant progress. However, CNNs struggle to capture long-range dependencies, and attention mechanisms incur a high computational cost, so object detection in remote sensing images remains a challenging task. To address these issues, this paper introduces a feature pyramid full granularity attention module (FPFGAM) designed to learn long-range dependencies, dynamically attend to strongly correlated features, and reduce GPU memory overhead. First, we adaptively filter feature regions at the coarse-grained level, which reduces the computational burden caused by weakly correlated features. Next, we perform fine-grained pixel-level queries on the few strongly correlated regions that remain, which enhances the learning of long-range dependencies. We build a feature pyramid full granularity attention network (FPFGANet) by embedding FPFGAM into the ResNet50 backbone and the feature pyramid network (FPN). FPFGAM can be easily inserted into different layers to improve object detection accuracy in remote sensing images. Finally, we evaluate our method on commonly used public remote sensing object detection datasets, including NWPU VHR-10 and DIOR. The empirical results confirm the effectiveness of our approach.
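The two-stage idea in the abstract (coarse region filtering, then fine-grained pixel-level attention over the surviving regions) can be illustrated with a minimal sketch. This is not the authors' FPFGAM: the function name `coarse_to_fine_attention`, the mean-pooled region descriptors, the dot-product affinity, the `top_k` parameter, and the use of raw features as queries, keys, and values (no learned projections) are all assumptions made for illustration.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def mean_vector(vectors):
    """Channel-wise mean of a list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def coarse_to_fine_attention(regions, top_k):
    """Hypothetical coarse-to-fine attention.

    regions: list of regions, each a list of pixel feature vectors.
    Returns (outputs, routes): outputs has the same shape as regions,
    and routes[i] lists the region indices that region i attends to.
    """
    # Coarse level: summarize each region by the mean of its pixels
    # (an assumed descriptor; the paper may use something else).
    descriptors = [mean_vector(r) for r in regions]
    outputs, routes = [], []
    for i, region in enumerate(regions):
        # Region-to-region affinity; keep only the top_k strongest.
        affinity = [dot(descriptors[i], d) for d in descriptors]
        order = sorted(range(len(regions)),
                       key=lambda j: affinity[j], reverse=True)
        keep = order[:top_k]
        routes.append(keep)
        # Fine level: pixel queries attend only to pixels gathered
        # from the kept regions, not to the whole feature map.
        gathered = [p for j in keep for p in regions[j]]
        scale = math.sqrt(len(gathered[0]))
        out_region = []
        for q in region:
            weights = softmax([dot(q, k) / scale for k in gathered])
            out_region.append([
                sum(w * v[c] for w, v in zip(weights, gathered))
                for c in range(len(q))
            ])
        outputs.append(out_region)
    return outputs, routes


# Three toy regions of two 2-channel pixels each.
regions = [
    [[1.0, 0.0], [0.9, 0.1]],
    [[0.0, 1.0], [0.1, 0.9]],
    [[1.0, 0.1], [0.8, 0.0]],
]
outputs, routes = coarse_to_fine_attention(regions, top_k=2)
print(routes)
```

In this sketch each pixel query scores only `top_k` regions' worth of keys rather than every pixel in the map, which is the sense in which coarse filtering can cut attention cost and memory. In the toy input, regions 0 and 2 hold similar features, so each routes to the other while region 1 mostly attends to itself.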



Author information

Authors and Affiliations

China Electronics Standardization Institute, Beijing, China
Chang Liu, Xiao Qi, Hang Yin, Bowei Song & Ke Li
Nanjing University of Science and Technology, Nanjing, China
Fei Shen

Corresponding author

Correspondence to Chang Liu.

Editor information

Editors and Affiliations

Eastern Institute of Technology, Ningbo, China
De-Shuang Huang
Tianjin University of Science and Technology, Tianjin, China
Chuanlei Zhang
Xiamen University, Xiamen, China
Jiayang Guo

About this paper

Cite this paper

Liu, C., Qi, X., Yin, H., Song, B., Li, K., Shen, F. (2024). Feature Pyramid Full Granularity Attention Network for Object Detection in Remote Sensing Imagery. In: Huang, DS., Zhang, C., Guo, J. (eds) Advanced Intelligent Computing Technology and Applications. ICIC 2024. Lecture Notes in Computer Science, vol 14871. Springer, Singapore. https://doi.org/10.1007/978-981-97-5609-4_26

DOI: https://doi.org/10.1007/978-981-97-5609-4_26
Published: 31 July 2024
Publisher Name: Springer, Singapore
Print ISBN: 978-981-97-5608-7
Online ISBN: 978-981-97-5609-4
eBook Packages: Computer Science, Computer Science (R0)
