Remote Sensing Foundation Models: Progress and Prospects for Industry Applications

钱育蓉; 白璐; 刘鹏; 李晨; 杨帆; 李梦倩; 范迎迎; 刘炫辰; 公维军

doi:10.13568/j.cnki.651094.651316.2024.12.18.0001

Last updated: 2026-01-22

- Journal of Xinjiang University (Natural Science Edition in Chinese and English) Vol. 42, Issue 4, Pages: 401-415(2025)
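- Author affiliations:
  
  1. School of Computer Science and Technology, Xinjiang University
  2. Xinjiang Big Data and Intelligent Software Engineering Research Center, School of Software, Xinjiang University
  3. Xinjiang Research Institute, Huairou Laboratory
  4. School of Information Management, Xinjiang University of Finance and Economics
  5. School of Information Technology and Media, Hexi University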
- CLC: TP70
- Published：2025
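Cite as: 钱育蓉, 白璐, 刘鹏, et al. Remote sensing foundation models: progress and prospects for industry applications [J]. Journal of Xinjiang University (Natural Science Edition in Chinese and English), 2025, 42(04): 401-415. DOI: 10.13568/j.cnki.651094.651316.2024.12.18.0001.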

Abstract

As artificial intelligence and remote sensing technologies converge, foundation models in remote sensing have emerged as a major research focus. This paper systematically reviews the evolution and recent progress of remote sensing foundation models, classifying and summarizing new developments in their key techniques along the modality and task dimensions. From the modality perspective, remote sensing foundation models excel at processing large-scale remote sensing data and can effectively extract the complex spatial and spectral information in multimodal remote sensing big data. From the task perspective, they are moving from single-task to multi-task processing and show strong generalization, which lets them adapt quickly to diverse task requirements in multimodal data settings. The paper first summarizes current academic research on single-task/multi-task and single-modal/multimodal remote sensing foundation models, then surveys the state of practical applications of agricultural foundation models, and finally discusses key open scientific problems concerning the generalization ability and usability of remote sensing foundation models in agriculture, focusing on three aspects: agricultural knowledge graph construction, data migration, and lightweight deployment.
