Bulletin of Surveying and Mapping ›› 2026, Vol. 0 ›› Issue (2): 126-130. doi: 10.13474/j.cnki.11-2246.2026.0220

• Academic Research •

A drone precision landing framework using large language models

CHEN Lijun1, CHEN Qing2

  1. School of Software and Artificial Intelligence, Guangzhou University of Software, Guangzhou 510990, China;
  2. School of Business Administration, Baise University, Baise 533000, China
  • Received: 2025-05-22 Published: 2026-03-12
  • About the author: CHEN Lijun (born 1974), male, master's degree, associate professor; main research interest: artificial intelligence. E-mail: 372158286@qq.com
  • Funding:
    2025 Guangzhou University of Software Research Project (KY202504)

Abstract: Conventional UAV landing methods often fall short in dynamic, unstructured environments because UAVs have limited semantic perception and rely on fixed, context-insensitive safety factors. To address these limitations, this paper proposes LLM_Land, a hybrid framework that combines a large language model (LLM) with model predictive control (MPC). A vision-language encoder (VLE), such as BLIP, first converts real-time images into concise textual scene descriptions. A lightweight LLM (e.g., Qwen 2.5 1.5B or LLaMA 3.2 1B) equipped with retrieval-augmented generation (RAG) then processes these descriptions to classify scene elements and infer context-aware safety buffers, for example 3 m for pedestrians and 5 m for vehicles. The resulting semantic flags and unsafe zones are fed into the MPC module, which replans the trajectory in real time to avoid collisions while maintaining high landing accuracy. The framework is validated in the ROS-Gazebo simulator, where it consistently outperforms a conventional vision-based MPC baseline. The results show a substantial reduction in near-miss incidents caused by dynamic obstacles, with precise landing maintained in cluttered environments.

Key words: UAV semantic perception, autonomous landing, large language models, model predictive control
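
The step from classified scene elements to unsafe zones can be illustrated with a minimal Python sketch. It is not the authors' implementation. The 3 m and 5 m radii are the examples given in the abstract. The circular zone shape, the 2 m fallback radius, and all names (`SceneElement`, `UnsafeZone`, `unsafe_zones`, `clearance`, `is_safe`) are assumptions made here for illustration.

```python
import math
from dataclasses import dataclass

# Buffer radii in metres. The pedestrian and vehicle values are the examples
# quoted in the abstract; the fallback below is a hypothetical default.
SAFETY_BUFFER_M = {"pedestrian": 3.0, "vehicle": 5.0}
DEFAULT_BUFFER_M = 2.0  # assumption, not taken from the paper


@dataclass(frozen=True)
class SceneElement:
    label: str  # class assigned by the LLM, e.g. "pedestrian"
    x: float    # ground-plane position in metres
    y: float


@dataclass(frozen=True)
class UnsafeZone:
    x: float
    y: float
    radius: float


def unsafe_zones(elements):
    """Map each classified element to a circular keep-out zone (assumed shape)."""
    return [
        UnsafeZone(e.x, e.y, SAFETY_BUFFER_M.get(e.label, DEFAULT_BUFFER_M))
        for e in elements
    ]


def clearance(point, zones):
    """Smallest signed distance from point to any zone boundary.

    Negative values mean the point lies inside at least one zone.
    """
    if not zones:
        return math.inf
    px, py = point
    return min(math.hypot(px - z.x, py - z.y) - z.radius for z in zones)


def is_safe(point, zones):
    """A candidate landing point is safe if it lies outside every zone."""
    return clearance(point, zones) >= 0.0


if __name__ == "__main__":
    zones = unsafe_zones([
        SceneElement("pedestrian", 0.0, 0.0),
        SceneElement("vehicle", 10.0, 0.0),
    ])
    print(is_safe((4.0, 0.0), zones))  # 1 m outside both buffers
    print(is_safe((2.0, 0.0), zones))  # inside the pedestrian buffer
```

In a controller, a signed clearance of this kind would typically enter the MPC problem as an inequality constraint or penalty term at each predicted state. How LLM_Land formulates that constraint is not stated in the abstract.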

CLC number: