metadata
license: mit
datasets:
- Qingyun/lmmrotate-sft-data
language:
- en
base_model:
- microsoft/Florence-2-large
pipeline_tag: image-text-to-text
tags:
- aerial
- geoscience
- remotesensing
LMMRotate: A Simple Aerial Detection Baseline of Multimodal Language Models
Qingyun Li, Yushi Chen, Xinya Shu, Dong Chen, Xin He, Yi Yu, Xue Yang
If you find our work helpful, please consider giving us a ⭐!
- ArXiv Paper: https://arxiv.org/abs/2501.09720
- GitHub Repo: https://github.com/Li-Qingyun/mllm-mmrotate
- HuggingFace Page: https://huggingface.co/collections/Qingyun/lmmrotate-6780cabaf49c4e705023b8df
This repo hosts all available Florence-2 checkpoints trained for aerial detection with LMMRotate, as described in our paper.
LMMRotate is a technical practice for fine-tuning large multimodal language models for oriented object detection, as in MMRotate. It is also the official implementation of the paper: A Simple Aerial Detection Baseline of Multimodal Language Models.
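For readers new to oriented object detection, the sketch below illustrates what an oriented (rotated) bounding box is by converting one into its four corner points. It is a generic geometric illustration, not LMMRotate's actual box encoding or output format: the function name `obb_to_corners`, the `(cx, cy, w, h, angle)` parameterization, and the angle convention are all assumptions made for this example.

```python
import math


def obb_to_corners(cx, cy, w, h, angle_rad):
    """Return the four corners of a rotated box as (x, y) tuples.

    Assumed convention (hypothetical, for illustration only): (cx, cy) is the
    box center, w and h are the side lengths, and angle_rad rotates the box
    about its center using the standard 2D rotation matrix.
    """
    cos_a, sin_a = math.cos(angle_rad), math.sin(angle_rad)
    # Corner offsets of the unrotated box, relative to its center.
    offsets = [(-w / 2, -h / 2), (w / 2, -h / 2), (w / 2, h / 2), (-w / 2, h / 2)]
    # Rotate each offset, then translate it back to the center.
    return [
        (cx + dx * cos_a - dy * sin_a, cy + dx * sin_a + dy * cos_a)
        for dx, dy in offsets
    ]


# A 20 x 10 box centered at (50, 50), rotated by 90 degrees.
corners = obb_to_corners(50.0, 50.0, 20.0, 10.0, math.pi / 2)
```

With a zero angle the function returns the corners of an ordinary axis-aligned box, which is why horizontal detection can be seen as a special case of the oriented setting.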