arxiv:2407.08726

Map It Anywhere (MIA): Empowering Bird's Eye View Mapping using Large-scale Public Data

Published on Jul 11

Submitted by NikV09 on Jul 12


Authors: Cherie Ho, Jiaye Zou, Omar Alama, Benjamin Chiang, Taneesh Gupta, Chen Wang, Nikhil Keetha

Abstract

Top-down Bird's Eye View (BEV) maps are a popular representation for ground robot navigation due to their richness and flexibility for downstream tasks. While recent methods have shown promise for predicting BEV maps from First-Person View (FPV) images, their generalizability is limited to the small regions captured by current autonomous vehicle-based datasets. In this context, we show that a more scalable approach to generalizable map prediction can be enabled by using two large-scale crowd-sourced mapping platforms: Mapillary for FPV images and OpenStreetMap for BEV semantic maps. We introduce Map It Anywhere (MIA), a data engine that enables seamless curation and modeling of labeled map prediction data from existing open-source map platforms. Using our MIA data engine, we demonstrate the ease of automatically collecting a dataset of 1.2 million pairs of FPV images and BEV maps encompassing diverse geographies, landscapes, environmental factors, camera models, and capture scenarios. We further train a simple camera model-agnostic model on this data for BEV map prediction. Extensive evaluations using established benchmarks and our dataset show that the data curated by MIA enables effective pretraining for generalizable BEV map prediction, with zero-shot performance exceeding baselines trained on existing datasets by 35%. Our analysis highlights the promise of using large-scale public maps for developing and testing generalizable BEV perception, paving the way for more robust autonomous navigation.
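To make the idea of pairing an FPV image with a BEV map concrete, here is a minimal sketch of how a geotagged map point could be placed into a camera-centred BEV grid. It is not taken from the MIA code base: the function names, the 64 m extent, the 0.5 m cell size, and the equirectangular approximation are all assumptions chosen for illustration.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in metres


def geo_to_camera_frame(lat, lon, cam_lat, cam_lon, heading_deg):
    """Return (forward, right) offsets in metres of a point relative to a camera.

    Hypothetical helper. Uses an equirectangular approximation, which is only
    reasonable over short distances. heading_deg is the compass heading of the
    camera, measured clockwise from north.
    """
    east = EARTH_RADIUS_M * math.radians(lon - cam_lon) * math.cos(math.radians(cam_lat))
    north = EARTH_RADIUS_M * math.radians(lat - cam_lat)
    h = math.radians(heading_deg)
    # Rotate east/north offsets into the camera's forward/right axes.
    forward = east * math.sin(h) + north * math.cos(h)
    right = east * math.cos(h) - north * math.sin(h)
    return forward, right


def camera_frame_to_cell(forward, right, extent_m=64.0, res_m=0.5):
    """Map metric offsets to a (row, col) cell, or None if outside the grid.

    Assumed layout: a square grid covering extent_m metres ahead of the camera
    and extent_m / 2 metres to each side, with the camera at the bottom centre
    and row 0 farthest from the camera.
    """
    n = int(extent_m / res_m)
    row = math.floor((extent_m - forward) / res_m)
    col = math.floor((right + extent_m / 2.0) / res_m)
    if 0 <= row < n and 0 <= col < n:
        return row, col
    return None
```

Rasterizing a full semantic map would repeat this for every map element near the camera pose; points behind the camera or beyond the extent simply fall outside the grid.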


Community

NikV09

Paper author Paper submitter Jul 12

How to Map 🗺️ It Anywhere?

The trick is to flip the script: instead of relying on fundamentally limited self-collected datasets, use readily available worldwide maps.

Head over to our website to generate your own data or see our FPV Mapper in action!

mapitanywhere.github.io

nielsr

Jul 15

Hi @NikV09, congrats on this work!

Are you planning to share the dataset (https://github.com/MapItAnywhere/MapItAnywhere/blob/main/mia/dataset.md) on the hub, which would allow people to load it in 2 lines of code? So that people could do:

from datasets import load_dataset

dataset = load_dataset("NikV09/map-it-anywhere")

Here's a guide on how to do that: https://huggingface.co/docs/datasets/loading. It could then also be linked to this paper, as explained here.

The dataset viewer would then automatically show the images in the browser.

Let me know if this interests you, or you need any help!

Cheers,

Niels
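As a rough illustration of the Hub upload suggested above, image pairs could first be described in a metadata.jsonl file, which the `imagefolder` loader in the `datasets` library reads through its `file_name` column. The sketch below uses only the standard library. The extra column names (`bev_map_file`, `lat`, `lon`) and the input record layout are hypothetical and are not taken from the MIA dataset documentation.

```python
import json
import tempfile
from pathlib import Path


def write_metadata(root, pairs):
    """Write an ImageFolder-style metadata.jsonl under `root` and return its path.

    `pairs` is an iterable of dicts with the hypothetical keys
    "fpv_image", "bev_map", "lat" and "lon".
    """
    root = Path(root)
    root.mkdir(parents=True, exist_ok=True)
    out = root / "metadata.jsonl"
    with out.open("w", encoding="utf-8") as f:
        for pair in pairs:
            record = {
                # "file_name" is the column ImageFolder uses to find each image.
                "file_name": pair["fpv_image"],
                # The remaining columns are illustrative assumptions.
                "bev_map_file": pair["bev_map"],
                "lat": pair["lat"],
                "lon": pair["lon"],
            }
            f.write(json.dumps(record) + "\n")
    return out


if __name__ == "__main__":
    # Small local demonstration with made-up file names and coordinates.
    with tempfile.TemporaryDirectory() as tmp:
        path = write_metadata(
            tmp,
            [{"fpv_image": "0001.jpg", "bev_map": "0001.npy", "lat": 40.44, "lon": -79.94}],
        )
        print(path.read_text(encoding="utf-8"))
```

With the images stored next to this file, the folder could then be loaded locally with `load_dataset("imagefolder", data_dir=...)` and published with `push_to_hub`. The repository name in the comment above is the commenter's example, not a confirmed dataset ID.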

