# 🏆 MMScan Hierarchical Visual Grounding Challenge

## 📋 Challenge Introduction
**Hierarchical Visual Grounding (HVG) Task in the MMScan Benchmark**:
This task evaluates a model's ability to perform visual grounding at multiple levels of granularity, from region level to object level and from single-target to inter-target localization. Given a natural language description, models are expected to accurately locate the corresponding object(s) within a 3D scene, reflecting comprehensive spatial and attribute-level understanding.
- **Overview**: You can refer to this [website](https://neurips.cc/virtual/2024/poster/97429) for an overview and our [paper](https://arxiv.org/abs/2406.09401) for more details.
- **Challenge Data and Codebase**: The challenge dataset includes:
  - **Training set**: Language prompts + ground-truth bounding boxes
  - **Validation set**: Language prompts + ground-truth bounding boxes
  - **Test set**: Language prompts only (no ground truth provided)

  Follow the [instructions](https://github.com/OpenRobotLab/EmbodiedScan/tree/mmscan) to get familiar with the data organization and the MMScan APIs. All the code for MMScan is available [here](https://github.com/OpenRobotLab/EmbodiedScan/tree/mmscan).
- **Evaluation Metrics**: For the visual grounding task, our evaluator computes multiple metrics, including AP (Average Precision), AR, and gTop-k at IoU thresholds such as 0.25. The gTop-k metric generalizes the traditional Top-k metric, offering greater flexibility and interpretability for multi-target grounding (see the sketch after this list).
- **Contact**: For any questions related to the HVG challenge, feel free to reach out to [**Jingli Lin**](mailto:linjingli166@gmail.com).
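
To make gTop-k concrete, here is a minimal per-sample sketch based on one reading of the paper's description: for a query with *n* ground-truth boxes, only the top *k·n* predictions by confidence are considered, and the score is the fraction of ground truths covered at the given IoU threshold. The `gtop_k` function, its `iou_fn` callback, and the exact matching rules are illustrative assumptions; the official MMScan evaluator is the reference implementation.

```python
import numpy as np

def gtop_k(pred_boxes, scores, gt_boxes, k=1, iou_thr=0.25, iou_fn=None):
    """Hypothetical per-sample gTop-k sketch (not the official evaluator).

    With n ground-truth boxes, only the top k*n predictions by confidence
    are considered; the return value is the fraction of ground truths
    matched by at least one of them at IoU >= iou_thr. `iou_fn(a, b)` is
    assumed to compute the IoU between two 9-parameter boxes.
    """
    n = len(gt_boxes)
    top = np.argsort(scores)[::-1][: k * n]   # indices of the top k*n predictions
    candidates = [pred_boxes[i] for i in top]
    matched = sum(
        any(iou_fn(pred, gt) >= iou_thr for pred in candidates)
        for gt in gt_boxes
    )
    return matched / n
```

A dataset-level gTop-k score would then presumably be obtained by averaging this value over all test samples.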
---
## 📝 How to Participate
To register for the challenge, please contact us via [**email**](mailto:linjingli166@gmail.com) and include the following information:
- A **self-chosen username** (this will be shown on the leaderboard)
- A **login password**
- Your **team or institution name**
- A brief statement on your **motivation for participating**
> 📌 **Submission limit**: Each user is allowed a **maximum of 5 submissions per day**.
---
## 📤 Submission Guidelines
- Your submission should be a **dictionary**, where each key is a **sample ID** from the test split.
- For each sample, provide:
  - `pred_bboxes`: a list of predicted bounding boxes
  - `scores`: the corresponding confidence scores
- An expected result looks like this:
```python
{
    # each key is a sample ID from the test split
    'VG_Inter_Space_OO__1mp3d_0009_region0__55': {
        'pred_bboxes': [[...], ...],  # list of 100 boxes, 9 parameters each
        'scores': [...],              # list of 100 confidence scores
    },
    ...
}
```
> 💡 **Note**: The bounding boxes do **not** need to be sorted by confidence.
- ❗ **Limit the number of predicted boxes to 100 per sample.** If your submission contains more than 100 boxes for a single sample, only the top 100 will be considered.
- ⏱️ **Efficiency Tip**: Round all floating-point numbers in your submission to **two decimal places** to reduce file size and transmission overhead. (To ensure fairness, all decimal numbers in submitted predictions are rounded to two decimal places during evaluation.) A minimal sketch of assembling and saving such a submission follows below.
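
The following sketch ties these guidelines together, assuming your model's raw outputs are already collected in a `per_sample_preds` dictionary mapping each test-split sample ID to a `(boxes, scores)` pair. It truncates each sample to the 100 highest-scoring boxes, rounds all floats to two decimal places, and writes the result to disk. The `per_sample_preds` layout and the JSON output format are assumptions for illustration; check the challenge instructions for the required serialization.

```python
import json

def build_submission(per_sample_preds, max_boxes=100):
    """Assemble a submission dict from raw per-sample predictions (sketch)."""
    submission = {}
    for sample_id, (boxes, scores) in per_sample_preds.items():
        # Keep at most `max_boxes` predictions, highest-scoring first.
        order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
        order = order[:max_boxes]
        submission[sample_id] = {
            # Each box is a list of 9 parameters; round everything to 2 decimals.
            'pred_bboxes': [[round(float(v), 2) for v in boxes[i]] for i in order],
            'scores': [round(float(scores[i]), 2) for i in order],
        }
    return submission

# Usage with one sample ID from the example above and a dummy 9-parameter box:
preds = {'VG_Inter_Space_OO__1mp3d_0009_region0__55': ([[0.0] * 9], [0.87])}
with open('submission.json', 'w') as f:
    json.dump(build_submission(preds), f)
```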