# 🏆 MMScan Hierarchical Visual Grounding Challenge

## 📋 Challenge Introduction
**Hierarchical Visual Grounding (HVG) Task in the MMScan Benchmark**:
This task evaluates a model's ability to perform visual grounding at multiple levels of granularity, from region level to object level and from single-target to inter-target localization. Given a natural language description, models are expected to accurately locate the corresponding object(s) within a 3D scene, reflecting comprehensive spatial and attribute-level understanding.
- **Overview**: You can refer to this [website](https://neurips.cc/virtual/2024/poster/97429) for an overview and our [paper](https://arxiv.org/abs/2406.09401) for more details.
- **Challenge Data and Codebase**: The challenge dataset includes:
  - **Training set**: Language prompts + ground-truth bounding boxes
  - **Validation set**: Language prompts + ground-truth bounding boxes
  - **Test set**: Language prompts only (no ground truth provided)

  Follow the [instructions](https://github.com/OpenRobotLab/EmbodiedScan/tree/mmscan) to get familiar with the data organization and the MMScan APIs. All the code for MMScan is available [here](https://github.com/OpenRobotLab/EmbodiedScan/tree/mmscan).
- **Evaluation Metrics**: For the visual grounding task, our evaluator computes multiple metrics, including AP (Average Precision), AR, and gTop-k at IoU thresholds such as 0.25. The gTop-k metric generalizes the traditional Top-k metric, offering greater flexibility and interpretability for multi-target grounding (see the sketch after this list).
- **Contact**: For any questions related to the HVG challenge, feel free to reach out to [**Jingli Lin**](mailto:linjingli166@gmail.com).
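
To make gTop-k concrete, here is a minimal per-sample sketch based on one reading of the paper's description: for a query with *n* ground-truth boxes, only the top *k·n* predictions by confidence are considered, and the score is the fraction of ground truths covered at the given IoU threshold. The `gtop_k` function, its `iou_fn` callback, and the exact matching rules are illustrative assumptions; the official MMScan evaluator is the reference implementation.

```python
import numpy as np

def gtop_k(pred_boxes, scores, gt_boxes, k=1, iou_thr=0.25, iou_fn=None):
    """Hypothetical per-sample gTop-k sketch (not the official evaluator).

    With n ground-truth boxes, only the top k*n predictions by confidence
    are considered; the return value is the fraction of ground truths
    matched by at least one of them at IoU >= iou_thr. `iou_fn(a, b)` is
    assumed to compute the IoU between two 9-parameter boxes.
    """
    n = len(gt_boxes)
    top = np.argsort(scores)[::-1][: k * n]   # indices of the top k*n predictions
    candidates = [pred_boxes[i] for i in top]
    matched = sum(
        any(iou_fn(pred, gt) >= iou_thr for pred in candidates)
        for gt in gt_boxes
    )
    return matched / n
```

A dataset-level gTop-k score would then presumably be obtained by averaging this value over all test samples.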
---
## 📝 How to Participate
To register for the challenge, please contact us via [**email**](mailto:linjingli166@gmail.com) and include the following information:
- A **self-chosen username** (this will be shown on the leaderboard)
- A **login password**
- Your **team or institution name**
- A brief statement on your **motivation for participating**
> 📌 **Submission limit**: Each user is allowed a **maximum of 5 submissions per day**.
---
## 📤 Submission Guidelines
- Your submission should be a **dictionary**, where each key is a **sample ID** from the test split.
- For each sample, provide:
  - `pred_bboxes`: a list of predicted bounding boxes
  - `scores`: the corresponding confidence scores
- An expected result looks like this:
```python
{
    # each key is a sample ID from the test split
    'VG_Inter_Space_OO__1mp3d_0009_region0__55': {
        'pred_bboxes': [[...], ...],  # list of 100 boxes, 9 parameters each
        'scores': [...],              # list of 100 confidence scores
    },
    ...
}
```
> 💡 **Note**: The bounding boxes do **not** need to be sorted by confidence.
- ❗ **Limit the number of predicted boxes to 100 per sample.** If your submission contains more than 100 boxes for a single sample, only the top 100 will be considered.
- ⏱️ **Efficiency Tip**: Round all floating-point numbers in your submission to **two decimal places** to reduce file size and transmission overhead. (To ensure fairness, all decimal numbers in submitted predictions are rounded to two decimal places during evaluation.) A minimal sketch of assembling and saving such a submission follows below.
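
The following sketch ties these guidelines together, assuming your model's raw outputs are already collected in a `per_sample_preds` dictionary mapping each test-split sample ID to a `(boxes, scores)` pair. It truncates each sample to the 100 highest-scoring boxes, rounds all floats to two decimal places, and writes the result to disk. The `per_sample_preds` layout and the JSON output format are assumptions for illustration; check the challenge instructions for the required serialization.

```python
import json

def build_submission(per_sample_preds, max_boxes=100):
    """Assemble a submission dict from raw per-sample predictions (sketch)."""
    submission = {}
    for sample_id, (boxes, scores) in per_sample_preds.items():
        # Keep at most `max_boxes` predictions, highest-scoring first.
        order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
        order = order[:max_boxes]
        submission[sample_id] = {
            # Each box is a list of 9 parameters; round everything to 2 decimals.
            'pred_bboxes': [[round(float(v), 2) for v in boxes[i]] for i in order],
            'scores': [round(float(scores[i]), 2) for i in order],
        }
    return submission

# Usage with one sample ID from the example above and a dummy 9-parameter box:
preds = {'VG_Inter_Space_OO__1mp3d_0009_region0__55': ([[0.0] * 9], [0.87])}
with open('submission.json', 'w') as f:
    json.dump(build_submission(preds), f)
```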