BoyuNLP committed (verified)
Commit 2a24238 · 1 Parent(s): a1dcbe1

Update README.md

Files changed (1): README.md (+9 -12)
README.md CHANGED
@@ -30,28 +30,25 @@ UGround is a strong GUI visual grounding model trained with a simple recipe. Che
  - [UGround-V1-2B (Qwen2-VL)](https://huggingface.co/osunlp/UGround-V1-2B)
  - [UGround-V1-7B (Qwen2-VL)](https://huggingface.co/osunlp/UGround-V1-7B)
  - [UGround-V1-72B (Qwen2-VL)](https://huggingface.co/osunlp/UGround-V1-72B)
- - [Training Data](https://huggingface.co/osunlp/UGround)
+ - [Training Data](https://huggingface.co/datasets/osunlp/UGround-V1-Data)
  
  ## Release Plan
  
  - [x] [Model Weights](https://huggingface.co/collections/osunlp/uground-677824fc5823d21267bc9812)
    - [x] Initial Version (the one used in the paper)
-   - [x] Qwen2-VL-Based V1
-     - [x] 2B
-     - [x] 7B
-     - [x] 72B
+   - [x] Qwen2-VL-Based V1 (2B, 7B, 72B)
  - [x] Code
-   - [x] Inference Code of UGround (Initial & Qwen2-VL-Based
+   - [x] [Inference Code of UGround (Initial & Qwen2-VL-Based)](https://github.com/boyugou/llava_uground/)
    - [x] Offline Experiments (Code, Results, and Useful Resources)
-     - [x] ScreenSpot (along with referring expressions generated by GPT-4/4o)
-     - [x] Multimodal-Mind2Web
-     - [x] OmniAct
-     - [x] Android Control
+     - [x] [ScreenSpot](https://github.com/OSU-NLP-Group/UGround/tree/main/offline_evaluation/ScreenSpot)
+     - [x] [Multimodal-Mind2Web](https://github.com/OSU-NLP-Group/UGround/tree/main/offline_evaluation/Multimodal-Mind2Web)
+     - [x] [OmniAct](https://github.com/OSU-NLP-Group/UGround/tree/main/offline_evaluation/OmniACT)
+     - [x] [Android Control](https://github.com/OSU-NLP-Group/UGround/tree/main/offline_evaluation/AndroidControl)
    - [x] Online Experiments
-     - [x] Mind2Web-Live-SeeAct-V
+     - [x] [Mind2Web-Live-SeeAct-V](https://github.com/boyugou/Mind2Web_Live_SeeAct_V)
      - [x] [AndroidWorld-SeeAct-V](https://github.com/boyugou/android_world_seeact_v)
  - [ ] Data Synthesis Pipeline (Coming Soon)
-   - [x] Training-Data (V1)
+   - [x] [Training-Data (V1)](https://huggingface.co/datasets/osunlp/UGround-V1-Data)
  - [x] Online Demo (HF Spaces)
  
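The checklist above points to the Qwen2-VL-based checkpoints and their inference code. As a rough illustration only, not the repository's official recipe, here is a minimal sketch of loading one of those checkpoints through the standard `transformers` Qwen2-VL classes; the prompt wording, file names, and output coordinate format are assumptions, so defer to the linked inference code for the exact setup used in the paper.

```python
# Minimal sketch, assuming the Qwen2-VL-based UGround checkpoints load via the
# standard transformers Qwen2-VL classes. Prompt wording and the textual
# coordinate format of the output are assumptions, not the official recipe.
import torch
from PIL import Image
from transformers import AutoProcessor, Qwen2VLForConditionalGeneration

model_id = "osunlp/UGround-V1-2B"  # the 7B / 72B variants would follow the same pattern
model = Qwen2VLForConditionalGeneration.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)
processor = AutoProcessor.from_pretrained(model_id)

screenshot = Image.open("screenshot.png")   # hypothetical input screenshot
instruction = "Click the search button"      # hypothetical referring expression

# Standard Qwen2-VL chat format: one image placeholder plus the instruction text.
messages = [{
    "role": "user",
    "content": [
        {"type": "image"},
        {"type": "text", "text": instruction},
    ],
}]
text = processor.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = processor(text=[text], images=[screenshot], return_tensors="pt").to(model.device)

with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=64)

# The grounding model is expected to emit the target element's location as text;
# the exact format (e.g., normalized "(x, y)") depends on the checkpoint.
print(processor.batch_decode(out[:, inputs.input_ids.shape[1]:], skip_special_tokens=True)[0])
```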