---
title: 'VCR: Visual Caption Restoration'
emoji: π
colorFrom: pink
colorTo: purple
sdk: static
pinned: false
license: cc-by-sa-4.0
short_description: The VCR-Wiki datasets
---
This space contains all configurations of VCR-Wiki, introduced in [VCR: Visual Caption Restoration](https://arxiv.org/abs/2406.06462).
## News
- π₯π₯π₯ [2024-06-24] We update our arXiv paper. Now, we have results from Claude 3.5 Sonnet, Claude 3 Opus, GPT-4o, GPT-4-Turbo, Qwen-VL-Max, Reka Core and Gemini-1.5-pro. The evaluation script is also released. Please check github repo:
src/evaluation/closed_source_eval.py
. - π₯π₯π₯ [2024-06-13] We release the evaluation codes for open-source models, closed-source models and the pipeline of creating the dataset in VCR's Github Repo.
- π₯π₯π₯ [2024-06-12] We have incorperated the VCR-wiki evaluation process in lmms-eval framework. Now, users can use one line command to run the evaluation of models on the VCR-wiki test datasets.
- π₯π₯π₯ [2024-06-11] Our paper has been released on the arXiv, including the evaluation results of a series of models.
- π₯π₯π₯ [2024-06-10] We have released the VCR-wiki dataset, which contains 2.11M English and 346K Chinese entities sourced from Wikipedia, offered in both easy and hard variants. The dataset is available in the Hugging Face Datasets library.