---
license: apache-2.0
pipeline_tag: image-text-to-text
---

**<center><span style="font-size:2em;">TinyLLaVA-Video</span></center>**

[![arXiv](https://img.shields.io/badge/Arxiv-2501.15513-b31b1b.svg?logo=arXiv)](https://arxiv.org/abs/2501.15513) [![Github](https://img.shields.io/badge/Github-Github-blue.svg)](https://github.com/ZhangXJ199/TinyLLaVA-Video)

For training data, we combine portions of two datasets: [LLaVA-Video-178K](https://huggingface.co/datasets/lmms-lab/LLaVA-Video-178K) and [Valley](https://github.com/RupertLuo/Valley).

|   Stage  |            Source             |    #Sample    |
|----------| :---------------------------: | :-----------: |
| Pretrain |   LLaVA-Video-178K + Valley   |     397k      |
| Finetune |       LLaVA-Video-178K        |     491k      |

#### Pretrain Data

We use four subsets of [LLaVA-Video-178K](https://huggingface.co/datasets/lmms-lab/LLaVA-Video-178K): ``0_30_s_academic_v0_1``, ``30_60_s_academic_v0_1``, ``0_30_s_youtube_v0_1``, and ``30_60_s_youtube_v0_1``, supplemented with filtered data from [Video-LLaVA](https://huggingface.co/datasets/LanguageBind/Video-LLaVA).

We provide the cleaned annotation files; the raw videos can be downloaded from [LLaVA-Video-178K](https://huggingface.co/datasets/lmms-lab/LLaVA-Video-178K) and [Video-LLaVA](https://huggingface.co/datasets/LanguageBind/Video-LLaVA).
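
If you fetch the subsets programmatically, a minimal sketch using ``huggingface_hub`` might look like the one below. The ``allow_patterns`` globs assume each subset lives in a folder named after its split, so adjust them to the actual repository layout; the same pattern works for the [Video-LLaVA](https://huggingface.co/datasets/LanguageBind/Video-LLaVA) repository.

```python
# Minimal download sketch (assumes: pip install huggingface_hub, and that each
# subset is stored in a folder named after the split -- adjust if the layout differs).
from huggingface_hub import snapshot_download

subsets = [
    "0_30_s_academic_v0_1",
    "30_60_s_academic_v0_1",
    "0_30_s_youtube_v0_1",
    "30_60_s_youtube_v0_1",
]

snapshot_download(
    repo_id="lmms-lab/LLaVA-Video-178K",
    repo_type="dataset",
    local_dir="path/to/your/dataset",            # matches the layout described below
    allow_patterns=[f"{s}/*" for s in subsets],  # assumed folder names
)
```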

#### Finetune Data

We use four subsets of [LLaVA-Video-178K](https://huggingface.co/datasets/lmms-lab/LLaVA-Video-178K): ``0_30_s_academic_v0_1``, ``30_60_s_academic_v0_1``, ``0_30_s_youtube_v0_1``, and ``30_60_s_youtube_v0_1``. 

We provide the cleaned annotation files; the raw videos can be downloaded from [LLaVA-Video-178K](https://huggingface.co/datasets/lmms-lab/LLaVA-Video-178K).

#### Organize Data

Organize the video files and annotation files as follows in ``path/to/your/dataset``:

```Shell
dataset
β”œβ”€β”€ academic_source
β”œβ”€β”€ liwei_youtube_videos
β”œβ”€β”€ valley
β”œβ”€β”€ text_files
β”‚   β”œβ”€β”€ cleaned_video_caption.json
β”‚   β”œβ”€β”€ cleaned_video_openqa.json
```
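
As a quick sanity check, a short sketch like the one below can confirm the layout is in place; the paths are taken directly from the tree above.

```python
# Verify that the expected directories and annotation files exist
# (paths taken from the tree above).
from pathlib import Path

root = Path("path/to/your/dataset")
expected = [
    "academic_source",
    "liwei_youtube_videos",
    "valley",
    "text_files/cleaned_video_caption.json",
    "text_files/cleaned_video_openqa.json",
]
for rel in expected:
    status = "ok" if (root / rel).exists() else "MISSING"
    print(f"{status:8} {rel}")
```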

**Note: If there is any infringement, please contact us for removal.**