Commit 1572e58 (verified) by zeroMN · 1 parent: 658ed05

Update README.md

Files changed (1)
  1. README.md +70 -33
README.md CHANGED
@@ -1,40 +1,77 @@
- ---
- language:
- - en
- - zh
- license: apache-2.0
- library_name: pytorch
- tags:
- - multimodal
- - vqa
- - text
- - audio
+ ## Model Card
+ ---------------------------------------------------------------------
+ metadata:
+ language: multilingual # AutoModel is a multimodal model with multilingual support
+ license:
+ - apache-2.0
+ - MIT # Apache 2.0 and MIT are open-source licenses
+ library_name: pytorch # The model is built on PyTorch
+ tags:
+ - multimodal # The model is multimodal
+ - image # Handles image tasks
+ - text # Handles text tasks
+ - audio # Handles speech tasks
+ - vqa # Supports visual question answering
+ - automatspeerecognition # Supports automatic speech recognition
+ - retrieval # Supports information retrieval
  datasets:
- - synthetic-dataset
+ - synthetdataset # Training and validation used a synthetic multimodal dataset
  metrics:
- - accuracy
- - bleu
- - wer
- model-index:
- - name: AutoModel
-   results:
-   - task:
-       type: vqa
-       name: Visual Question Answering
-     dataset:
-       type: synthetic-dataset
-       name: Synthetic Multimodal Dataset
-       split: test
-     metrics:
-     - type: accuracy
-       value: 85
- ---
-
- # Model Card for AutoModel
- AutoModel is a multimodal model that supports image, text, and speech input...
+ - accuracy # Accuracy on visual question answering
+ - bleu # BLEU score for generative tasks (such as caption generation)
+ - wer # WER (word error rate) for speech recognition
+ base_model: None # The model is an independent design, not based on a pretrained model
+ widget:
+ - text: "A cat playing with a ball"
+   example_title: "Cat"
+ - text: "A dog jumping over a fence"
+   example_title: "Dog"
+
+ model_index:
+ - name: AutoModel
+   results:
+   - task:
+       type: vqa # Supports visual question answering
+       name: Visual Question Answering
+     dataset:
+       type: synthetdataset
+       name: Synthetic Multimodal Dataset
+       config: default
+       split: test
+       revision: main
+     metrics:
+     - type: accuracy
+       value: 85.0
+       name: VQA Accuracy
+   - task:
+       type: automatspeerecognition
+       name: Automatic Speech Recognition
+     dataset:
+       type: synthetdataset
+       name: Synthetic Multimodal Dataset
+       config: default
+       split: test
+       revision: main
+     metrics:
+     - type: wer
+       value: 15.3
+       name: Test WER
+   - task:
+       type: captioning
+       name: Image Captioning
+     dataset:
+       type: synthetdataset
+       name: Synthetic Multimodal Dataset
+       config: default
+       split: test
+       revision: main
+     metrics:
+     - type: bleu
+       value: 27.5
+       name: BL4
+ -----------------------------------------------------------


- ---

  ### **3. Provide Downloadable Files**
  Make sure the following files have been uploaded to the repository so that users can download and run them:
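
Review note on the hunk above: the removed block opened with a `---` line and used the key `model-index`, while the added block opens with a Markdown heading and a rule of hyphens and uses `model_index`. The following is a minimal sketch of a check for those two points. It assumes (the commit itself does not say this) that README metadata is only recognized when it sits between two `---` lines at the very top of the file. `check_front_matter` is a hypothetical helper name, not part of any library.

```python
import re


def check_front_matter(readme_text):
    """Return a list of problems found in a README's metadata block.

    Assumption for this sketch: metadata must be delimited by lines that
    contain exactly '---', starting on the first line of the file.
    """
    problems = []
    lines = readme_text.splitlines()

    # The opening delimiter has to be the very first line.
    if not lines or lines[0].strip() != "---":
        problems.append("first line is not '---'")
        return problems

    # Look for the closing delimiter after the opening one.
    end = None
    for i in range(1, len(lines)):
        if lines[i].strip() == "---":
            end = i
            break
    if end is None:
        problems.append("no closing '---'")
        return problems

    # Collect unindented 'key:' lines between the delimiters.
    keys = []
    for line in lines[1:end]:
        match = re.match(r"([A-Za-z_-]+):", line)
        if match:
            keys.append(match.group(1))

    # The removed block spelled this key with a hyphen.
    if "model_index" in keys:
        problems.append("'model_index' should be 'model-index'")
    return problems
```

Running it on a file that begins like the added block would report the missing opening delimiter, and a file that begins like the removed block would pass both checks.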