IDEA-CCNL/Taiyi-CLIP-Roberta-102M-Chinese

Feature Extraction

text-classification

weifeng chen committed on Jul 9, 2022

Commit 5b66f25 · 1 Parent(s): 31cb198

update readme

Files changed (1)

README.md +1 -1

@@ -13,7 +13,7 @@ tags:
 # Model Details
-This model is a Chinese CLIP model trained on [Noah-Wukong Dataset](https://wukong-dataset.github.io/wukong-dataset/), which contains about 100M Chinese image-text pairs. We use the image encoder ViT-B-32 from [openAI](https://github.com/openai/CLIP) and the Chinese pre-trained language model from [chinese-roberta-wwm](https://huggingface.co/hfl/chinese-roberta-wwm-ext) via contrastive learning. We freeze the image encoder and only finetune the language model. The model was trained for 20 epochs and it takes about 10 days with 8 A100 GPUs.
+This model is a Chinese CLIP model trained on [Noah-Wukong Dataset](https://wukong-dataset.github.io/wukong-dataset/), which contains about 100M Chinese image-text pairs. We use ViT-B-32 from [openAI](https://github.com/openai/CLIP) as image encoder and Chinese pre-trained language model  [chinese-roberta-wwm](https://huggingface.co/hfl/chinese-roberta-wwm-ext) as text encoder. We freeze the image encoder and only finetune the text encoder. The model was trained for 20 epochs and it takes about 10 days with 8 A100 GPUs.
 # Taiyi (太乙)
 Taiyi models are a branch of the Fengshenbang (封神榜) series of models. The models in Taiyi are pre-trained with multimodal pre-training strategies. We will release more image-text model trained on Chinese dataset and benefit the Chinese community.
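
Note on the changed paragraph: the removed line mentions contrastive learning between the image encoder and the text encoder. As a rough illustration only, the symmetric CLIP-style contrastive objective could be sketched as below. This is not the authors' training code: the function names and the temperature value of 0.07 are assumptions made for the example, and the embeddings are toy vectors. In the setup the README describes, the image embeddings would come from the frozen ViT-B-32, so only the text encoder would be updated by this loss.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def log_softmax(row):
    """Numerically stable log-softmax over a list of logits."""
    m = max(row)
    lse = m + math.log(sum(math.exp(x - m) for x in row))
    return [x - lse for x in row]


def clip_contrastive_loss(image_embs, text_embs, temperature=0.07):
    """Symmetric cross-entropy over image-text similarities.

    Pair k (image_embs[k], text_embs[k]) is treated as the positive;
    every other combination in the batch is a negative.
    The temperature is an assumed, illustrative value.
    """
    imgs = [l2_normalize(v) for v in image_embs]
    txts = [l2_normalize(v) for v in text_embs]
    n = len(imgs)

    # logits[i][j] = cosine similarity of image i and text j, scaled by temperature
    logits = [
        [sum(a * b for a, b in zip(img, txt)) / temperature for txt in txts]
        for img in imgs
    ]

    # Image-to-text direction: each row should peak on the diagonal.
    loss_i2t = -sum(log_softmax(logits[k])[k] for k in range(n)) / n

    # Text-to-image direction: each column should peak on the diagonal.
    columns = [[logits[r][c] for r in range(n)] for c in range(n)]
    loss_t2i = -sum(log_softmax(columns[k])[k] for k in range(n)) / n

    return (loss_i2t + loss_t2i) / 2


if __name__ == "__main__":
    images = [[1.0, 0.0], [0.0, 1.0]]
    aligned_texts = [[1.0, 0.0], [0.0, 1.0]]
    swapped_texts = [[0.0, 1.0], [1.0, 0.0]]
    print(clip_contrastive_loss(images, aligned_texts))
    print(clip_contrastive_loss(images, swapped_texts))
```

With the toy inputs above, the aligned pairs give a loss close to zero, while the swapped pairs give a much larger one, which is the behaviour the objective is meant to reward.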