sonoisa/clip-vit-b-32-japanese-v1


Japanese CLIP model

This is a CLIP text/image encoder model for Japanese.

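The text encoder of the English CLIP model was adapted to Japanese using a form of distillation. For details on how the model was built, its accuracy, how to use it, and sample code, see the articles below.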

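Articles:
- Overview: [Japanese models included] Pretrained models recommended for anyone doing multimodal processing in 2022
- Usage guide: [Japanese CLIP] Computing image-text similarity, computing image and text embeddings, and similar-image search
- (In preparation) Application guide: Multimodal search of Irasutoya images (zero-shot)
- (In preparation) Application guide: Multimodal search of Irasutoya images (fine-tuning)
- (In preparation) Application guide: Multimodal classification using both images and text
Sample code repository: https://github.com/sonoisa/clip-japanese
Demo:
- Multimodal search of Irasutoya images (zero-shot)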
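
The similarity computation and similar-image search mentioned in the usage guide can be sketched roughly as follows. This is a minimal illustration, not the author's code: it assumes text and image embeddings have already been computed by the encoders and are available as NumPy arrays, and it substitutes random placeholder vectors for real model output. The names `cosine_similarity_matrix` and `rank_images` are hypothetical helpers introduced here. The dimension of 512 matches the joint embedding size of the original CLIP ViT-B/32; check the sample code repository for the actual loading and encoding API.

```python
import numpy as np


def cosine_similarity_matrix(text_emb, image_emb):
    """Return an (n_texts, n_images) matrix of cosine similarities."""
    text_emb = np.asarray(text_emb, dtype=np.float64)
    image_emb = np.asarray(image_emb, dtype=np.float64)
    # L2-normalize each row so that a dot product equals cosine similarity.
    text_unit = text_emb / np.linalg.norm(text_emb, axis=1, keepdims=True)
    image_unit = image_emb / np.linalg.norm(image_emb, axis=1, keepdims=True)
    return text_unit @ image_unit.T


def rank_images(text_emb, image_emb, top_k=3):
    """For each text, return indices and scores of the top_k closest images."""
    sims = cosine_similarity_matrix(text_emb, image_emb)
    # Sort by descending similarity and keep the first top_k columns.
    order = np.argsort(-sims, axis=1)[:, :top_k]
    scores = np.take_along_axis(sims, order, axis=1)
    return order, scores


# Placeholder data standing in for real encoder output (assumption):
# 5 "image" embeddings, and 2 "text" embeddings built as slightly
# perturbed copies of images 2 and 4 so they should rank first.
rng = np.random.default_rng(0)
image_embeddings = rng.normal(size=(5, 512))
text_embeddings = image_embeddings[[2, 4]] + 0.01 * rng.normal(size=(2, 512))

indices, scores = rank_images(text_embeddings, image_embeddings, top_k=3)
print(indices)
print(scores)
```

With real embeddings, the same ranking step works for text-to-image search; swapping the arguments gives image-to-text search.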

Downloads last month: 883

Safetensors

Model size: 111M params

Tensor type: I64 · F32
