Update README.md

README.md +2 -2

@@ -32,7 +32,7 @@ datasets:
 `CodeModernBERT-Owl-2.0-Pre` は、マルチリンガルなコード理解・検索に対応した **CodeModernBERT-Owl** 系列の最新事前学習モデルです。
 本モデルは、**CodeBERT（Feng et al., 2020）で使用されたバイモーダル学習データの約4倍** に相当する、**全て独自収集・構築した高品質なコーパス**のみに基づいて事前学習を行っています。
-前バージョン（`CodeModernBERT-Owl-1.0`）と比較しても、**約2倍のデータ量**で再学習されており、よりリッチな構文・意味情報を学習しています。
+前バージョン（`CodeModernBERT-Owl-1.0`）と比較しても、**約2倍のデータ量**で学習されており、よりリッチな構文・意味情報を学習しています。
 今回新たに、これまで対応していた **7言語（Python, Java, JavaScript, PHP, Ruby, Go, Rust）に加えて、TypeScript** を新たにコーパスに加え、より幅広いコード言語に対応しました。
@@ -69,7 +69,7 @@ datasets:
 `CodeModernBERT-Owl-2.0-Pre` is the latest pretrained model in the **CodeModernBERT-Owl** series for multilingual code understanding and retrieval.
 This model was trained **entirely on a custom-built high-quality corpus**, approximately **4 times larger than the bimodal dataset used in CodeBERT (Feng et al., 2020)**.
-Compared to the previous version (`CodeModernBERT-Owl-1.0`), it has been retrained on **twice the amount of data**, capturing more structural and semantic patterns.
+Compared to the previous version (`CodeModernBERT-Owl-1.0`), it has been trained on **twice the amount of data**, capturing more structural and semantic patterns.
 I also newly added **TypeScript** to the previously supported **7 languages** (Python, Java, JavaScript, PHP, Ruby, Go, Rust), further broadening the model’s applicability.