x2bee/KoModernBERT-base-mlm_v02

@@ -1,65 +1,99 @@
 ---
 library_name: transformers
-base_model: CocoRoF/KoModernBERT-base-mlm-v04-retry-model-chp19
-tags:
-- generated_from_trainer
 model-index:
-- name: KoModernBERT-base-mlm-v04-retry-model-chp20
   results: []
 ---
-<!-- This model card has been generated automatically according to the information the Trainer had access to. You
-should probably proofread and complete it, then remove this comment. -->
-# KoModernBERT-base-mlm-v04-retry-model-chp20
-This model is a fine-tuned version of [CocoRoF/KoModernBERT-base-mlm-v04-retry-model-chp19](https://huggingface.co/CocoRoF/KoModernBERT-base-mlm-v04-retry-model-chp19) on an unknown dataset.
-It achieves the following results on the evaluation set:
-- Loss: 2.0527
-## Model description
-More information needed
-## Intended uses & limitations
-More information needed
-## Training and evaluation data
-More information needed
-## Training procedure
-### Training hyperparameters
-The following hyperparameters were used during training:
-- learning_rate: 1e-06
-- train_batch_size: 4
-- eval_batch_size: 4
-- seed: 42
-- distributed_type: multi-GPU
-- num_devices: 8
-- gradient_accumulation_steps: 64
-- total_train_batch_size: 2048
-- total_eval_batch_size: 32
-- optimizer: Use OptimizerNames.ADAMW_TORCH with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
-- lr_scheduler_type: linear
-- num_epochs: 1.0
-### Training results
-| Training Loss | Epoch  | Step  | Validation Loss |
-|:-------------:|:------:|:-----:|:---------------:|
-| 132.9037      | 0.2001 | 2500  | 2.0796          |
-| 133.5153      | 0.4002 | 5000  | 2.0743          |
-| 131.6747      | 0.6002 | 7500  | 2.0601          |
-| 131.0512      | 0.8003 | 10000 | 2.0527          |
 ### Framework versions
-- Transformers 4.48.3
 - Pytorch 2.5.1+cu124
 - Datasets 3.2.0
-- Tokenizers 0.21.0

 ---
 library_name: transformers
+license: apache-2.0
+base_model: answerdotai/ModernBERT-base
 model-index:
+- name: x2bee/KoModernBERT-base-mlm
   results: []
+language:
+- ko
 ---
+# KoModernBERT-base-v02
+This model is a fine-tuned version of [answerdotai/ModernBERT-base](https://huggingface.co/answerdotai/ModernBERT-base) <br>
+* Flash-Attention 2
+* StableAdamW
+* Unpadding & Sequence Packing
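The "Unpadding & Sequence Packing" item above can be illustrated with a small, framework-free sketch. It is not the model's implementation: the helper names `unpad` and `repad`, the toy token ids, and the right-padding assumption are all hypothetical. It only shows the general idea of dropping padded positions, concatenating the batch into one flat sequence, and keeping cumulative sequence lengths so each sequence can be recovered.

```python
from typing import List, Tuple


def unpad(input_ids: List[List[int]],
          attention_mask: List[List[int]]) -> Tuple[List[int], List[int]]:
    """Drop padded positions and concatenate the batch into one flat sequence."""
    flat_tokens: List[int] = []
    # Cumulative sequence lengths: sequence i spans cu_seqlens[i]:cu_seqlens[i + 1].
    cu_seqlens = [0]
    for ids, mask in zip(input_ids, attention_mask):
        kept = [tok for tok, m in zip(ids, mask) if m == 1]
        flat_tokens.extend(kept)
        cu_seqlens.append(cu_seqlens[-1] + len(kept))
    return flat_tokens, cu_seqlens


def repad(flat_tokens: List[int], cu_seqlens: List[int],
          max_len: int, pad_id: int = 0) -> List[List[int]]:
    """Rebuild a right-padded batch from the flat sequence (assumes right padding)."""
    rows = []
    for start, end in zip(cu_seqlens[:-1], cu_seqlens[1:]):
        seq = flat_tokens[start:end]
        rows.append(seq + [pad_id] * (max_len - len(seq)))
    return rows


# Toy batch: two sequences padded to length 6 with pad id 0 (illustrative ids only).
batch = [[101, 7, 8, 102, 0, 0], [101, 9, 102, 0, 0, 0]]
mask = [[1, 1, 1, 1, 0, 0], [1, 1, 1, 0, 0, 0]]
flat, cu = unpad(batch, mask)
print(flat)  # [101, 7, 8, 102, 101, 9, 102]
print(cu)    # [0, 4, 7]
```

In this toy batch, 12 padded positions shrink to 7 real tokens, which is where the compute saving comes from when sequence lengths vary widely.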
+## Example Use
+```python
+import random
+import torch
+from huggingface_hub import login
+from transformers import AutoTokenizer, AutoModelForMaskedLM
+# Log in with a Hugging Face access token stored in a local file.
+with open('./api_key/HGF_TOKEN.txt', 'r') as hgf:
+    login(token=hgf.read().strip())
+model_id = "x2bee/KoModernBERT-base-mlm-v01"
+tokenizer = AutoTokenizer.from_pretrained(model_id)
+model = AutoModelForMaskedLM.from_pretrained(model_id).to("cuda")
+model.eval()
+def modern_bert_convert_with_multiple_masks(text: str, top_k: int = 1, select_method: str = "Logit") -> str:
+    """Fill every [MASK] in `text`, one mask at a time from left to right."""
+    if "[MASK]" not in text:
+        raise ValueError("MLM Model should include '[MASK]' in the sentence")
+    while "[MASK]" in text:
+        inputs = tokenizer(text, return_tensors="pt").to("cuda")
+        with torch.no_grad():
+            outputs = model(**inputs)
+        # Find the position of the first remaining mask token.
+        input_ids = inputs["input_ids"][0].tolist()
+        mask_indices = [i for i, token_id in enumerate(input_ids) if token_id == tokenizer.mask_token_id]
+        current_mask_index = mask_indices[0]
+        # Keep the top_k candidate tokens at that position, best first.
+        logits = outputs.logits[0, current_mask_index]
+        top_k_logits, top_k_indices = logits.topk(top_k)
+        top_k_tokens = top_k_indices.tolist()
+        if select_method == "Logit":
+            # Sample among the candidates, weighted by softmax over their logits.
+            probabilities = torch.softmax(top_k_logits, dim=0).tolist()
+            predicted_token_id = random.choices(top_k_tokens, weights=probabilities, k=1)[0]
+        elif select_method == "Random":
+            # Sample uniformly among the candidates.
+            predicted_token_id = random.choice(top_k_tokens)
+        elif select_method == "Best":
+            # Always take the highest-scoring candidate.
+            predicted_token_id = top_k_tokens[0]
+        else:
+            raise ValueError("select_method should be one of ['Logit', 'Random', 'Best']")
+        predicted_token = tokenizer.decode([predicted_token_id]).strip()
+        # Replace only the first [MASK], then re-run the model on the updated text.
+        text = text.replace("[MASK]", predicted_token, 1)
+        print(f"Predicted: {predicted_token} | Current text: {text}")
+    return text
+```
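The three `select_method` options in the function above differ only in how one token is picked from the top-k candidates. The sketch below isolates that step using only the standard library so it can be run without a GPU or a model download. The helper names `top_k_candidates`, `softmax`, and `select_token`, the `rng` parameter, and the toy logits are hypothetical and are not part of the model card's code.

```python
import math
import random
from typing import List, Optional, Sequence


def top_k_candidates(logits: Sequence[float], k: int) -> List[int]:
    """Return the indices of the k largest logits, best first."""
    return sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]


def softmax(values: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a short list of scores."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def select_token(logits: Sequence[float], top_k: int = 1,
                 select_method: str = "Logit",
                 rng: Optional[random.Random] = None) -> int:
    """Pick one token index from the top_k candidates."""
    rng = rng or random.Random()
    candidates = top_k_candidates(logits, top_k)
    if select_method == "Best":
        return candidates[0]  # highest-scoring candidate
    if select_method == "Random":
        return rng.choice(candidates)  # uniform over the candidates
    if select_method == "Logit":
        # Sample with probability proportional to softmax of the candidate logits.
        weights = softmax([logits[i] for i in candidates])
        return rng.choices(candidates, weights=weights, k=1)[0]
    raise ValueError("select_method should be one of ['Logit', 'Random', 'Best']")


toy_logits = [0.1, 2.0, -1.0, 1.5]  # illustrative scores for a 4-token vocabulary
print(select_token(toy_logits, top_k=3, select_method="Best"))  # 1
```

With `top_k=1` there is a single candidate, so all three methods return the same token; the methods only diverge for `top_k` greater than 1.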
+```python
+# Korean news passage about the Jeju Air accident at Muan International Airport, with multiple masked tokens.
+text = "30일 전남 무안국제[MASK] 활주로에 전날 발생한 제주항공 [MASK] 당시 기체가 [MASK]착륙하면서 강한 마찰로 생긴 흔적이 남아 있다. 이 참사로 [MASK]과 승무원 181명 중 179명이 숨지고 [MASK]는 형체를 알아볼 수 없이 [MASK]됐다. [MASK] 규모와 [MASK] 원인 등에 대해 다양한 [MASK]이 제기되고 있는 가운데 [MASK]에 설치된 [MASK](착륙 유도 안전시설)가 [MASK]를 키웠다는 [MASK]이 나오고 있다."
+result = modern_bert_convert_with_multiple_masks(text, top_k=1)
+# Returns:
+# '30일 전남 무안국제터미널 활주로에 전날 발생한 제주항공 사고 당시 기체가 무단착륙하면서 강한 마찰로 생긴 흔적이 남아 있다. 이 참사로 승객과 승무원 181명 중 179명이 숨지고 일부는 형체를 알아볼 수 없이 실종됐다. 사고 규모와 사고 원인 등에 대해 다양한 의혹이 제기되고 있는 가운데 기내에 설치된 ESC(착륙 유도 안전시설)가 사고를 키웠다는 주장이 나오고 있다.'
+```
+```python
+# "The capital of China is [MASK]."
+text = "중국의 수도는 [MASK]이다"
+result = modern_bert_convert_with_multiple_masks(text, top_k=1)
+# Returns: '중국의 수도는 베이징이다'
+# "The capital of Japan is [MASK]."
+text = "일본의 수도는 [MASK]이다"
+result = modern_bert_convert_with_multiple_masks(text, top_k=1)
+# Returns: '일본의 수도는 도쿄이다'
+# "The largest city in South Korea is [MASK]."
+text = "대한민국의 가장 큰 도시는 [MASK]이다"
+result = modern_bert_convert_with_multiple_masks(text, top_k=1)
+# Returns: '대한민국의 가장 큰 도시는 서울이다'
+```
 ### Framework versions
+- Transformers 4.48.0
 - Pytorch 2.5.1+cu124
 - Datasets 3.2.0
+- Tokenizers 0.21.0