- 💾 Model (LoRA Adapters): This repo hosts the B2NER model LoRA adapter based on InternLM2.5-7B. See [20B model](https://huggingface.co/Umean/B2NER-Internlm2-20B-LoRA) for a 20B adapter.

## Sample Usage - Quick Demo
Here we show how to use the provided LoRA adapter to run a quick demo with customized input. You can also refer to the GitHub repo's `src/demo.ipynb` for examples to reuse in your own demo.
- Prepare/download our LoRA checkpoint and the corresponding backbone model (a download sketch follows below).
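  A minimal download sketch, assuming the `huggingface_hub` library; both repo IDs below are placeholders (one plausible InternLM2.5-7B backbone and an assumed ID for this adapter repo), so substitute whichever checkpoints you actually use.
  ```python
  # Sketch only: fetch the backbone and the LoRA adapter from the Hub.
  # Both repo IDs are assumptions -- replace them with the ones you use.
  from huggingface_hub import snapshot_download

  # Backbone model (assumed InternLM2.5-7B checkpoint)
  base_model_path = snapshot_download("internlm/internlm2_5-7b-chat")

  # LoRA adapter (placeholder ID for this repository)
  lora_weight_path = snapshot_download("Umean/B2NER-Internlm2.5-7B-LoRA")
  ```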
- Load the model & tokenizer.
```python
import torch
from peft import PeftModel, PeftConfig
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the base model and tokenizer; use your own path/name
base_model_path = "/path/to/backbone_model"
base_model = AutoModelForCausalLM.from_pretrained(
    base_model_path, trust_remote_code=True, torch_dtype=torch.float16
)
tokenizer = AutoTokenizer.from_pretrained(base_model_path, trust_remote_code=True)

# Load and apply the PEFT adapter; point the weight path to a directory
# that contains an adapter_config.json
lora_weight_path = "/path/to/adapter"
config = PeftConfig.from_pretrained(lora_weight_path)
model = PeftModel.from_pretrained(base_model, lora_weight_path, torch_dtype=torch.bfloat16)
```
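
Optionally (this is our addition, not a step from the original demo), you can fold the LoRA weights into the base model with PEFT's `merge_and_unload()` and move the model to a GPU; if you do, move the tokenized inputs to the same device before calling `generate()`.
```python
# Optional (our addition, not part of the original demo): merge the LoRA
# weights into the base model, then run on GPU if one is available.
model = model.merge_and_unload()
device = "cuda" if torch.cuda.is_available() else "cpu"
model = model.to(device).eval()
# When generating, keep inputs on the same device, e.g.:
# inputs = tokenizer([final_instruction], return_tensors="pt").to(device)
```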

- Set `text` and `labels` for your NER demo, prepare the instruction, and generate the answer. Below are an English example and a Chinese example based on our B2NER-InternLM2.5-7B (both examples are out-of-domain data).

```python
## English Example ##
# Input your own text and target entity labels. The model will extract the
# entities from the text that fall within the provided label set.
text = "what is a good 1990 s romance movie starring kelsy grammer"
labels = ["movie genre", "year or time period", "movie title", "movie actor", "movie age rating"]

instruction_template_en = "Given the label set of entities, please recognize all the entities in the text. The answer format should be \"entity label: entity; entity label: entity\". \nLabel Set: {labels_str} \n\nText: {text} \nAnswer:"
labels_str = ", ".join(labels)
final_instruction = instruction_template_en.format(labels_str=labels_str, text=text)
inputs = tokenizer([final_instruction], return_tensors="pt")
output = model.generate(**inputs, max_length=500)
generated_text = tokenizer.decode(output[0].tolist(), skip_special_tokens=True)
print(generated_text.split("Answer:")[-1])
# year or time period: 1990 s; movie genre: romance; movie actor: kelsy grammer


## Chinese Example ##
# Input your own text and target entity labels. The model will extract the
# entities from the text that fall within the provided label set.
text = "暴雪中国时隔多年之后再次举办了官方比赛,而Moon在星际争霸2中发挥不是很理想,对此Infi感觉Moon是哪里出了问题呢?"
labels = ["人名", "作品名->文字作品", "作品名->游戏作品", "作品名->影像作品", "组织机构名->政府机构", "组织机构名->公司", "组织机构名->其它", "地名"]

instruction_template_zh = "给定实体的标签范围,请识别文本中属于这些标签的所有实体。答案格式为 \"实体标签: 实体; 实体标签: 实体\"。\n标签范围: {labels_str}\n\n文本: {text} \n答案:"
labels_str = ", ".join(labels)
final_instruction = instruction_template_zh.format(labels_str=labels_str, text=text)
inputs = tokenizer([final_instruction], return_tensors="pt")
output = model.generate(**inputs, max_length=500)
generated_text = tokenizer.decode(output[0].tolist(), skip_special_tokens=True)
print(generated_text.split("答案:")[-1])
# 组织机构名->公司: 暴雪中国; 人名: Moon; 作品名->游戏作品: 星际争霸2; 人名: Infi
```
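
Since the instruction fixes the answer format to `entity label: entity; entity label: entity`, the decoded answer can be parsed into structured pairs. Below is a small helper of our own (the `parse_answer` name is hypothetical, not from the repo):
```python
# Hypothetical helper (ours, not from the repo): parse the answer string
# "label: entity; label: entity" into (label, entity) tuples.
def parse_answer(answer: str) -> list[tuple[str, str]]:
    pairs = []
    for chunk in answer.split(";"):
        if ":" in chunk:
            label, entity = chunk.split(":", 1)
            pairs.append((label.strip(), entity.strip()))
    return pairs

print(parse_answer("year or time period: 1990 s; movie genre: romance"))
# [('year or time period', '1990 s'), ('movie genre', 'romance')]
```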

## Cite
```
@article{yang2024beyond,