Update README.md

- BAD: あなたは○○ができます ("You can do ○○")
- GOOD: あなたは○○をします ("You do ○○")

## Performing inference

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the model and tokenizer. trust_remote_code=True is needed for the
# YaRN-based 128k context extension.
model = AutoModelForCausalLM.from_pretrained("Local-Novel-LLM-project/Ninja-v1-128k", trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained("Local-Novel-LLM-project/Ninja-v1-128k")

prompt = "Once upon a time,"
input_ids = tokenizer.encode(prompt, return_tensors="pt")

# Sample a continuation and decode the first sequence in the batch.
output = model.generate(input_ids, max_length=100, do_sample=True)
generated_text = tokenizer.decode(output[0], skip_special_tokens=True)

print(generated_text)
```
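
For long prompts that use the 128k window, memory usage grows quickly (see the notes further below), so it helps to load the weights in half precision and let accelerate place them on the GPU. This is a minimal sketch, assuming a CUDA-capable GPU and that the accelerate package is installed; the prompt and generation settings are illustrative.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Half-precision weights roughly halve memory versus float32;
# device_map="auto" (provided by accelerate) places layers on the available GPU(s).
model = AutoModelForCausalLM.from_pretrained(
    "Local-Novel-LLM-project/Ninja-v1-128k",
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)
tokenizer = AutoTokenizer.from_pretrained("Local-Novel-LLM-project/Ninja-v1-128k")

inputs = tokenizer("Once upon a time,", return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=200, do_sample=True)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```
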
## Merge recipe

- WizardLM2 - mistralai/Mistral-7B-v0.1
- NousResearch/Yarn-Mistral-7b-128k - mistralai/Mistral-7B-v0.1
- Elizezen/Antler-7B - stabilityai/japanese-stablelm-instruct-gamma-7b
- NTQAI/chatntq-ja-7b-v1.0

The characteristics of each model are as follows:

- WizardLM2: High quality multitasking model
- Yarn-Mistral-7b-128k: Mistral model with a 128k context window
- Antler-7B: Model specialized for novel writing
- NTQAI/chatntq-ja-7b-v1.0: High quality Japanese specialized model
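
Since the 128k context window comes from the Yarn-Mistral-7b-128k ingredient, a quick way to check which context settings the published checkpoint actually ships with is to inspect its configuration. This is a minimal sketch; the exact field values depend on the uploaded config files.

```python
from transformers import AutoConfig

config = AutoConfig.from_pretrained("Local-Novel-LLM-project/Ninja-v1-128k", trust_remote_code=True)

# Standard Mistral-style config attributes; their values depend on the checkpoint.
print("max_position_embeddings:", config.max_position_embeddings)
print("rope_scaling:", getattr(config, "rope_scaling", None))
```
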
## Other points to keep in mind

- The training data may be biased. Check the generated text carefully.
- Set `trust_remote_code=True` to enable context extension with YaRN.
- Memory usage can be high for long-context inference.
- If possible, we recommend running inference with llama.cpp rather than Transformers (a minimal sketch follows below).