Update README.md

Browse files

Files changed (1) hide show

README.md +6 -6

README.md CHANGED Viewed

@@ -27,13 +27,13 @@ Based on the wonderful JAIS adapted models, by G42 and InceptionAI.
 **Reflection Tuning:**
-Inspired by and featuring the Reflection Tuning technique pioneered by Matt Shumer (possibly earlier innovated by the team at Anthropic, MLabbone's Hermes, and the recent OpenAI o1.)
 From the author of the first "reflection tuned" Llama 3.1 8B LLM.
-As per the inspiring model "mattshumer/Reflection-Llama-3.1-70B" (this model was not used in the training process nor as a foundational model, but only served as inspiration) :
-'''
 During sampling, the model will start by outputting reasoning inside and tags, and then once it is satisfied with its reasoning, it will output the final answer inside and tags. Each of these tags are special tokens, trained into the model.
@@ -55,12 +55,12 @@ You are a world-class AI system, capable of complex reasoning and reflection. Re
 what is 2+2?<|eot_id|><|start_header_id|>assistant<|end_header_id|>
-'''
 **Example of this model in action:**
 """
-<s> Below is an instruction that describes a task. Think Carefully. Write a response that appropriately completes the request.
 ### Instruction:
 Create a detailed plan for the following farm's transformation to regenerative farming practices:
@@ -193,7 +193,7 @@ Transitioning to regenerative farming is not just about improving yields, but al
 [Words of encouragement]
 Embracing natural farming methods is a journey of learning and adaptation. By gradually reducing synthetic inputs and building soil health, you're not just growing crops, but nurturing the land for future generations. Keep observing, learning, and adapting your practices to create a thriving, sustainable farm.
-</output></s>
 """
 This llama model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.

 **Reflection Tuning:**
+Inspired by and featuring the Reflection Tuning technique (Published by Matt Shumer, seperately innovated by the team at Anthropic, MLabbone's Hermes, and the recent OpenAI o1.)
 From the author of the first "reflection tuned" Llama 3.1 8B LLM.
+As per one of the inspiring model "mattshumer/Reflection-Llama-3.1-70B" (this model was not used in the training process nor as a foundational model, but only served as inspiration) :
+"""
 During sampling, the model will start by outputting reasoning inside and tags, and then once it is satisfied with its reasoning, it will output the final answer inside and tags. Each of these tags are special tokens, trained into the model.
 what is 2+2?<|eot_id|><|start_header_id|>assistant<|end_header_id|>
+"""
 **Example of this model in action:**
 """
+<Beginning Token> Below is an instruction that describes a task. Think Carefully. Write a response that appropriately completes the request.
 ### Instruction:
 Create a detailed plan for the following farm's transformation to regenerative farming practices:
 [Words of encouragement]
 Embracing natural farming methods is a journey of learning and adaptation. By gradually reducing synthetic inputs and building soil health, you're not just growing crops, but nurturing the land for future generations. Keep observing, learning, and adapting your practices to create a thriving, sustainable farm.
+</output></End Token>
 """
 This llama model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.