---
base_model:
- google/gemma-2-2b-jpn-it
tags:
- text-generation-inference
- transformers
- unsloth
- gemma2
- trl
- sft
license: apache-2.0
language:
- en
datasets:
- O1-OPEN/OpenO1-SFT
- jaeyong2/ja-rag-cot
---
# Uploaded model
- **Developed by:** fyuuki0jp
- **License:** apache-2.0
- **Finetuned from model:** fyuuki0jp/gemma-2-2b-jpn-it-shortwindow

This gemma2 model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Hugging Face's TRL library.
TODO:
- [x] Custom token training in SFT
- [ ] Customize the thinking process
- [ ] Customize the token prediction process
- [ ] Change the tokenizer prompt format
- [ ] DPO optimization
# Usage
```
system {system_prompt}
user {user_prompt}
model
```
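The template above can be filled in with a small helper. This is a minimal sketch: `build_prompt` is a hypothetical name, not part of this model's tooling, and it reproduces only the plain text shown in the template. Any special tokens the tokenizer expects are not added here.

```python
def build_prompt(system_prompt: str, user_prompt: str) -> str:
    """Fill the plain-text prompt template shown in this section.

    Assumption: the layout is exactly the three lines of the template;
    no special tokens are inserted.
    """
    lines = [
        f"system {system_prompt}",
        f"user {user_prompt}",
        "model",  # the model's reply is generated after this line
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(build_prompt("You are a helpful assistant.", "Who is the liar?"))
```

The returned string can then be passed to whichever text-generation interface you use to run the model.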
# Model response format
This model has been fine-tuned to perform CoT within and to provide the final answer within .
(Caution: this model sometimes produces incorrect CoT, so check its reasoning before relying on the final answer.)
```
model inference
```
## Example prompt and model output (Japanese)
```
system
あなたは親切なアシスタントです。ユーザーの質問に対して親切に回答してください
user
A~Dの中で3人は正直者で、1人が嘘つきです。嘘つきはだれでしょう?
A:Dはうそをついている。
B:僕はうそをついていない。
C:Aはうそをついていない。
D:Bはうそをついている。
model
Alright, I need to determine who among the four individuals (A, B, C, D) is the liar based on the given statements. Let's break down each statement and analyze them step by step.
**Statement 1:** A: "D is lying."
* This statement implies that someone other than D is telling the truth. Since there are only four people, this means either A is truthful or D is not being honest. However, if D were lying, then according to Statement 1, A would be truthful. But in reality, A states that D is lying, so it seems contradictory unless we consider another interpretation.
**Statement 2:** B: "I am not lying."
* If B is telling the truth, they mean they are not lying. So, B must be truthful about their honesty. Therefore, B cannot lie.
**Statement 3:** C: "A is not lying."
* If C is truthful, they mean that A is not lying. In our current scenario, since B is truthful, C can truthfully state that A is not lying. Thus, C also cannot lie.
**Statement 4:** D: "B is not lying."
* If D is truthful, they mean that B is not lying. Now, let's see how all these statements fit together:
- **A**: Not lying → True
- **B**: Lying → False
- **C**: A is not lying → True
- **D**: B is not lying → True
If B is lying, then from Statements 1, 2, and 3, both A and C are truthful. However, if B is lying, then from Statement 4, D is truthful because he believes B is not lying. But wait a second—if B is lying, then from Statement 1, A should be truthful as well. Yet, from Statement 3, C says A is truthful, which contradicts the idea that A is lying if B is lying. This inconsistency suggests that perhaps the initial assumption that B lies might not hold up under closer scrutiny.
Let me try a different approach. Suppose instead of assuming B is lying, what if B tells the truth? Then:
- **A**: Not lying → True
- **B**: Telling the truth → True
- **C**: A is truthful → True
- **D**: B is lying → False
In this case, all statements align correctly with the truths. There's no contradiction here. Therefore, if B is truthful, then everyone else is truthful except for D.
But earlier, when considering whether B was lying, the conclusion led us back to an inconsistent situation where A could have been lying while C remained truthful, leading to contradictions. Perhaps there's a flaw in my reasoning process. Maybe I oversimplified the relationships between the statements.
Alternatively, maybe I made a mistake in interpreting some of the statements. For instance, if Statement 1 ("D is lying") is taken literally, then D is indeed lying. But if interpreted differently, such as implying that D is untruthful, but not necessarily lying outright, things get more complex.
Another angle: If I assume that B's statement is true, then B must be truthful. From Statement 3, C remains truthful regardless of B's statement. Similarly, from Statement 4, D must be truthful. This aligns perfectly with the truthfulness of B and D.
Therefore, after analyzing all possible interpretations and cross-verifying the consistency of the statements, it appears that the liar is D.
```
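Because the CoT can contain mistakes, it can help to check the final answer independently. The sketch below brute-forces the puzzle from the prompt (exactly one liar; A says D lies, B says B does not lie, C says A does not lie, D says B lies). The function name `find_liars` is hypothetical and chosen only for illustration.

```python
def find_liars() -> list[str]:
    """Return every person who can be the single liar without contradiction."""
    people = "ABCD"
    solutions = []
    for liar in people:
        # Hypothesis: exactly this person lies, everyone else is honest.
        lying = {p: p == liar for p in people}
        # Truth value of each person's statement under the hypothesis.
        claims = {
            "A": lying["D"],      # A: "D is lying."
            "B": not lying["B"],  # B: "I am not lying."
            "C": not lying["A"],  # C: "A is not lying."
            "D": lying["B"],      # D: "B is lying."
        }
        # Honest people must state truths; the liar must state a falsehood.
        if all(claims[p] != lying[p] for p in people):
            solutions.append(liar)
    return solutions


if __name__ == "__main__":
    print(find_liars())
```

Only the hypothesis that D lies is consistent, which matches the model's final answer even though several intermediate steps of its reasoning are wrong.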
# RAG Usage
This model was also trained on prompts that include RAG context marked with a custom token (experimental).
```
system {system_prompt}
user {user_prompt}
{rag content}
model
```
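A RAG prompt can be assembled in the same way as a plain prompt, with the retrieved text placed after the user turn. This is a minimal sketch: `build_rag_prompt` is a hypothetical helper, and it reproduces only the plain text of the template above. The custom token used during training is not shown in the template, so it is not added here.

```python
def build_rag_prompt(system_prompt: str, user_prompt: str, rag_content: str) -> str:
    """Fill the RAG prompt template shown in this section.

    Assumption: the retrieved content goes on its own line between the
    user turn and the model turn, as in the template; no custom token
    is inserted.
    """
    lines = [
        f"system {system_prompt}",
        f"user {user_prompt}",
        rag_content,  # retrieved passage(s), already joined into one string
        "model",      # the model's reply is generated after this line
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(build_rag_prompt("Answer from the context.", "What is Gemma?", "Gemma is a family of open models."))
```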