---
library_name: transformers
base_model:
- google/gemma-2-2b-it
---

# Gemma2-2B Instruction Tuned Model (Transferred to Qwen2 Tokenizer) Model Card

Gemma2-2B-IT transferred to the Qwen2 tokenizer. The model approximately preserves the performance of the original on most benchmarks, with slight degradation on some.

## Model Details

- **Base Model:** Gemma2-2B
- **Tokenization:** Transferred to the Qwen2 tokenizer
- **Training Methodology:** Instruction-tuned Gemma2-2B-IT transferred to the Qwen2 tokenizer

| **Benchmark**  | **Gemma2-2B-IT w/ Qwen2 Tokenizer** | **Original Gemma2-2B-IT** |
|----------------|-------------------------------------|---------------------------|
| **PiQA**       | 76.9                                | 79.6                      |
| **HellaSwag**  | 70.7                                | 72.5                      |
| **ARC-C**      | 46.8                                | 50.4                      |
| **BoolQ**      | 82.8                                | 83.8                      |
| **MMLU**       | 53.8                                | 56.9                      |
| **Arithmetic** | 83.9                                | 84.8                      |
| **IFEval**     | 62.5                                | 62.5                      |

## Training Details

Details on the training methodology are forthcoming.

## Use

```python
import torch
from transformers import pipeline

pipe = pipeline(
    "text-generation",
    model="benjamin/Gemma2-2B-IT-with-Qwen2-Tokenizer",
    model_kwargs={"torch_dtype": torch.bfloat16},
    device="cuda",  # replace with "mps" to run on a Mac
)

messages = [
    {"role": "user", "content": "Who are you? Please, answer in pirate-speak."},
]

outputs = pipe(messages, max_new_tokens=256)
assistant_response = outputs[0]["generated_text"][-1]["content"].strip()
print(assistant_response)
```
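Since the defining feature of this model is the swapped vocabulary, a quick sanity check is to compare its tokenizer against a stock Qwen2 tokenizer. Below is a minimal sketch, assuming the repo ships the standard Qwen2 vocabulary unchanged; `Qwen/Qwen2-1.5B-Instruct` is used here only as a convenient reference checkpoint and is not part of this model card:

```python
from transformers import AutoTokenizer

# Tokenizer shipped with this model (expected to match Qwen2's).
transferred = AutoTokenizer.from_pretrained("benjamin/Gemma2-2B-IT-with-Qwen2-Tokenizer")

# Reference Qwen2 tokenizer; this checkpoint is an assumption, chosen only
# because it ships the standard Qwen2 vocabulary.
reference = AutoTokenizer.from_pretrained("Qwen/Qwen2-1.5B-Instruct")

text = "Tokenizer transfer keeps the model weights but swaps the vocabulary."
print(transferred.tokenize(text))

# If the transfer reuses the Qwen2 vocabulary unchanged, both tokenizers
# should split plain text identically.
print(transferred.tokenize(text) == reference.tokenize(text))
```

Note that special tokens and the chat template may still differ from a stock Qwen2 checkpoint, since the model keeps Gemma2's instruction-tuned behavior; the comparison above only covers plain-text tokenization.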