---
library_name: transformers
base_model:
- google/gemma-2-2b-it
---

# Gemma2-2B Instruction Tuned Model (Transferred to Qwen2 Tokenizer) Model Card

Gemma2-2B-IT transferred to the Qwen2 tokenizer. The model approximately preserves the performance of the original on most benchmarks, with slight degradation on some.

## Model Details

- **Base Model:** Gemma2-2B
- **Tokenization:** Transferred to the Qwen2 tokenizer
- **Training Methodology:** Instruction-tuned Gemma2-2B-IT transferred to the Qwen2 tokenizer

| **Benchmark**  | **Gemma2-2B-IT w/ Qwen2 Tokenizer** | **Original Gemma2-2B-IT** |
|----------------|-------------------------------------|---------------------------|
| **PiQA**       | 76.9                                | 79.6                      |
| **HellaSwag**  | 70.7                                | 72.5                      |
| **ARC-C**      | 46.8                                | 50.4                      |
| **BoolQ**      | 82.8                                | 83.8                      |
| **MMLU**       | 53.8                                | 56.9                      |
| **Arithmetic** | 83.9                                | 84.8                      |
| **IFEval**     | 62.5                                | 62.5                      |

## Training Details

Details on the training methodology are forthcoming.

## Use

```python
import torch
from transformers import pipeline

pipe = pipeline(
    "text-generation",
    model="benjamin/Gemma2-2B-IT-with-Qwen2-Tokenizer",
    model_kwargs={"torch_dtype": torch.bfloat16},
    device="cuda",  # replace with "mps" to run on a Mac
)

messages = [
    {"role": "user", "content": "Who are you? Please, answer in pirate-speak."},
]

outputs = pipe(messages, max_new_tokens=256)
assistant_response = outputs[0]["generated_text"][-1]["content"].strip()
print(assistant_response)
```
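Since the defining feature of this model is the swapped vocabulary, a quick sanity check is to compare its tokenizer against a stock Qwen2 tokenizer. Below is a minimal sketch, assuming the repo ships the standard Qwen2 vocabulary unchanged; `Qwen/Qwen2-1.5B-Instruct` is used here only as a convenient reference checkpoint and is not part of this model card:

```python
from transformers import AutoTokenizer

# Tokenizer shipped with this model (expected to match Qwen2's).
transferred = AutoTokenizer.from_pretrained("benjamin/Gemma2-2B-IT-with-Qwen2-Tokenizer")

# Reference Qwen2 tokenizer; this checkpoint is an assumption, chosen only
# because it ships the standard Qwen2 vocabulary.
reference = AutoTokenizer.from_pretrained("Qwen/Qwen2-1.5B-Instruct")

text = "Tokenizer transfer keeps the model weights but swaps the vocabulary."
print(transferred.tokenize(text))

# If the transfer reuses the Qwen2 vocabulary unchanged, both tokenizers
# should split plain text identically.
print(transferred.tokenize(text) == reference.tokenize(text))
```

Note that special tokens and the chat template may still differ from a stock Qwen2 checkpoint, since the model keeps Gemma2's instruction-tuned behavior; the comparison above only covers plain-text tokenization.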