|
---
datasets:
- vikp/python_code_instructions_filtered
---
|
|
|
Code Llama 7B finetuned for one epoch on a subset of the Python code instructions dataset. Scores `.62` on HumanEval with greedy decoding (matching the Code Llama pass@1 score).
|
|
|
To run inference, you'll need to set `trust_remote_code=True` so the model picks up the right rope theta value:
|
|
|
```
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("vikp/llama_coder")
model = AutoModelForCausalLM.from_pretrained("vikp/llama_coder", trust_remote_code=True)

text = tokenizer.bos_token + """\
import socket

def ping_exponential_backoff(host: str):""".lstrip()

tokens = tokenizer(text, return_tensors="pt")
output = model.generate(**tokens, max_new_tokens=128, do_sample=True, temperature=0.1, top_p=1.0)
print(tokenizer.decode(output[0], skip_special_tokens=True).strip())
```
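
The snippet above samples at low temperature; the reported `.62` HumanEval score was measured with greedy decoding, which you can get with a minimal variant of the same call by disabling sampling:

```
# Greedy decoding, the setting used for the reported HumanEval score.
output = model.generate(**tokens, max_new_tokens=128, do_sample=False)
print(tokenizer.decode(output[0], skip_special_tokens=True).strip())
```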
|
|
|
You can reproduce the benchmark results with the BigCode evaluation harness:
|
|
|
```
git clone https://github.com/bigcode-project/bigcode-evaluation-harness.git
cd bigcode-evaluation-harness
pip install -e .
```
|
|
|
```
accelerate launch main.py \
  --model vikp/llama_coder \
  --tasks humaneval \
  --max_length_generation 1024 \
  --temperature 0 \
  --do_sample False \
  --n_samples 1 \
  --precision fp16 \
  --allow_code_execution \
  --save_generations \
  --use_auth_token \
  --trust_remote_code
```
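
With `--temperature 0`, `--do_sample False`, and `--n_samples 1`, the harness generates one greedy completion per problem, matching the decoding setup behind the pass@1 score above; `--save_generations` also writes the completions to a JSON file so you can inspect individual outputs.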