---
license: apache-2.0
---

meta-llama/Llama-2-7b fine-tuned with QLoRA on ~350 episodes of the Lex Fridman podcast (Lex = assistant), ChatML prompt format.

```
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_path = "g-ronimo/llama-fridman"

model = AutoModelForCausalLM.from_pretrained(
    model_path,
    torch_dtype=torch.bfloat16,
    device_map="cuda",  # put the weights on the GPU to match the inputs below
)
tokenizer = AutoTokenizer.from_pretrained(model_path, use_fast=True, legacy=False)  # fast tokenizer

# sampling parameters: llama-precise
gen_config = {
    "temperature": 0.7,
    "top_p": 0.1,
    "repetition_penalty": 1.18,
    "top_k": 40,
    "do_sample": True,
    "max_new_tokens": 300,
}

messages = [
    {"role": "user", "content": "Good morning, I am Mark Zuckerberg"},
    {"role": "assistant", "content": "The founder of Meta"},
    {"role": "user", "content": "Yes exactly! And the future of AI"},
]

prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
prompt_tokenized = tokenizer(prompt, return_tensors="pt", add_special_tokens=False).to("cuda")

output_ids = model.generate(**prompt_tokenized, **gen_config)
response = tokenizer.decode(output_ids[0])

print(response)
```

```
<|im_start|>user
Good morning, I am Mark Zuckerberg<|im_end|>
<|im_start|>assistant
The founder of Meta<|im_end|>
<|im_start|>user
Yes exactly! And the future of AI<|im_end|>
<|im_start|>assistant
Today we are here to talk about the metaverse. What is it? How do you see it evolving in the next decades? Let's start with some basics. What is the metaverse?<|im_end|>
```
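
Since the adapter was trained with QLoRA, the merged model can also be loaded in 4-bit to fit on smaller GPUs. The sketch below is not part of the original card; it is a minimal example assuming `bitsandbytes` and `accelerate` are installed, and only swaps the loading step, leaving the rest of the code above unchanged.

```
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
import torch

model_path = "g-ronimo/llama-fridman"

# 4-bit NF4 quantization with bf16 compute, mirroring a typical QLoRA inference setup
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    model_path,
    quantization_config=bnb_config,
    device_map="auto",  # let accelerate place the quantized weights
)
tokenizer = AutoTokenizer.from_pretrained(model_path, use_fast=True, legacy=False)
```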