---
library_name: transformers
license: cc-by-4.0
language:
- en
base_model:
- Equall/Saul-7B-Base
---

# Model Card for Cite-SaulLM-7B

## Model Details

### Model Description

This is the model card of a 🤗 transformers model that has been pushed to the Hub. This model card has been automatically generated.

- **Developed by:** Ehsan Shareghi, Jiuzhou Han, Paul Burgess
- **Model type:** 7B-parameter causal language model
- **Language(s) (NLP):** English
- **License:** CC BY 4.0
- **Finetuned from model:** Equall/Saul-7B-Base

### Model Sources

- **Paper:** https://arxiv.org/pdf/2412.06272

## Uses

Here's how you can run the model:

```python
# pip install git+https://github.com/huggingface/transformers.git
# pip install git+https://github.com/huggingface/peft.git

import torch
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    BitsAndBytesConfig,
)
from peft import PeftModel

# Load the base model with 8-bit quantization.
model = AutoModelForCausalLM.from_pretrained(
    "Equall/Saul-7B-Base",
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("Equall/Saul-7B-Base")
tokenizer.pad_token = tokenizer.eos_token

# Attach the fine-tuned citation-prediction adapter.
model = PeftModel.from_pretrained(
    model,
    "auslawbench/Cite-SaulLM-7B",
    device_map="auto",
    torch_dtype=torch.bfloat16,
)
model.eval()

fine_tuned_prompt = """
### Instruction:
{}

### Input:
{}

### Response:
{}"""

# Replace with the passage of legal text you want a citation predicted for.
input_text = "..."

model_input = fine_tuned_prompt.format(
    "Predict the name of the case that needs to be cited in the text and explain why it should be cited.",
    input_text,
    "",
)
inputs = tokenizer(model_input, return_tensors="pt").to("cuda")
outputs = model.generate(**inputs, max_new_tokens=256, temperature=1.0)
output = tokenizer.decode(outputs[0], skip_special_tokens=True)

# The response wraps the predicted case name in angle brackets,
# so keep everything up to and including the first '>'.
print(output.split("### Response:")[1].strip().split(">")[0] + ">")
```

## Citation

**BibTeX:**

```
@misc{shareghi2024auslawcite,
  title={Methods for Legal Citation Prediction in the Age of LLMs: An Australian Law Case Study},
  author={Ehsan Shareghi and Jiuzhou Han and Paul Burgess},
  year={2024},
  eprint={2412.06272},
  archivePrefix={arXiv},
  primaryClass={cs.CL}
}
```