---
library_name: transformers
license: cc-by-4.0
language:
- en
base_model:
- Equall/Saul-7B-Base
---

# Model Card for Cite-SaulLM-7B

## Model Details

### Model Description

Cite-SaulLM-7B (`auslawbench/Cite-SaulLM-7B`) is a PEFT adapter for `Equall/Saul-7B-Base`, fine-tuned for legal citation prediction on Australian law. Given a passage in which a citation is replaced by a `<CASENAME>` placeholder, the model is prompted to predict the name of the case that should be cited and to explain why.

- **Developed by:** Ehsan Shareghi, Jiuzhou Han, Paul Burgess
- **Model type:** 7B-parameter causal language model (PEFT adapter on Saul-7B-Base)
- **Language(s) (NLP):** English
- **License:** CC BY 4.0
- **Finetuned from model:** Equall/Saul-7B-Base

### Model Sources

- **Paper:** https://arxiv.org/pdf/2412.06272

## Uses

The snippet below loads the base model in 8-bit, attaches the adapter, and predicts the case to cite for a sample passage:

```python
# pip install git+https://github.com/huggingface/transformers.git
# pip install git+https://github.com/huggingface/peft.git
# pip install bitsandbytes accelerate  (needed for 8-bit loading and device_map="auto")

import torch
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    BitsAndBytesConfig,
)
from peft import PeftModel

# Load the base model in 8-bit to reduce GPU memory use.
model = AutoModelForCausalLM.from_pretrained(
    "Equall/Saul-7B-Base",
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),
    device_map="auto",
)

tokenizer = AutoTokenizer.from_pretrained("Equall/Saul-7B-Base")
tokenizer.pad_token = tokenizer.eos_token  # reuse EOS as the padding token

# Attach the fine-tuned adapter weights to the base model.
model = PeftModel.from_pretrained(
    model,
    "auslawbench/Cite-SaulLM-7B",
    device_map="auto",
    torch_dtype=torch.bfloat16,
)
model.eval()

# Prompt template with slots for the instruction, the input text, and the response.
fine_tuned_prompt = """
### Instruction:
{}

### Input:
{}

### Response:
{}"""

instruction = "Predict the name of the case that needs to be cited in the text and explain why it should be cited."
example_input = "Many of ZAR’s grounds of appeal related to fact finding. Drawing on principles set down in several other courts and tribunals, the Appeal Panel summarised the circumstances in which leave may be granted for a person to appeal from findings of fact: <CASENAME> at [84]."

# Leave the response slot empty so the model completes it.
model_input = fine_tuned_prompt.format(instruction, example_input, "")
inputs = tokenizer(model_input, return_tensors="pt").to("cuda")  # assumes a CUDA device

# Decoding is greedy by default; temperature only takes effect with do_sample=True.
outputs = model.generate(**inputs, max_new_tokens=256, temperature=1.0)
output = tokenizer.decode(outputs[0], skip_special_tokens=True)

# Keep the text after "### Response:" up to and including the first ">".
print(output.split("### Response:")[1].strip().split('>')[0] + '>')
```

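If you want to reuse the prompt format and the post-processing step without repeating them inline, they can be factored into two small functions. This is a minimal sketch, not part of the released code: the names `PROMPT_TEMPLATE`, `DEFAULT_INSTRUCTION`, `build_prompt`, and `extract_prediction` are hypothetical, and the completion string is made up so the example runs without loading the model. It follows the same logic as the snippet above, except that it does not append a `>` when the response contains none.

```python
# Hypothetical helpers; the names are illustrative and not part of the released code.
PROMPT_TEMPLATE = """
### Instruction:
{}

### Input:
{}

### Response:
{}"""

DEFAULT_INSTRUCTION = (
    "Predict the name of the case that needs to be cited in the text "
    "and explain why it should be cited."
)


def build_prompt(text: str, instruction: str = DEFAULT_INSTRUCTION) -> str:
    """Fill the instruction and input slots, leaving the response slot empty."""
    return PROMPT_TEMPLATE.format(instruction, text, "")


def extract_prediction(decoded: str, marker: str = "### Response:") -> str:
    """Return the text after the marker, cut after the first '>' if there is one."""
    _, found, tail = decoded.partition(marker)
    if not found:
        raise ValueError(f"marker {marker!r} not found in decoded output")
    tail = tail.strip()
    head, bracket, _ = tail.partition(">")
    return head + bracket if bracket else tail


prompt = build_prompt("The Appeal Panel summarised the test: <CASENAME> at [84].")
# Stand-in for tokenizer.decode(...); this completion is invented for illustration.
decoded = prompt + "<Example v Placeholder> This is a made-up completion."
print(extract_prediction(decoded))
```

With the model loaded, `build_prompt(example_input)` would replace the `fine_tuned_prompt.format(...)` call and `extract_prediction(output)` would replace the final `split` expression.
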
## Citation

**BibTeX:**

```
@misc{shareghi2024auslawcite,
  title={Methods for Legal Citation Prediction in the Age of LLMs: An Australian Law Case Study},
  author={Ehsan Shareghi and Jiuzhou Han and Paul Burgess},
  year={2024},
  eprint={2412.06272},
  archivePrefix={arXiv},
  primaryClass={cs.CL}
}
```