|
---
base_model: meta-llama/Llama-3.2-1B-Instruct
library_name: peft
license: apache-2.0
language:
- en
metrics:
- accuracy
tags:
- finance
- relation_extraction
- relation_types
---
|
|
|
|
|
## Model Description

This model is a LoRA adapter (PEFT) for Llama-3.2-1B-Instruct, fine-tuned to classify the relation between two entities mentioned in a sentence from a financial document.

- **Fine-tuned from model:** meta-llama/Llama-3.2-1B-Instruct

## Downstream Use

The model predicts the relation that holds between two specified entities in a financial document, choosing from the relation types listed below. A short sketch of how the label set can be used to validate model output follows the list.
|
|
|
### Relation Types

- no_relation
- title
- operations_in
- employee_of
- agreement_with
- formed_on
- member_of
- subsidiary_of
- shares_of
- revenue_of
- loss_of
- headquartered_in
- acquired_on
- founder_of
- formed_in
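
The set of labels above can also be kept in code for validating or normalizing the model's generations. The helper below is a hypothetical sketch and not part of the released model; `normalize_relation` is an illustrative name.

```python
# Label set of the adapter, as listed above.
RELATION_TYPES = [
    "no_relation", "title", "operations_in", "employee_of", "agreement_with",
    "formed_on", "member_of", "subsidiary_of", "shares_of", "revenue_of",
    "loss_of", "headquartered_in", "acquired_on", "founder_of", "formed_in",
]

def normalize_relation(generated_text: str) -> str:
    """Hypothetical helper: map raw generated text onto one of the known labels."""
    text = generated_text.strip().lower()
    for label in RELATION_TYPES:
        if label in text:
            return label
    return "no_relation"
```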
|
|
|
## How to Get Started with the Model

```python
import torch
from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer, pipeline

device = "cuda" if torch.cuda.is_available() else "cpu"

# Load the base model together with the PEFT adapter
finetune_name = "Askinkaty/llama-finance-relations"

finetuned_model = AutoPeftModelForCausalLM.from_pretrained(
    pretrained_model_name_or_path=finetune_name,
    torch_dtype=torch.float16,
    low_cpu_mem_usage=True,
).to(device)

# Use the base model's tokenizer
base_model_name = "meta-llama/Llama-3.2-1B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(base_model_name)

# Llama has no dedicated padding token, so reuse the end-of-sequence token
tokenizer.pad_token = tokenizer.eos_token
finetuned_model.config.pad_token_id = finetuned_model.config.eos_token_id

# Build a text-generation pipeline around the fine-tuned model
generator = pipeline("text-generation", model=finetuned_model, tokenizer=tokenizer)
```
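
A minimal inference example using the pipeline built above. It assumes a recent `transformers` release that accepts chat messages in the text-generation pipeline; the sentence and entities are invented for illustration, and the prompt mirrors the message format used during training (see Preprocessing below).

```python
messages = [
    {
        "role": "system",
        "content": (
            "You are an expert in financial documentation and market analysis. "
            "Define relations between two specified entities: entity 1 [E1] and "
            "entity 2 [E2] in a sentence. Return a short response of the required format. "
        ),
    },
    {
        "role": "user",
        # Invented example, purely for illustration
        "content": (
            "Entity 1: Acme Corp. Entity 2: New York. "
            "Input sentence: Acme Corp is headquartered in New York."
        ),
    },
]

output = generator(messages, max_new_tokens=16, do_sample=False)
print(output[0]["generated_text"][-1]["content"])  # expected: a relation label such as "headquartered_in"
```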
|
|
|
|
|
## Training Details

### Training Data

Samples from the [ReFinD dataset](https://refind-re.github.io/): 100 examples per relation type were used, and the least frequent relation types were omitted.
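
A minimal sketch of how such a per-class sample could be drawn with pandas, assuming the dataset is loaded into a DataFrame with a `relation` column; the function name and threshold handling are illustrative, not the exact preparation code.

```python
import pandas as pd

def sample_per_relation(df: pd.DataFrame, n: int = 100, seed: int = 42) -> pd.DataFrame:
    """Keep relation types with at least `n` examples and draw `n` examples from each."""
    counts = df["relation"].value_counts()
    kept = counts[counts >= n].index
    return (
        df[df["relation"].isin(kept)]
        .groupby("relation", group_keys=False)
        .apply(lambda g: g.sample(n=n, random_state=seed))
        .reset_index(drop=True)
    )
```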
|
|
|
|
|
#### Preprocessing

The dataset is converted into chat-message format as in the code snippet below:
|
|
|
```python
def batch_convert_to_messages(data):
    # Build one prompt per row: both entity spans plus the full sentence.
    questions = data.apply(
        lambda x: f"Entity 1: {' '.join(x['token'][x['e1_start']:x['e1_end']])}. "
                  f"Entity 2: {' '.join(x['token'][x['e2_start']:x['e2_end']])}. "
                  f"Input sentence: {' '.join(x['token'])}",
        axis=1
    )

    # Keep only the relation name, dropping the entity-type prefix.
    relations = data['relation'].apply(lambda relation: relation.split(':')[-1])

    # Assemble system/user/assistant messages for supervised fine-tuning.
    messages = [
        [
            {
                "role": "system",
                "content": "You are an expert in financial documentation and market analysis. Define relations between two specified entities: entity 1 [E1] and entity 2 [E2] in a sentence. Return a short response of the required format. "
            },
            {"role": "user", "content": question},
            {"role": "assistant", "content": relation},
        ]
        for question, relation in zip(questions, relations)
    ]

    return messages
```
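
As a usage illustration, the function above can be applied to a pandas DataFrame with the ReFinD-style columns (`token`, `e1_start`, `e1_end`, `e2_start`, `e2_end`, `relation`). The toy row and the label prefix below are invented for the example.

```python
import pandas as pd

toy = pd.DataFrame([{
    "token": ["Acme", "Corp", "is", "headquartered", "in", "New", "York", "."],
    "e1_start": 0, "e1_end": 2,   # "Acme Corp"
    "e2_start": 5, "e2_end": 7,   # "New York"
    "relation": "org:loc:headquartered_in",  # invented label string, for illustration
}])

messages = batch_convert_to_messages(toy)
print(messages[0][1]["content"])  # user turn with both entities and the sentence
print(messages[0][2]["content"])  # assistant turn: "headquartered_in"
```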
|
|
|
|
|
|
|
|
|
#### Training Hyperparameters

SFT parameters (a configuration sketch combining these with the LoRA settings follows the lists):

- num_train_epochs=1
- per_device_train_batch_size=2
- gradient_accumulation_steps=2
- gradient_checkpointing=True
- optim="adamw_torch_fused"
- learning_rate=2e-4
- max_grad_norm=0.3
- warmup_ratio=0.01
- lr_scheduler_type="cosine"
- bf16=True

LoRA parameters:

- rank_dimension = 6
- lora_alpha = 8
- lora_dropout = 0.05
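
A minimal sketch of how these values might map onto `peft.LoraConfig` and `trl.SFTConfig`. The exact training script is not included in this card, so `target_modules`, `output_dir`, and the use of `trl` itself are assumptions.

```python
from peft import LoraConfig
from trl import SFTConfig

# LoRA settings listed above; target_modules is an assumption, not stated in the card.
peft_config = LoraConfig(
    r=6,                       # rank_dimension
    lora_alpha=8,
    lora_dropout=0.05,
    bias="none",
    target_modules="all-linear",
    task_type="CAUSAL_LM",
)

# SFT settings listed above; output_dir is a placeholder.
training_args = SFTConfig(
    output_dir="llama-finance-relations",
    num_train_epochs=1,
    per_device_train_batch_size=2,
    gradient_accumulation_steps=2,
    gradient_checkpointing=True,
    optim="adamw_torch_fused",
    learning_rate=2e-4,
    max_grad_norm=0.3,
    warmup_ratio=0.01,
    lr_scheduler_type="cosine",
    bf16=True,
)
```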
|
|
|
|
|
|
|
## Evaluation

### Testing Data, Factors & Metrics

#### Testing Data

A test set sampled from the [ReFinD dataset](https://refind-re.github.io/).
|
|
|
|
|
#### Metrics

Accuracy. Other metrics are work in progress.
|
|
|
### Results

Accuracy before cleaning the output: 0.41

Accuracy after cleaning the output: 0.62
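
The cleaning step maps free-form generations back onto the label set before comparing them against the gold labels. The exact procedure is not published in this card, so the snippet below is only a plausible sketch; it reuses the hypothetical `normalize_relation` helper from the Downstream Use section, and the toy lists are invented.

```python
# Toy illustration only; the reported numbers come from the ReFinD test split.
gold_labels = ["headquartered_in", "employee_of"]
raw_outputs = ["The relation is headquartered_in.", "founder_of"]

def accuracy(predictions, gold):
    return sum(p == g for p, g in zip(predictions, gold)) / len(gold)

print(accuracy(raw_outputs, gold_labels))  # raw generations rarely match the labels exactly
cleaned = [normalize_relation(o) for o in raw_outputs]
print(accuracy(cleaned, gold_labels))      # normalized labels are directly comparable
```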
|
|
|
|
|
### Framework versions

- PEFT 0.14.0