|
---
base_model: meta-llama/Llama-3.2-1B-Instruct
library_name: peft
license: apache-2.0
language:
- en
metrics:
- accuracy
tags:
- finance
- relation_extraction
- relation_types
---
|
|
|
|
|
## Model Description

This model is a LoRA adapter (PEFT) for Llama-3.2-1B-Instruct, fine-tuned to classify the relation between two entities mentioned in a sentence from a financial document.

- **Fine-tuned from model:** meta-llama/Llama-3.2-1B-Instruct

## Downstream Use

The model predicts the relation that holds between two specified entities in a financial document, choosing from the relation types listed below. A short sketch of how the label set can be used to validate model output follows the list.
|
|
|
### Relation Types

- no_relation
- title
- operations_in
- employee_of
- agreement_with
- formed_on
- member_of
- subsidiary_of
- shares_of
- revenue_of
- loss_of
- headquartered_in
- acquired_on
- founder_of
- formed_in
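
The set of labels above can also be kept in code for validating or normalizing the model's generations. The helper below is a hypothetical sketch and not part of the released model; `normalize_relation` is an illustrative name.

```python
# Label set of the adapter, as listed above.
RELATION_TYPES = [
    "no_relation", "title", "operations_in", "employee_of", "agreement_with",
    "formed_on", "member_of", "subsidiary_of", "shares_of", "revenue_of",
    "loss_of", "headquartered_in", "acquired_on", "founder_of", "formed_in",
]

def normalize_relation(generated_text: str) -> str:
    """Hypothetical helper: map raw generated text onto one of the known labels."""
    text = generated_text.strip().lower()
    for label in RELATION_TYPES:
        if label in text:
            return label
    return "no_relation"
```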
|
|
|
## How to Get Started with the Model

```python
import torch
from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer, pipeline

device = "cuda" if torch.cuda.is_available() else "cpu"

# Load the base model together with the PEFT adapter
finetune_name = "Askinkaty/llama-finance-relations"

finetuned_model = AutoPeftModelForCausalLM.from_pretrained(
    pretrained_model_name_or_path=finetune_name,
    torch_dtype=torch.float16,
    low_cpu_mem_usage=True,
).to(device)

# Use the base model's tokenizer
base_model_name = "meta-llama/Llama-3.2-1B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(base_model_name)

# Llama has no dedicated padding token, so reuse the end-of-sequence token
tokenizer.pad_token = tokenizer.eos_token
finetuned_model.config.pad_token_id = finetuned_model.config.eos_token_id

# Build a text-generation pipeline around the fine-tuned model
generator = pipeline("text-generation", model=finetuned_model, tokenizer=tokenizer)
```
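
A minimal inference example using the pipeline built above. It assumes a recent `transformers` release that accepts chat messages in the text-generation pipeline; the sentence and entities are invented for illustration, and the prompt mirrors the message format used during training (see Preprocessing below).

```python
messages = [
    {
        "role": "system",
        "content": (
            "You are an expert in financial documentation and market analysis. "
            "Define relations between two specified entities: entity 1 [E1] and "
            "entity 2 [E2] in a sentence. Return a short response of the required format. "
        ),
    },
    {
        "role": "user",
        # Invented example, purely for illustration
        "content": (
            "Entity 1: Acme Corp. Entity 2: New York. "
            "Input sentence: Acme Corp is headquartered in New York."
        ),
    },
]

output = generator(messages, max_new_tokens=16, do_sample=False)
print(output[0]["generated_text"][-1]["content"])  # expected: a relation label such as "headquartered_in"
```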
|
|
|
|
|
## Training Details

### Training Data

Samples from the [ReFinD dataset](https://refind-re.github.io/): 100 examples per relation type were used, and the least frequent relation types were omitted.
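
A minimal sketch of how such a per-class sample could be drawn with pandas, assuming the dataset is loaded into a DataFrame with a `relation` column; the function name and threshold handling are illustrative, not the exact preparation code.

```python
import pandas as pd

def sample_per_relation(df: pd.DataFrame, n: int = 100, seed: int = 42) -> pd.DataFrame:
    """Keep relation types with at least `n` examples and draw `n` examples from each."""
    counts = df["relation"].value_counts()
    kept = counts[counts >= n].index
    return (
        df[df["relation"].isin(kept)]
        .groupby("relation", group_keys=False)
        .apply(lambda g: g.sample(n=n, random_state=seed))
        .reset_index(drop=True)
    )
```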
|
|
|
|
|
#### Preprocessing

The dataset is converted into chat-message format as in the code snippet below:
|
|
|
```python
def batch_convert_to_messages(data):
    # Build one prompt per row: both entity spans plus the full sentence.
    questions = data.apply(
        lambda x: f"Entity 1: {' '.join(x['token'][x['e1_start']:x['e1_end']])}. "
                  f"Entity 2: {' '.join(x['token'][x['e2_start']:x['e2_end']])}. "
                  f"Input sentence: {' '.join(x['token'])}",
        axis=1
    )

    # Keep only the relation name, dropping the entity-type prefix.
    relations = data['relation'].apply(lambda relation: relation.split(':')[-1])

    # Assemble system/user/assistant messages for supervised fine-tuning.
    messages = [
        [
            {
                "role": "system",
                "content": "You are an expert in financial documentation and market analysis. Define relations between two specified entities: entity 1 [E1] and entity 2 [E2] in a sentence. Return a short response of the required format. "
            },
            {"role": "user", "content": question},
            {"role": "assistant", "content": relation},
        ]
        for question, relation in zip(questions, relations)
    ]

    return messages
```
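
As a usage illustration, the function above can be applied to a pandas DataFrame with the ReFinD-style columns (`token`, `e1_start`, `e1_end`, `e2_start`, `e2_end`, `relation`). The toy row and the label prefix below are invented for the example.

```python
import pandas as pd

toy = pd.DataFrame([{
    "token": ["Acme", "Corp", "is", "headquartered", "in", "New", "York", "."],
    "e1_start": 0, "e1_end": 2,   # "Acme Corp"
    "e2_start": 5, "e2_end": 7,   # "New York"
    "relation": "org:loc:headquartered_in",  # invented label string, for illustration
}])

messages = batch_convert_to_messages(toy)
print(messages[0][1]["content"])  # user turn with both entities and the sentence
print(messages[0][2]["content"])  # assistant turn: "headquartered_in"
```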
|
|
|
|
|
|
|
|
|
#### Training Hyperparameters

SFT parameters (a configuration sketch combining these with the LoRA settings follows the lists):

- num_train_epochs=1
- per_device_train_batch_size=2
- gradient_accumulation_steps=2
- gradient_checkpointing=True
- optim="adamw_torch_fused"
- learning_rate=2e-4
- max_grad_norm=0.3
- warmup_ratio=0.01
- lr_scheduler_type="cosine"
- bf16=True

LoRA parameters:

- rank_dimension = 6
- lora_alpha = 8
- lora_dropout = 0.05
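
A minimal sketch of how these values might map onto `peft.LoraConfig` and `trl.SFTConfig`. The exact training script is not included in this card, so `target_modules`, `output_dir`, and the use of `trl` itself are assumptions.

```python
from peft import LoraConfig
from trl import SFTConfig

# LoRA settings listed above; target_modules is an assumption, not stated in the card.
peft_config = LoraConfig(
    r=6,                       # rank_dimension
    lora_alpha=8,
    lora_dropout=0.05,
    bias="none",
    target_modules="all-linear",
    task_type="CAUSAL_LM",
)

# SFT settings listed above; output_dir is a placeholder.
training_args = SFTConfig(
    output_dir="llama-finance-relations",
    num_train_epochs=1,
    per_device_train_batch_size=2,
    gradient_accumulation_steps=2,
    gradient_checkpointing=True,
    optim="adamw_torch_fused",
    learning_rate=2e-4,
    max_grad_norm=0.3,
    warmup_ratio=0.01,
    lr_scheduler_type="cosine",
    bf16=True,
)
```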
|
|
|
|
|
|
|
## Evaluation

### Testing Data, Factors & Metrics

#### Testing Data

A test set sampled from the [ReFinD dataset](https://refind-re.github.io/).
|
|
|
|
|
#### Metrics

Accuracy. Other metrics are work in progress.
|
|
|
### Results

Accuracy before cleaning the output: 0.41

Accuracy after cleaning the output: 0.62
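
The cleaning step maps free-form generations back onto the label set before comparing them against the gold labels. The exact procedure is not published in this card, so the snippet below is only a plausible sketch; it reuses the hypothetical `normalize_relation` helper from the Downstream Use section, and the toy lists are invented.

```python
# Toy illustration only; the reported numbers come from the ReFinD test split.
gold_labels = ["headquartered_in", "employee_of"]
raw_outputs = ["The relation is headquartered_in.", "founder_of"]

def accuracy(predictions, gold):
    return sum(p == g for p, g in zip(predictions, gold)) / len(gold)

print(accuracy(raw_outputs, gold_labels))  # raw generations rarely match the labels exactly
cleaned = [normalize_relation(o) for o in raw_outputs]
print(accuracy(cleaned, gold_labels))      # normalized labels are directly comparable
```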
|
|
|
|
|
### Framework versions

- PEFT 0.14.0