Model Description
This model is a fine-tuned version of meta-llama/Llama-3.2-1B,
optimized for persona classification: given a detailed persona description, it generates the associated labels.
It was trained on 10k records from the argilla/FinePersonas-v0.1 dataset.
- Developed by: Vedant Rajpurohit
- Model type: Causal Language Model
- Language(s): English
- Fine-tuned from model: meta-llama/Llama-3.2-1B
Direct Use
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

model_id_new = "Vedant3907/Llama-3.2-1B-PersonaClassifier"
tokenizer = AutoTokenizer.from_pretrained(model_id_new)
model_pretrained = AutoModelForCausalLM.from_pretrained(
    model_id_new,
    device_map="auto",
    torch_dtype="float16")
prompt = """Given the persona give the associated labels:
### Persona:
A social justice activist and blogger focused on anti-colonialism, anti-racism, and media representation, particularly within the context of intersectional people of color experiences.
### Labels:
"""
pipe = pipeline(task="text-generation",
                model=model_pretrained,
                tokenizer=tokenizer,
                max_new_tokens=50,
                temperature=0.1,
                pad_token_id=tokenizer.eos_token_id)
result = pipe(prompt)
# extract_labels is defined below; define it before running this line.
print(extract_labels(result[0]['generated_text']))
# extract_labels returns only the list of labels from the generated text,
# in case the model emits extra text around it.
import ast
import re

def extract_labels(output_text):
    """
    Extracts the list of labels from the generated text.

    Args:
        output_text (str): The raw output text from the model.
    Returns:
        list: A list of labels if found, otherwise an empty list.
    """
    try:
        # Find the content after "Labels:" and extract the list
        match = re.search(r"### Labels:\s*(\[.*?\])", output_text)
        if match:
            # Parse the list literal safely (ast.literal_eval instead of eval)
            labels = ast.literal_eval(match.group(1))
            if isinstance(labels, list):
                return labels
    except Exception as e:
        print(f"Error extracting labels: {e}")
    # Return an empty list if extraction fails
    return []
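The prompt format and the label extraction can be exercised without loading the model. The sketch below is a minimal illustration under stated assumptions: `build_prompt` and `parse_labels` are hypothetical helper names that are not part of this repository, and `fake_output` is a hand-written stand-in for generated text, not an actual model generation. Parsing goes through `ast.literal_eval`, so only Python literals are accepted.

```python
import ast
import re

# Prompt layout copied from the Direct Use example.
PROMPT_TEMPLATE = (
    "Given the persona give the associated labels:\n"
    "### Persona:\n"
    "{persona}\n"
    "### Labels:\n"
)


def build_prompt(persona: str) -> str:
    """Fill a persona description into the prompt layout (hypothetical helper)."""
    return PROMPT_TEMPLATE.format(persona=persona.strip())


def parse_labels(output_text: str) -> list:
    """Return the first bracketed list after '### Labels:', or [] on failure."""
    match = re.search(r"### Labels:\s*(\[.*?\])", output_text, re.DOTALL)
    if not match:
        return []
    try:
        labels = ast.literal_eval(match.group(1))
    except (ValueError, SyntaxError):
        return []
    return labels if isinstance(labels, list) else []


prompt = build_prompt("A high school physics teacher who runs a robotics club.")
# Hand-written stand-in for pipe(prompt)[0]["generated_text"]; not real model output.
fake_output = prompt + '["Education", "Physics", "Robotics"]\nsome trailing text'
print(parse_labels(fake_output))
print(parse_labels(prompt + "no list here"))
```

Returning an empty list on any parse failure keeps downstream code simple when the model produces text that is not a well-formed list.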
Training Details
Training Procedure
The model was fine-tuned with LoRA adapters, which keeps training efficient. Below are the hyperparameters used:
training_arguments = TrainingArguments(
    output_dir=output_dir,
    num_train_epochs=3,
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,
    optim="paged_adamw_32bit",
    logging_steps=10,
    learning_rate=2e-4,
    fp16=True,
    bf16=False,
    max_grad_norm=0.3,
    # max_steps=-1,
    warmup_steps=7,
    group_by_length=False,
    lr_scheduler_type="cosine",
    report_to="wandb",
    eval_strategy="steps",
    eval_steps=0.2
)
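Read together, a per-device batch size of 1 with 8 gradient-accumulation steps gives an effective batch of 8 sequences per optimizer update. The sketch below works through the resulting step counts under two assumptions the card does not state: a single GPU, and all 10,000 records used for training (a held-out evaluation split would lower the numbers). A fractional `eval_steps` is interpreted by `TrainingArguments` as a ratio of the total training steps.

```python
import math

# Values taken from the TrainingArguments above.
per_device_train_batch_size = 1
gradient_accumulation_steps = 8
num_train_epochs = 3
eval_steps_ratio = 0.2

# Assumptions, not stated in the card.
num_gpus = 1
num_train_records = 10_000

# Sequences consumed per optimizer update.
effective_batch = per_device_train_batch_size * gradient_accumulation_steps * num_gpus
# Optimizer updates per pass over the data, and over the whole run.
steps_per_epoch = math.ceil(num_train_records / effective_batch)
total_steps = steps_per_epoch * num_train_epochs
# Evaluation interval implied by the fractional eval_steps.
eval_every = math.ceil(total_steps * eval_steps_ratio)

print(effective_batch, steps_per_epoch, total_steps, eval_every)
```

Under these assumptions the run would evaluate roughly five times, once per 20% of training.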
Hardware
- Trained on Google Colab using a T4 GPU
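The `fp16=True, bf16=False` setting in the training arguments is consistent with this hardware: the T4 is a Turing-generation GPU (compute capability 7.5) without native bfloat16 support, which generally arrives with Ampere (compute capability 8.0) and newer. A minimal sketch of choosing the flags from the device capability follows; `precision_flags` is a hypothetical helper and not part of the original training script.

```python
def precision_flags(compute_capability: tuple) -> dict:
    """Pick mixed-precision flags from a CUDA compute capability.

    Assumption: GPUs with major version >= 8 (Ampere or newer) handle bf16
    natively; older ones, such as the T4 at (7, 5), fall back to fp16.
    """
    major, _minor = compute_capability
    use_bf16 = major >= 8
    return {"fp16": not use_bf16, "bf16": use_bf16}


# With PyTorch and a GPU available, the capability could be read with
# torch.cuda.get_device_capability(); the tuples here are hard-coded.
print(precision_flags((7, 5)))  # T4
print(precision_flags((8, 0)))  # A100
```

The returned dict could be unpacked into the `TrainingArguments` constructor alongside the other hyperparameters.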