	---
	language:
	- en
	tags:
	- mistral
	- lora
	- adapter
	- fine-tuned
	- politics
	- conversational
	license: mit
	datasets:
	- rohanrao/joe-biden-tweets
	- christianlillelund/joe-biden-2020-dnc-speech
	---

	# Biden Mistral Adapter

	This is a LoRA adapter for the [Mistral-7B-Instruct-v0.2](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2) model, fine-tuned to emulate Joe Biden's distinctive speaking style, discourse patterns, and policy positions.

	## Model Details

	- Base Model: [mistralai/Mistral-7B-Instruct-v0.2](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2)
	- Model Type: LoRA adapter (Low-Rank Adaptation)
	- LoRA Rank: 16
	- Language: English
	- Training Focus: Emulation of Joe Biden's communication style and response patterns

	## Intended Use

	This model is designed for:
	- Educational and research purposes related to political discourse and communication styles
	- Interactive simulations for understanding political rhetoric
	- Creative applications exploring political communication

	## Training Data

	This adapter was fine-tuned on two key datasets:
	- [Biden tweets dataset (2007-2020)](https://www.kaggle.com/datasets/rohanrao/joe-biden-tweets)
	- [Biden 2020 DNC speech dataset](https://www.kaggle.com/datasets/christianlillelund/joe-biden-2020-dnc-speech)

	These datasets were processed into an instruction format before fine-tuning.
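
The template itself is not shown in this card. As a rough illustration only, assuming the single-turn Mistral `[INST] ... [/INST]` layout that the Usage section relies on, one training record might be built as sketched below. The `format_example` helper, the `text` field name, and the placeholder strings are hypothetical and are not taken from the actual preprocessing code.

```python
def format_example(instruction: str, response: str, eos_token: str = "</s>") -> dict:
    """Wrap one instruction/response pair in a Mistral-style instruct template.

    Assumptions: records are single-turn, and the tokenizer adds the leading
    BOS token itself, so it is not written into the string here.
    """
    instruction = " ".join(instruction.split())  # collapse stray whitespace and newlines
    response = response.strip()
    return {"text": f"[INST] {instruction} [/INST] {response}{eos_token}"}


# Placeholder pair for illustration; not real rows from either dataset.
record = format_example(
    "Respond to a question   about the economy.",
    "  Placeholder response drawn from a tweet or speech passage.\n",
)
print(record["text"])
```

Appending the EOS token to each record is a common choice so the model learns where a response ends; whether the original pipeline did this is not stated.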

	## Training Procedure

	- Framework: Hugging Face Transformers and PEFT
	- Optimization: 4-bit quantization for memory efficiency
	- LoRA Configuration:
	  - `r=16`
	  - `lora_alpha=64`
	  - `lora_dropout=0.05`
	  - Target modules: `q_proj`, `k_proj`, `v_proj`, `o_proj`, `gate_proj`, `up_proj`, `down_proj`
	- Training Parameters:
	  - Batch size: 4
	  - Gradient accumulation steps: 4
	  - Learning rate: 2e-4
	  - Epochs: 3
	  - Learning rate scheduler: cosine
	  - Optimizer: `paged_adamw_8bit`
	  - BF16 precision
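
A few quantities follow directly from the settings listed above. The sketch below works them out in plain Python: the effective batch size per optimizer step, the LoRA scaling factor `lora_alpha / r`, and an estimate of the adapter's trainable parameter count. The layer dimensions are assumptions taken from the publicly documented Mistral-7B architecture (hidden size 4096, MLP size 14336, 32 layers, 8 key/value heads of dimension 128), not from this card, and the batch size is assumed to be per device on a single GPU.

```python
# Values stated in the training procedure.
r = 16
lora_alpha = 64
batch_size = 4            # assumed to be the per-device batch size
grad_accum_steps = 4

# Assumed Mistral-7B dimensions (from the public base-model config, not this card).
hidden = 4096
mlp = 14336
kv_dim = 8 * 128          # 8 key/value heads x head dimension 128
num_layers = 32

# (in_features, out_features) of each targeted linear layer.
target_shapes = {
    "q_proj": (hidden, hidden),
    "k_proj": (hidden, kv_dim),
    "v_proj": (hidden, kv_dim),
    "o_proj": (hidden, hidden),
    "gate_proj": (hidden, mlp),
    "up_proj": (hidden, mlp),
    "down_proj": (mlp, hidden),
}

# LoRA adds A (r x in) and B (out x r) per layer, i.e. r * (in + out) weights.
params_per_block = sum(r * (i + o) for i, o in target_shapes.values())
trainable_params = params_per_block * num_layers

effective_batch = batch_size * grad_accum_steps  # sequences per optimizer step, one GPU
scaling = lora_alpha / r                         # multiplier applied to the LoRA update

print(f"effective batch size: {effective_batch}")
print(f"LoRA scaling (alpha / r): {scaling}")
print(f"trainable adapter parameters: {trainable_params:,}")
```

Under these assumptions the adapter holds roughly 42 million trainable weights, well under 1% of the roughly 7-billion-parameter base model.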

	## Limitations and Biases

	- The model is designed to mimic a speaking style and may not always provide factually accurate information
	- While it emulates Biden's rhetoric, it does not represent his actual views or statements
	- The model may reproduce biases present in the training data
	- Not suitable, without retrieval-augmented generation (RAG), for production applications that require factual accuracy

	## Usage

	This adapter should be applied to the Mistral-7B-Instruct-v0.2 base model:

	```python
	from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
	from peft import PeftModel
	import torch

	# Load base model with 4-bit quantization
	base_model_id = "mistralai/Mistral-7B-Instruct-v0.2"
	bnb_config = BitsAndBytesConfig(
	    load_in_4bit=True,
	    bnb_4bit_compute_dtype=torch.float16,
	    bnb_4bit_quant_type="nf4",
	    bnb_4bit_use_double_quant=True,
	)

	# Load model and tokenizer
	model = AutoModelForCausalLM.from_pretrained(
	    base_model_id,
	    quantization_config=bnb_config,
	    device_map="auto",
	    torch_dtype=torch.float16,
	)
	tokenizer = AutoTokenizer.from_pretrained(base_model_id)

	# Apply the adapter
	model = PeftModel.from_pretrained(model, "nnat03/biden-mistral-adapter")

	# Generate a response. The tokenizer adds the BOS token (<s>) itself,
	# so it is not repeated in the prompt string.
	prompt = "What's your vision for America's future?"
	input_text = f"[INST] {prompt} [/INST]"
	# Send inputs to the device chosen by device_map="auto"
	inputs = tokenizer(input_text, return_tensors="pt").to(model.device)
	# max_new_tokens limits only the generated text, not prompt plus output
	outputs = model.generate(**inputs, max_new_tokens=512, temperature=0.7, do_sample=True)
	response = tokenizer.decode(outputs[0], skip_special_tokens=True)
	# Keep only the text after the instruction block
	print(response.split("[/INST]")[-1].strip())
	```

	## Citation and Acknowledgments

	If you use this model in your research, please cite:

	@misc{nnat03-biden-mistral-adapter,
	  author = {nnat03},
	  title = {Biden Mistral Adapter},
	  year = {2023},
	  publisher = {Hugging Face},
	  howpublished = {\url{https://huggingface.co/nnat03/biden-mistral-adapter}}
	}


	## Ethical Considerations

	This model was created for educational and research purposes. It attempts to mimic the speaking style of a public figure but does not represent his actual views or statements. Use it responsibly.