	---
	tags:
	- generated_from_trainer
	model-index:
	- name: Leyley-13B-LoRA
	  results: []
	---


	[<img src="https://cdn-uploads.huggingface.co/production/uploads/63ab1241ad514ca8d1430003/6gPGJuqNbLXk9mhrvkXo2.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/OpenAccess-AI-Collective/axolotl)
	# Leyley-13B-lora
	Training runs required to find a usable one (Oh brother...): [1](https://wandb.ai/undis95/leyleytest?workspace=user-undis95) - [2](https://wandb.ai/undis95/leyleytest2?workspace=user-undis95) - [3](https://wandb.ai/undis95/leyleytest2-noro?workspace=user-undis95) - [4](https://wandb.ai/undis95/leyleytest3-noro?workspace=user-undis95) - [5](https://wandb.ai/undis95/leyleytest4-noro?workspace=user-undis95)

	This LoRA was trained from scratch on top of [Noromaid](https://huggingface.co/NeverSleep/Noromaid-13b-v0.1.1) using a [custom dataset](https://github.com/Undi95/somethingdata) based on the game "The Coffin of Andy and Leyley".

	It achieves the following results on the evaluation set:
	- Loss: 1.1214
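
For readers who prefer perplexity to raw loss, the conversion is a plain exponential. This is a minimal sketch that assumes the reported loss is the mean per-token cross-entropy in nats, which is the usual Trainer convention but is not stated in this card.

```python
import math

# Evaluation loss reported above.
# Assumption: mean per-token cross-entropy, measured in nats.
eval_loss = 1.1214

# Perplexity is the exponential of the cross-entropy loss.
perplexity = math.exp(eval_loss)
print(f"eval perplexity: {perplexity:.2f}")
```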

	## Model description

	LoRA of Andrew and Ashley from the game.

	The dataset contains only conversations between the two of them, and the AI replies as Ashley.

	It was trained so that you speak as her brother, but this can be changed by applying the LoRA at a lower weight, or by using a custom system prompt or a custom character card.
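
One possible way to apply the adapter in Python is with the `transformers` and `peft` libraries. The sketch below is not the author's own loading script: `ADAPTER_PATH` is a hypothetical placeholder for wherever this LoRA lives (a local folder or a Hub repo id), and `device_map="auto"` assumes the `accelerate` package is installed.

```python
BASE_MODEL_ID = "NeverSleep/Noromaid-13b-v0.1.1"  # base model linked in this card
ADAPTER_PATH = "path/to/Leyley-13B-Lora"  # hypothetical placeholder, replace with your own path


def load_model_with_lora(base_model_id=BASE_MODEL_ID, adapter_path=ADAPTER_PATH):
    """Load the base model and attach the LoRA adapter on top of it (sketch)."""
    # Imported lazily so this file can be read and imported without
    # the heavy dependencies being installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import PeftModel

    tokenizer = AutoTokenizer.from_pretrained(base_model_id)
    # device_map="auto" needs the `accelerate` package (assumption about your setup).
    base_model = AutoModelForCausalLM.from_pretrained(base_model_id, device_map="auto")
    # Wrap the base model with the adapter weights.
    model = PeftModel.from_pretrained(base_model, adapter_path)
    return tokenizer, model
```

Calling `load_model_with_lora()` downloads the 13B base model, so expect a long first run and a large memory footprint.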

	## Prompt template
	```
	### Instruction:
	You are Ashley Graves, sociopathic, brother-obsessed sister of Andrew Graves. In the following chat, you will talk with Andrew. Andrew called you Leyley as a child, and you called him Andy. Andrew does not like being called Andy.

	Andrew: {prompt}

	### Response:
	Ashley:

	### Input:
	Andrew: {input}
	```

	Or

	```
	### Instruction:
	You are Ashley Graves. In the following chat, you will talk with {{user}}.

	{prompt}

	### Response:

	### Input:
	{input}
	```
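
If you build prompts in code instead of a chat frontend, the first template can be filled with plain string replacement. This is only a sketch: the helper name `build_prompt` is made up for illustration, and the card does not say how `{prompt}` and `{input}` map to chat history versus the latest message, so the split in the usage example is an assumption.

```python
# First template from this card, copied verbatim.
# {prompt} and {input} are the two slots to fill.
TEMPLATE = (
    "### Instruction:\n"
    "You are Ashley Graves, sociopathic, brother-obsessed sister of Andrew Graves. "
    "In the following chat, you will talk with Andrew. "
    "Andrew called you Leyley as a child, and you called him Andy. "
    "Andrew does not like being called Andy.\n"
    "\n"
    "Andrew: {prompt}\n"
    "\n"
    "### Response:\n"
    "Ashley:\n"
    "\n"
    "### Input:\n"
    "Andrew: {input}"
)


def build_prompt(prompt: str, user_input: str) -> str:
    """Fill the two slots of the template (hypothetical helper)."""
    return TEMPLATE.replace("{prompt}", prompt).replace("{input}", user_input)


# Usage: the earlier line goes in {prompt}, the newest line in {input} (assumed split).
text = build_prompt("Hello, Ashley.", "What are we doing today?")
print(text)
```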
	## Recommended settings

	![image/png](https://cdn-uploads.huggingface.co/production/uploads/63ab1241ad514ca8d1430003/5-cDhTqMaa83mtsCnwJgC.png)

	Or

	![image/png](https://cdn-uploads.huggingface.co/production/uploads/63ab1241ad514ca8d1430003/U9l1-hqLgj6NuHWkXN_mG.png)

	Also, you HAVE to deactivate this:

	![image/png](https://cdn-uploads.huggingface.co/production/uploads/63ab1241ad514ca8d1430003/huTSJU8QOYWg-DYKfkR6J.png)

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 2.5e-07
	- train_batch_size: 2
	- eval_batch_size: 2
	- seed: 42
	- gradient_accumulation_steps: 2
	- total_train_batch_size: 4
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: constant
	- lr_scheduler_warmup_steps: 10
	- num_epochs: 50
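
The `total_train_batch_size` value follows from the per-device batch size and the gradient accumulation steps. A quick check, assuming a single training device (the device count is not listed above, it is inferred from the total of 4):

```python
train_batch_size = 2             # per-device batch size, from the list above
gradient_accumulation_steps = 2  # from the list above
num_devices = 1                  # assumption: not listed, inferred from the total of 4

# Number of samples contributing to each optimizer step.
total_train_batch_size = train_batch_size * gradient_accumulation_steps * num_devices
print(total_train_batch_size)  # 4
```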

	### Training results

	| Training Loss | Epoch | Step | Validation Loss |
	|:-------------:|:-----:|:----:|:---------------:|
	| 1.8362 | 0.03 | 1 | 1.7488 |
	| 2.035 | 2.46 | 80 | 1.6462 |
	| 1.5489 | 4.92 | 160 | 1.4901 |
	| 1.4392 | 7.38 | 240 | 1.3567 |
	| 1.2196 | 9.85 | 320 | 1.2475 |
	| 1.3219 | 12.31 | 400 | 1.2089 |
	| 1.2171 | 14.77 | 480 | 1.1870 |
	| 1.1686 | 17.23 | 560 | 1.1730 |
	| 1.1506 | 19.69 | 640 | 1.1615 |
	| 1.1829 | 22.15 | 720 | 1.1513 |
	| 1.267 | 24.62 | 800 | 1.1454 |
	| 1.0857 | 27.08 | 880 | 1.1367 |
	| 1.0795 | 29.54 | 960 | 1.1345 |
	| 1.0453 | 32.0 | 1040 | 1.1317 |
	| 1.2093 | 34.46 | 1120 | 1.1283 |
	| 1.1442 | 36.92 | 1200 | 1.1253 |
	| 0.966 | 39.38 | 1280 | 1.1239 |
	| 0.9576 | 41.85 | 1360 | 1.1227 |
	| 1.0146 | 44.31 | 1440 | 1.1222 |
	| 1.0243 | 46.77 | 1520 | 1.1213 |
	| 1.0192 | 49.23 | 1600 | 1.1214 |
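
The table can be read programmatically to see where validation loss bottoms out and roughly how many optimizer steps make up one epoch. The rows below are copied from the table (epoch, step, validation loss); the steps-per-epoch figure is only an estimate derived from the last logged row.

```python
# (epoch, step, validation_loss) copied from the training results table.
rows = [
    (0.03, 1, 1.7488), (2.46, 80, 1.6462), (4.92, 160, 1.4901),
    (7.38, 240, 1.3567), (9.85, 320, 1.2475), (12.31, 400, 1.2089),
    (14.77, 480, 1.1870), (17.23, 560, 1.1730), (19.69, 640, 1.1615),
    (22.15, 720, 1.1513), (24.62, 800, 1.1454), (27.08, 880, 1.1367),
    (29.54, 960, 1.1345), (32.0, 1040, 1.1317), (34.46, 1120, 1.1283),
    (36.92, 1200, 1.1253), (39.38, 1280, 1.1239), (41.85, 1360, 1.1227),
    (44.31, 1440, 1.1222), (46.77, 1520, 1.1213), (49.23, 1600, 1.1214),
]

# Row with the lowest validation loss.
best_epoch, best_step, best_loss = min(rows, key=lambda r: r[2])

# Last logged row, which matches the loss reported at the top of the card.
last_epoch, last_step, last_loss = rows[-1]

# Rough estimate of optimizer steps per epoch from the last logged row.
steps_per_epoch = last_step / last_epoch

print(f"best: step {best_step}, loss {best_loss}")
print(f"last: step {last_step}, loss {last_loss}")
print(f"approx. steps per epoch: {steps_per_epoch:.1f}")
```

The curve is nearly flat over the last few hundred steps: the minimum (step 1520) and the final logged value (step 1600) differ by only 0.0001.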


	### Framework versions

	- Transformers 4.36.0.dev0
	- Pytorch 2.0.1+cu118
	- Datasets 2.14.7
	- Tokenizers 0.15.0