Update README.md
README.md
CHANGED
@@ -10,7 +10,7 @@ license: llama2
 - Model quality as per my own ad-hoc testing: really good
 - A 70b version might be on the way soon.
 - Ko-fi link (yes this is a very important "detail at a glance" lol): [https://ko-fi.com/heralax](https://ko-fi.com/heralax)
-- Substack link (
+- Substack link [here](https://promptingweekly.substack.com/p/human-sourced-ai-augmented-a-promising) (also *highly* important, but no joke I actually wrote about the data generation process for the predecessor of this model on there, so it's kinda relevant. Kinda.)
 
 ## Long-form description and essay
 The great issue with model training is often the dataset. Model creators can only do so much filtering of the likes of Bluemoon and PIPPA, and in order to advance beyond the quality these can offer, model creators often have to pick through their own chats with bots, manually edit them to be better, and save them -- essentially creating a dataset from scratch. But model creators are not annotators, nor should they be. Manual work isn't scalable, it isn't fun, and it often isn't shareable (because people, sensibly, don't want to share the NSFL chats they have as public data).