Finetune Qwen2.5 14b? #2
opened by ElvisM
For 16GB VRAM cards, I think this one is probably SOTA for story writing, mostly because, unlike Llama and Mistral, it handles long context pretty well. Any plans to fine-tune it one day? It would be very appreciated. I think I speak for most people when I say I'm kind of tired of having problems once the context reaches 16k tokens.
This is up next. Very impressed with the Qwen 2.5 models; however, I will first try to emulate this approach (and Brainstorm too):
https://huggingface.co/DavidAU/DeepSeek-Grand-Horror-SMB-R1-Distill-Llama-3.1-16B-GGUF
This takes only the "reasoning/thinking" parts of Deepseek and connects them to the core model.
The core model is fully retained, and augmented with Deepseek's tech.
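For anyone curious how a merge like that is typically wired up, a rough mergekit passthrough sketch might look like the following. The model names and layer ranges here are illustrative guesses, not the actual recipe used for the Grand Horror merge:

```yaml
# Hypothetical mergekit passthrough sketch: keep the core model intact
# and graft a slice of a DeepSeek R1 distill's layers on top of it.
# Model names and layer ranges below are placeholders, not the real recipe.
slices:
  - sources:
      - model: meta-llama/Llama-3.1-8B-Instruct          # core model, fully retained
        layer_range: [0, 32]
  - sources:
      - model: deepseek-ai/DeepSeek-R1-Distill-Llama-8B  # "reasoning" layers grafted on
        layer_range: [28, 32]
merge_method: passthrough
dtype: bfloat16
```

A passthrough merge like this simply stacks the selected layer slices, which is why the core model's weights survive unchanged while the distill's layers add the extra behavior.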