Commit 45850ca by lars1234 (verified) · 1 parent: 5c496b4

Update README.md

Files changed (1)
  1. README.md +18 -5
README.md CHANGED
@@ -14,11 +14,24 @@ Mistral-Small-24B-Instruct-2501-writer is a fine-tuned version of `mistralai/Mis

  The following table was generated by creating 568 stories based on the same prompts as in the [lars1234/story_writing_benchmark](https://huggingface.co/datasets/lars1234/story_writing_benchmark) dataset and then evaluating them using the benchmark's evaluator models.

- | Model | Average | Grammar & Spelling | Clarity | Logical Connection | Scene Construction | Internal Consistency | Character Consistency | Character Motivation | Sentence Variety | Avoiding Clichés | Natural Dialogue | Avoiding Tropes | Character Depth | Character Interactions | Reader Interest | Plot Resolution |
- |-------|---------|-------------------|---------|-------------------|-------------------|---------------------|----------------------|---------------------|-----------------|----------------|-----------------|----------------|----------------|----------------------|----------------|-----------------|
- | Mistral-2501 | 49.3% | 82.1% | 63.0% | 57.7% | 56.1% | 67.2% | 50.7% | 44.6% | 57.7% | 24.6% | 42.9% | 28.6% | 35.7% | 45.0% | 54.1% | 35.3% |
- | Mistral-Writer | **56.5%** | 83.3% | 64.1% | 64.1% | 62.0% | 73.1% | 54.0% | **49.8%** | **64.4%** | **33.3%** | **51.9%** | 37.4% | **46.4%** | **52.0%** | **63.1%** | **45.3%** |
- | Gemma-Ataraxy | 56.1% | **88.8%** | **65.8%** | **66.0%** | **64.1%** | **75.1%** | **54.3%** | 49.2% | 64.0% | 31.2% | 48.3% | **40.0%** | 45.4% | 51.7% | 63.0% | 44.9% |
+ | Metric | Mistral-2501 | Mistral-Writer | Gemma-Ataraxy |
+ |-------|---------|-------------------|---------|
+ | Grammar & Spelling | 82.1% | 83.3% | **88.8%** |
+ | Clarity | 63.0% | 64.1% | **65.8%** |
+ | Logical Connection | 57.7% | 64.1% | **66.0%** |
+ | Scene Construction | 56.1% | 62.0% | **64.1%** |
+ | Internal Consistency | 67.2% | 73.1% | **75.1%** |
+ | Character Consistency | 50.7% | 54.0% | **54.3%** |
+ | Character Motivation | 44.6% | **49.8%** | 49.2% |
+ | Sentence Variety | 57.7% | **64.4%** | 64.0% |
+ | Avoiding Clichés | 24.6% | **33.3%** | 31.2% |
+ | Natural Dialogue | 42.9% | **51.9%** | 48.3% |
+ | Avoiding Tropes | 28.6% | 37.4% | **40.0%** |
+ | Character Depth | 35.7% | **46.4%** | 45.4% |
+ | Character Interactions | 45.0% | **52.0%** | 51.7% |
+ | Reader Interest | 54.1% | **63.1%** | 63.0% |
+ | Plot Resolution | 35.3% | **45.3%** | 44.9% |
+ | Average | 49.3% | **56.5%** | 56.1% |

  Mistral-Small-24B-Instruct-2501-writer outperforms the base Mistral model across all metrics. Gemma-2-Ataraxy still shows higher creativity in some categories, as seen for example in its better score on "Avoiding Tropes."
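
The two claims in the closing paragraph can be checked directly against the table. The following is a minimal sketch in Python with the scores copied from the table; the `SCORES` layout and the helper names `writer_beats_base` and `leaders` are illustrative assumptions and not part of the benchmark's code.

```python
# Scores in percent, copied from the README table, ordered as
# (Mistral-2501, Mistral-Writer, Gemma-Ataraxy). The layout is illustrative only.
MODELS = ("Mistral-2501", "Mistral-Writer", "Gemma-Ataraxy")

SCORES = {
    "Grammar & Spelling": (82.1, 83.3, 88.8),
    "Clarity": (63.0, 64.1, 65.8),
    "Logical Connection": (57.7, 64.1, 66.0),
    "Scene Construction": (56.1, 62.0, 64.1),
    "Internal Consistency": (67.2, 73.1, 75.1),
    "Character Consistency": (50.7, 54.0, 54.3),
    "Character Motivation": (44.6, 49.8, 49.2),
    "Sentence Variety": (57.7, 64.4, 64.0),
    "Avoiding Clichés": (24.6, 33.3, 31.2),
    "Natural Dialogue": (42.9, 51.9, 48.3),
    "Avoiding Tropes": (28.6, 37.4, 40.0),
    "Character Depth": (35.7, 46.4, 45.4),
    "Character Interactions": (45.0, 52.0, 51.7),
    "Reader Interest": (54.1, 63.1, 63.0),
    "Plot Resolution": (35.3, 45.3, 44.9),
    "Average": (49.3, 56.5, 56.1),
}


def writer_beats_base(scores):
    """Return True if Mistral-Writer scores above Mistral-2501 on every row."""
    return all(writer > base for base, writer, _ in scores.values())


def leaders(scores):
    """Map each model name to the rows on which it has the highest score."""
    result = {name: [] for name in MODELS}
    for metric, row in scores.items():
        best = max(range(len(row)), key=lambda i: row[i])
        result[MODELS[best]].append(metric)
    return result


if __name__ == "__main__":
    print("Writer beats base on every row:", writer_beats_base(SCORES))
    for model, metrics in leaders(SCORES).items():
        print(f"{model} leads on {len(metrics)} rows: {metrics}")
```

The rows each model leads on correspond to the bolded cells in the table, so the same helper can be used to re-derive the bolding if the scores are updated.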