---
language:
- en
license: apache-2.0
tags:
- merge
model-index:
- name: NeuralHermes-MoE-2x7B
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: AI2 Reasoning Challenge (25-Shot)
      type: ai2_arc
      config: ARC-Challenge
      split: test
      args:
        num_few_shot: 25
    metrics:
    - type: acc_norm
      value: 62.12
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=ibndias/NeuralHermes-MoE-2x7B
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: HellaSwag (10-Shot)
      type: hellaswag
      split: validation
      args:
        num_few_shot: 10
    metrics:
    - type: acc_norm
      value: 84.21
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=ibndias/NeuralHermes-MoE-2x7B
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MMLU (5-Shot)
      type: cais/mmlu
      config: all
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 64.56
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=ibndias/NeuralHermes-MoE-2x7B
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: TruthfulQA (0-shot)
      type: truthful_qa
      config: multiple_choice
      split: validation
      args:
        num_few_shot: 0
    metrics:
    - type: mc2
      value: 43.61
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=ibndias/NeuralHermes-MoE-2x7B
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: Winogrande (5-shot)
      type: winogrande
      config: winogrande_xl
      split: validation
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 78.14
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=ibndias/NeuralHermes-MoE-2x7B
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: GSM8k (5-shot)
      type: gsm8k
      config: main
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 51.86
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=ibndias/NeuralHermes-MoE-2x7B
      name: Open LLM Leaderboard
---
# NeuralHermes-MoE-2x7B
NeuralHermes-MoE-2x7B is a Mixture of Experts (MoE) merge of teknium/OpenHermes-2.5-Mistral-7B and Intel/neural-chat-7b-v3-3, using mistralai/Mistral-7B-v0.1 as the base model. The merge was produced with `mergekit`.
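The exact merge configuration is not published in this card. For reference, a `mergekit-moe` config for a 2x7B merge of these experts typically looks like the sketch below; the `gate_mode` and `positive_prompts` values are illustrative assumptions, not the settings used for this model.
```yaml
# Hypothetical mergekit-moe config (illustrative only; not the config used for this model).
base_model: mistralai/Mistral-7B-v0.1
gate_mode: hidden            # assumption: initialize router gates from hidden-state activations of the prompts
dtype: bfloat16
experts:
  - source_model: teknium/OpenHermes-2.5-Mistral-7B
    positive_prompts:        # placeholder prompts routing general chat/instruction requests to this expert
      - "You are a helpful assistant."
  - source_model: Intel/neural-chat-7b-v3-3
    positive_prompts:        # placeholder prompts routing reasoning-style requests to this expert
      - "Solve the following problem step by step."
```
With a config like this, `mergekit-moe config.yaml ./output-dir` writes the merged 2x7B checkpoint to `output-dir`.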
# Getting Started
```python
import torch
from transformers import pipeline

# bfloat16 plus device_map="auto" keeps memory usage manageable and places the model automatically.
pipe = pipeline(
    "text-generation",
    model="ibndias/NeuralHermes-MoE-2x7B",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

prompt = """<|system|> You are a helpful assistant.
<|user|>
Write me bash script to scan ip 192.3.1.4 with nmap only port that ends with 9 from 1-100.
<|assistant|>
"""

outputs = pipe(prompt, max_new_tokens=512, do_sample=True, temperature=0.2, top_k=50, top_p=0.95)
print(outputs[0]["generated_text"])
```
Output:
><|system|> You are a helpful assistant.
><|user|>
>Write me bash script to scan ip 192.3.1.4 with nmap only port that ends with 9 from 1-100.
><|assistant|>
>Sure, here's a bash script that scans the specified IP address with nmap for open ports that end with 9 from 1 to 100:
>```
>#!/bin/bash
>IP_ADDRESS="192.3.1.4"
>START_PORT=1
>END_PORT=100
>for ((i=$START_PORT; i<=$END_PORT; i++)); do
> PORT=$i
> if [[ $PORT % 10 == 9 ]]; then
> nmap -p $PORT $IP_ADDRESS
> fi
>done
>```
>Save the script with a.sh extension (e.g., scan_ports.sh) and make it executable by running `chmod +x scan_ports.sh`. Then, run the script by executing `./scan_ports.sh`.
>...
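The example above hard-codes the `<|system|>`/`<|user|>`/`<|assistant|>` tags. If the repository's tokenizer defines a chat template (an assumption, not verified here), the same prompt can instead be built with `tokenizer.apply_chat_template`, which avoids formatting mistakes:
```python
import torch
from transformers import AutoTokenizer, pipeline

model_id = "ibndias/NeuralHermes-MoE-2x7B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
pipe = pipeline("text-generation", model=model_id, tokenizer=tokenizer,
                torch_dtype=torch.bfloat16, device_map="auto")

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Write me a bash script to scan ip 192.3.1.4 with nmap, only ports ending in 9 from 1-100."},
]

# Render the messages with the tokenizer's chat template (requires the repo to define one),
# leaving the assistant turn open for generation.
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

outputs = pipe(prompt, max_new_tokens=512, do_sample=True, temperature=0.2, top_k=50, top_p=0.95)
print(outputs[0]["generated_text"])
```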
# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_ibndias__NeuralHermes-MoE-2x7B).
| Metric |Value|
|---------------------------------|----:|
|Avg. |64.08|
|AI2 Reasoning Challenge (25-Shot)|62.12|
|HellaSwag (10-Shot) |84.21|
|MMLU (5-Shot) |64.56|
|TruthfulQA (0-shot) |43.61|
|Winogrande (5-shot) |78.14|
|GSM8k (5-shot) |51.86|