---
license: mit
---

# Mit-ThinkDeeply-0.5B-GGUF

## Model Description

**Mit-ThinkDeeply** is the advanced version of the Mit series of large language models (LLMs) developed by WinkingFace. Built upon the robust foundation of the Mit base model, **Mit-ThinkDeeply** introduces enhanced reasoning capabilities, superior contextual understanding, and refined function-calling precision. This model is designed to seamlessly integrate intuitive conversational abilities with advanced multi-step reasoning, making it ideal for complex analytical tasks, structured problem-solving, and high-stakes decision-making.

Key features of **Mit-ThinkDeeply** include:

- **Advanced Reasoning**: Capable of generating long chains of thought to deeply analyze problems and provide well-reasoned solutions.
- **Enhanced Contextual Awareness**: Improved ability to maintain coherence across multi-turn conversations and long-form interactions.
- **Function Calling Precision**: Optimized for reliable and accurate execution of tool calls, enabling seamless integration with external APIs and services.
- **Versatile Use Cases**: Adaptable to both standard conversational tasks and complex reasoning scenarios, including mathematical problem-solving, code generation, and structured output generation.
- **Long Context Support**: Context lengths of up to 128K tokens on the 7B variant (the smaller variants, including this 0.5B model, support 32K; see the table below), ensuring robust performance in applications requiring extensive input data.

**Mit-ThinkDeeply** has undergone extensive architectural refinement and fine-tuning to align more effectively with real-world applications. Our training process emphasizes deeper contextual awareness, enhanced response coherence, and improved function-calling execution, making **Mit-ThinkDeeply** a powerful and versatile AI system.

## Quickstart

We recommend using our customized llama.cpp build:

```bash
git clone https://github.com/WinkingFaceAI/lmc-recooked.git
```

In the following demonstration, we assume that you are running commands inside the `lmc-recooked` repository.

Since cloning the entire model repo may be inefficient, you can manually download just the GGUF file you need, or use `huggingface-cli`:

1. Install `huggingface_hub`:
```shell
pip install -U huggingface_hub
```
2. Download the file:
```shell
huggingface-cli download WinkingFace/Mit-ThinkDeeply-0.5B-gguf Mit-ThinkDeeply-0.5B-q8_0.gguf --local-dir . --local-dir-use-symlinks False
```
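
If you prefer to stay in Python, the same file can be fetched with the `huggingface_hub` API. A minimal sketch, assuming the repo id and filename from the command above:

```python
# Download the quantized model with huggingface_hub instead of the CLI.
# repo_id and filename match the huggingface-cli command above.
from huggingface_hub import hf_hub_download

model_path = hf_hub_download(
    repo_id="WinkingFace/Mit-ThinkDeeply-0.5B-gguf",
    filename="Mit-ThinkDeeply-0.5B-q8_0.gguf",
    local_dir=".",
)
print(model_path)  # local path to the GGUF file
```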

For a chatbot-like experience, start `llama-cli` in conversation mode:

```shell
./llama-cli -m <gguf-file-path> \
    -co -cnv -p "You are Mit, created by WinkingFace. You are a deep thinking AI, capable of using extremely long chains of thought to deeply consider the problem and deliberate via systematic reasoning processes. Enclose your thoughts and internal monologue inside <think_deeply> </think_deeply> tags, and then provide your solution or response to the problem." \
    -fa -ngl 80 -n 512
```
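
The GGUF file also works with generic llama.cpp bindings. As an illustration, here is a minimal sketch using the third-party `llama-cpp-python` package (not part of this repo; install with `pip install llama-cpp-python`), mirroring the system prompt from the `llama-cli` command above; the model path and context size are assumptions based on this card:

```python
# Sketch: chat with the GGUF model via the third-party llama-cpp-python bindings.
from llama_cpp import Llama

SYSTEM_PROMPT = (
    "You are Mit, created by WinkingFace. You are a deep thinking AI, capable of "
    "using extremely long chains of thought to deeply consider the problem and "
    "deliberate via systematic reasoning processes. Enclose your thoughts and "
    "internal monologue inside <think_deeply> </think_deeply> tags, and then "
    "provide your solution or response to the problem."
)

llm = Llama(
    model_path="Mit-ThinkDeeply-0.5B-q8_0.gguf",
    n_ctx=32768,      # the 0.5B variant supports a 32K context (see table below)
    n_gpu_layers=-1,  # offload all layers when a GPU build is available
)

response = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": "A train travels 120 km in 1.5 hours. What is its average speed?"},
    ],
    max_tokens=512,
)
print(response["choices"][0]["message"]["content"])
```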
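
Because the model is instructed to wrap its reasoning in `<think_deeply>` tags, downstream code may want to separate the internal monologue from the final answer. A small sketch, assuming the tags appear verbatim in the completion:

```python
# Split a completion into (reasoning, answer), assuming <think_deeply> tags.
import re

def split_reasoning(text: str) -> tuple[str, str]:
    match = re.search(r"<think_deeply>(.*?)</think_deeply>", text, re.DOTALL)
    if match is None:
        return "", text.strip()  # no reasoning block found
    reasoning = match.group(1).strip()
    answer = text[match.end():].strip()
    return reasoning, answer
```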

## Evaluation & Performance

<div align="center">

| Category | Benchmark | Mit-ThinkDeeply-0.5B | Mit-ThinkDeeply-1.5B | Mit-ThinkDeeply-3B | Mit-ThinkDeeply-7B |
|----------|-----------|----------------------|----------------------|--------------------|--------------------|
| | Context Length | 32K | 32K | 32K | 128K |
| | Generation Length | 8K | 8K | 8K | 8K |
| General | MMLU | 45.4 | 58.9 | 63.8 | 72.6 |
| | MMLU-Pro | 13.8 | 26.6 | 33.0 | 43.7 |
| | MMLU-redux | 43.1 | 56.8 | 62.7 | 70.3 |
| | BBH | 18.3 | 41.7 | 64.9 | 68.1 |
| | ARC-C | 32.9 | 56.0 | 57.5 | 65.8 |
| Code | LiveCodeBench | 11.5 | 21.4 | 25.9 | 36.2 |
| | HumanEval | 25.4 | 44.6 | 51.6 | 69.5 |
| | HumanEval+ | 29.7 | 38.1 | 43.9 | 60.7 |
| | MBPP | 46.3 | 74.2 | 69.9 | 82.9 |
| | MBPP+ | 36.8 | 59.5 | 59.3 | 70.2 |
| | MultiPL-E | 24.9 | 51.7 | 49.6 | 58.1 |
| Math & Science | GPQA | 25.1 | 29.0 | 31.5 | 40.7 |
| | TheoremQA | 18.2 | 23.2 | 27.9 | 39.4 |
| | MATH | 25.4 | 38.1 | 46.7 | 54.8 |
| | MATH-500 | 62.5 | 79.2 | 88.4 | 94.6 |
| | MMLU-STEM | 43.3 | 65.8 | 75.1 | 81.3 |
| | GSM8K | 45.8 | 70.1 | 81.5 | 86.2 |

</div>

## Citation

If you find our work helpful, feel free to cite us:

```bibtex
@misc{mit-thinkdeeply,
  title = {Mit-ThinkDeeply: Advanced Reasoning and Contextual Awareness in Large Language Models},
  author = {WinkingFace Team},
  year = {2025},
  url = {https://huggingface.co/WinkingFace/Mit-ThinkDeeply-7B}
}
```

## Contact

For any questions or inquiries, feel free to [contact us here 📨](mailto:[email protected]).