Update README.md
README.md
CHANGED
|
@@ -16,15 +16,6 @@ The model DeepSeek-Qwen-7B is our optimized model for its advanced instruction-f
|
|
| 16 |
|
| 17 |
**Code**: [https://github.com/yuleiqin/RAIF](https://github.com/yuleiqin/RAIF)
|
| 18 |
|
| 19 |
-
## Overview and Framework
|
| 20 |
-
|
| 21 |
-
Our preliminary experiments confirm that the reasoning (e.g., triggered by CoT prompting) of fast-thinking LLMs (instructed models) is often shallow and superficial. Such reasoning only briefly repeats parts of the input request and fails to extract key components from complex instructions, which are often composed of various sub-instructions, constraints, and rules. In contrast, existing slow-thinking LLMs (reasoning models) demonstrate superior performance: their deep, organized reasoning genuinely helps analyze complex instructions and decomposes them into action steps that lead to the final answer. Consequently, it is important to incentivize authentic reasoning in LLMs to solve complex instructions.
|
| 22 |
-
|
| 23 |
-

|
| 24 |
-
|
| 25 |
-
In this project, we present a reinforcement learning-based method for cultivating deep reasoning in LLMs.
|
| 26 |
-
|
| 27 |
-

|
| 28 |
|
| 29 |
## Usage
|
| 30 |
|
|
|
|
| 16 |
|
| 17 |
**Code**: [https://github.com/yuleiqin/RAIF](https://github.com/yuleiqin/RAIF)
|
| 18 |
|
| 19 |
|
| 20 |
## Usage
|
| 21 |
|