## AMchat GGUF Model

AM (Advanced Mathematics) Chat is a large language model that integrates mathematical knowledge, advanced mathematics problems, and their solutions. It is based on the InternLM2-Math-7B model, fine-tuned with xtuner on a dataset that combines Math and advanced mathematics problems with their analyses, and designed specifically to solve advanced mathematics problems. This repository provides GGUF builds of AMchat for use with llama.cpp and Ollama.

## Latest Release

**2024-08-16**

- **Q6_K**
- **Q5_K_M**
- **Q5_0**
- **Q4_0**
- **Q3_K_M**
- **Q2_K**

**2024-08-09**

- **F16 Quantization**: Achieves a balanced trade-off between model size and performance. Ideal for applications requiring precision with reduced resource consumption.
- **Q8_0 Quantization**: Offers a substantial reduction in model size while maintaining high accuracy, making it suitable for environments with stringent memory constraints.
- **Q4_K_M Quantization**: Provides the most compact model size with minimal impact on performance, perfect for deployment in resource-constrained settings.
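
To check which quantized `.gguf` files are currently published in this repository, you can list its contents with `huggingface_hub` (assuming the `axyzdong/AMchat-GGUF` repo id used in the download section below):

```shell
pip install huggingface-hub
# List every file in the model repo to see the available quantizations
python -c "from huggingface_hub import list_repo_files; print('\n'.join(list_repo_files('axyzdong/AMchat-GGUF')))"
```
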
## Getting Started - Ollama
To get started with AMchat in [Ollama](https://github.com/ollama/ollama), follow these steps:

1. **Clone the Repository**

   ```bash
   git clone https://huggingface.co/axyzdong/AMchat-GGUF
   cd AMchat-GGUF
   ```

2. **Create the Model**

   A minimal `Modelfile` sketch is shown after these steps if the repository does not already include one.

   ```bash
   ollama create AMchat -f Modelfile
   ```

3. **Run**

   ```bash
   ollama run AMchat
   ```
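
If the cloned repository does not already ship a `Modelfile`, a minimal sketch might look like the following. The GGUF filename and the ChatML-style template here are assumptions, inferred from the download section and the `llama-cli` prompt format later in this README:

```bash
# Assumed minimal Modelfile; adjust the FROM line to the GGUF file you actually downloaded
cat > Modelfile <<'EOF'
FROM ./AMchat-q8_0.gguf
TEMPLATE """<|im_start|>system
{{ .System }}<|im_end|>
<|im_start|>user
{{ .Prompt }}<|im_end|>
<|im_start|>assistant
"""
PARAMETER stop "<|im_end|>"
EOF
```
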
## Getting Started - llama-cli

You can use `llama-cli` to run inference. For a detailed explanation of `llama-cli`, please refer to [this guide](https://github.com/ggerganov/llama.cpp/blob/master/examples/main/README.md).

### Installation

We recommend building `llama.cpp` from source. The following code snippet provides an example for the Linux CUDA platform. For instructions on other platforms, please refer to the [official guide](https://github.com/ggerganov/llama.cpp?tab=readme-ov-file#build).

- Step 1: create a conda environment and install cmake

```shell
conda create --name AMchat python=3.10 -y
conda activate AMchat
pip install cmake
```

- Step 2: clone the source code and build the project

```shell
git clone --depth=1 https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build -DGGML_CUDA=ON
cmake --build build --config Release -j
```

All the built targets can be found in the subdirectory `build/bin`.

In the following sections, we assume that the working directory is the root directory of `llama.cpp`.
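
As a quick sanity check, you can ask the freshly built binary for its build info:

```shell
# Should print the llama.cpp build number and compiler details
build/bin/llama-cli --version
```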
### Download models

You can download the appropriate model based on your requirements. For instance, `AMchat-q8_0.gguf` can be downloaded as below:

```shell
pip install huggingface-hub
huggingface-cli download axyzdong/AMchat-GGUF AMchat-q8_0.gguf --local-dir . --local-dir-use-symlinks False
```
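
Optionally, large GGUF downloads can be sped up with the `hf_transfer` backend supported by `huggingface_hub`:

```shell
pip install hf_transfer
# Enable the Rust-based transfer backend for faster downloads
HF_HUB_ENABLE_HF_TRANSFER=1 huggingface-cli download axyzdong/AMchat-GGUF AMchat-q8_0.gguf --local-dir . --local-dir-use-symlinks False
```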

### Chat example

```shell
build/bin/llama-cli \
    --model AMchat-q8_0.gguf \
    --predict 512 \
    --ctx-size 4096 \
    --gpu-layers 24 \
    --temp 0.8 \
    --top-p 0.8 \
    --top-k 50 \
    --seed 1024 \
    --color \
    --prompt "<|im_start|>system\nYou are an expert in advanced math and you can answer all kinds of advanced math problems.<|im_end|>\n" \
    --interactive \
    --multiline-input \
    --conversation \
    --verbose \
    --logdir workdir/logdir \
    --in-prefix "<|im_start|>user\n" \
    --in-suffix "<|im_end|>\n<|im_start|>assistant\n"
```
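
If you would rather expose the model over HTTP than chat in the terminal, the same build also produces `llama-server`. A minimal sketch, assuming the `AMchat-q8_0.gguf` file from the download step; the port and GPU layer count are arbitrary example values:

```shell
# Serve the model with an OpenAI-compatible HTTP API on port 8080
build/bin/llama-server \
    --model AMchat-q8_0.gguf \
    --ctx-size 4096 \
    --gpu-layers 24 \
    --port 8080
```

Any OpenAI-compatible client can then send chat requests to `http://localhost:8080/v1/chat/completions`.
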
## Star Us

If you find AMchat useful, please ⭐ Star this repository and help others discover it!