## AMchat GGUF Model

AM (Advanced Mathematics) Chat is a large language model that integrates mathematical knowledge, advanced mathematics problems, and their solutions. It is based on the InternLM2-Math-7B model and was fine-tuned with XTuner on a dataset that combines general math problems and advanced mathematics problems with their analyses, making it purpose-built for solving advanced mathematics problems. This repository provides GGUF quantizations of the model.

## Latest Release

2024-08-16

- **Q6_K**
- **Q5_K_M**
- **Q5_0**
- **Q4_0**
- **Q3_K_M**
- **Q2_K**

2024-08-09

- **F16**: A balanced trade-off between model size and performance; ideal for applications that need near-full precision with reduced resource consumption.
- **Q8_0**: A substantial reduction in model size while maintaining high accuracy; suitable for environments with stringent memory constraints.
- **Q4_K_M**: The most compact of the initial releases, with minimal impact on performance; well suited to resource-constrained deployments.
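
These are all standard GGUF quantization types, so if you ever need one that is not published here, you can produce it yourself from the F16 file with llama.cpp's `llama-quantize` tool (built in the llama-cli section below). A minimal sketch, assuming the F16 file sits in the llama.cpp directory:

```shell
# Requantization sketch -- the published files already cover these types.
build/bin/llama-quantize AMchat-fp16.gguf AMchat-q5_k_m.gguf Q5_K_M
```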

## Getting Started - Ollama

To get started with AMchat in [Ollama](https://github.com/ollama/ollama), follow these steps:

1. **Clone the Repository**

   ```bash
   git clone https://huggingface.co/axyzdong/AMchat-GGUF
   cd AMchat-GGUF
   ```

2. **Create the Model**

   ```bash
   ollama create AMchat -f Modelfile
   ```

   (A sketch of what a `Modelfile` contains follows these steps.)

3. **Run**

   ```bash
   ollama run AMchat
   ```
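
The `ollama create` step reads the `Modelfile` in the cloned repository. For orientation only, a minimal Modelfile for a GGUF checkpoint might look like the sketch below; the file name, template, and parameters are illustrative assumptions, and the repository's own `Modelfile` is authoritative. The ChatML-style tags match the prompt used in the llama-cli example further down.

```
# Illustrative sketch only -- use the Modelfile shipped with this repository.
FROM ./AMchat-q8_0.gguf

# InternLM2-style ChatML template
TEMPLATE """<|im_start|>system
{{ .System }}<|im_end|>
<|im_start|>user
{{ .Prompt }}<|im_end|>
<|im_start|>assistant
"""

PARAMETER temperature 0.8
PARAMETER stop "<|im_end|>"
```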

## Getting Started - llama-cli

You can also run inference with `llama-cli`. For a detailed explanation of `llama-cli`, please refer to [this guide](https://github.com/ggerganov/llama.cpp/blob/master/examples/main/README.md).

### Installation

We recommend building `llama.cpp` from source. The following snippets give an example for the Linux CUDA platform; for instructions on other platforms, please refer to the [official guide](https://github.com/ggerganov/llama.cpp?tab=readme-ov-file#build).

- Step 1: create a conda environment and install CMake

  ```shell
  conda create --name AMchat python=3.10 -y
  conda activate AMchat
  pip install cmake
  ```

- Step 2: clone the source code and build the project (the CUDA build is shown; a CPU-only variant follows the list)

  ```shell
  git clone --depth=1 https://github.com/ggerganov/llama.cpp.git
  cd llama.cpp
  cmake -B build -DGGML_CUDA=ON
  cmake --build build --config Release -j
  ```
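
If you do not have a CUDA-capable GPU, the same build works without the CUDA flag; this produces CPU-only binaries (other backends are covered in the official guide linked above):

```shell
# CPU-only build: identical to the CUDA build, minus -DGGML_CUDA=ON
cmake -B build
cmake --build build --config Release -j
```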

All the built targets can be found in the subdirectory `build/bin`.

In the following sections, we assume that the working directory is the root directory of `llama.cpp`.

### Download models

You can download whichever model file fits your requirements. For instance, `AMchat-q8_0.gguf` can be downloaded as follows:

```shell
pip install huggingface-hub
huggingface-cli download axyzdong/AMchat-GGUF AMchat-q8_0.gguf --local-dir . --local-dir-use-symlinks False
```

### Chat example

The command below uses `AMchat-fp16.gguf`; substitute whichever file you downloaded (e.g. `AMchat-q8_0.gguf`), and adjust `--gpu-layers` to fit your GPU memory:

```shell
build/bin/llama-cli \
    --model AMchat-fp16.gguf \
    --predict 512 \
    --ctx-size 4096 \
    --gpu-layers 24 \
    --temp 0.8 \
    --top-p 0.8 \
    --top-k 50 \
    --seed 1024 \
    --color \
    --prompt "<|im_start|>system\nYou are an expert in advanced math and you can answer all kinds of advanced math problems.<|im_end|>\n" \
    --interactive \
    --multiline-input \
    --conversation \
    --verbose \
    --logdir workdir/logdir \
    --in-prefix "<|im_start|>user\n" \
    --in-suffix "<|im_end|>\n<|im_start|>assistant\n"
```
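
If you would rather query the model over HTTP than chat in the terminal, the same build also produces `llama-server`, which exposes an OpenAI-compatible API. A minimal sketch with the downloaded model (port and layer count are illustrative):

```shell
# Serve the model with an OpenAI-compatible API under /v1
build/bin/llama-server \
    --model AMchat-q8_0.gguf \
    --ctx-size 4096 \
    --gpu-layers 24 \
    --port 8080
```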

## Star Us

If you find AMchat useful, please ⭐ Star this repository and help others discover it!