Anthonyg5005 committed
Commit 2a943d9 · 1 Parent(s): b240958
Files changed (1):
  1. ipynb/EXL2_Private_Quant_V3.ipynb +53 -51
ipynb/EXL2_Private_Quant_V3.ipynb CHANGED
@@ -1,18 +1,36 @@
 {
+  "nbformat": 4,
+  "nbformat_minor": 0,
+  "metadata": {
+    "colab": {
+      "provenance": [],
+      "gpuType": "T4"
+    },
+    "kernelspec": {
+      "name": "python3",
+      "display_name": "Python 3"
+    },
+    "language_info": {
+      "name": "python"
+    },
+    "accelerator": "GPU"
+  },
   "cells": [
     {
       "cell_type": "markdown",
-      "metadata": {
-        "id": "Ku0ezvyD42ng"
-      },
       "source": [
         "#Quantizing huggingface models to exl2\n",
         "This version of my exl2 quantize colab creates a single quantization to upload privately.\\\n",
         "To calculate an estimate for VRAM size use: [NyxKrage/LLM-Model-VRAM-Calculator](https://huggingface.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator)\\\n",
         "Not all models and architectures are compatible with exl2.\\\n",
         "I've only tested with llama-7b and mistral-7b, not sure if higher size models work with free colab.\\\n",
-        "More stuff in [Anthonyg5005/hf-scripts](https://huggingface.co/Anthonyg5005/hf-scripts)"
-      ]
+        "#Outdated\n",
+        "More recent stuff in [Anthonyg5005/hf-scripts](https://huggingface.co/Anthonyg5005/hf-scripts)\\\n",
+        "If you need to quant a model to exl2 for free, check out the bot from the [Exllama Discord server](https://discord.gg/NSFwVuCjRq)"
+      ],
+      "metadata": {
+        "id": "Ku0ezvyD42ng"
+      }
     },
     {
       "cell_type": "code",
@@ -39,12 +57,6 @@
     },
     {
       "cell_type": "code",
-      "execution_count": null,
-      "metadata": {
-        "cellView": "form",
-        "id": "8Hl3fQmRLybp"
-      },
-      "outputs": [],
       "source": [
         "#@title Login to HF (Required to upload files)\n",
         "#@markdown From my Colab/Kaggle login script on [Anthonyg5005/hf-scripts](https://huggingface.co/Anthonyg5005/hf-scripts/blob/main/HF%20Login%20Snippet%20Kaggle.py)\n",
@@ -98,16 +110,16 @@
         " login(input(\"Enter your HuggingFace (WRITE) token: \"))\n",
         " continue\n",
         " break"
-      ]
-    },
-    {
-      "cell_type": "code",
-      "execution_count": null,
+      ],
       "metadata": {
         "cellView": "form",
-        "id": "NI1LUMD7H-Zx"
+        "id": "8Hl3fQmRLybp"
       },
-      "outputs": [],
+      "execution_count": null,
+      "outputs": []
+    },
+    {
+      "cell_type": "code",
       "source": [
         "#@title ##Choose HF model to download\n",
         "#@markdown ###Repo should be formatted as user/repo\n",
@@ -127,16 +139,16 @@
         " print(\"Finished converting\")\n",
         "#@markdown If model files are stored in a pytorch .bin extension then enable convert_safetensors above.\\\n",
         "#@markdown ![Example Image](https://huggingface.co/Anthonyg5005/hf-scripts/resolve/main/ipynb/pytorch-example.jpg \"File extension is .bin\")"
-      ]
+      ],
+      "metadata": {
+        "id": "NI1LUMD7H-Zx",
+        "cellView": "form"
+      },
+      "execution_count": null,
+      "outputs": []
     },
     {
       "cell_type": "code",
-      "execution_count": null,
-      "metadata": {
-        "cellView": "form",
-        "id": "8anbEbGyNmBI"
-      },
-      "outputs": [],
       "source": [
         "#@title Quantize the model\n",
         "#@markdown ###Quantization time will last based on model size\n",
@@ -193,16 +205,16 @@
         "else:\n",
         " quant = f\"convert.py -i models/{model} -o {model}-exl2-{BPW}bpw-WD -cf {model}-exl2-{BPW}bpw -b {BPW}\"\n",
         "!python {quant}"
-      ]
+      ],
+      "metadata": {
+        "id": "8anbEbGyNmBI",
+        "cellView": "form"
+      },
+      "execution_count": null,
+      "outputs": []
     },
     {
       "cell_type": "code",
-      "execution_count": null,
-      "metadata": {
-        "cellView": "form",
-        "id": "XORLS2uPrbma"
-      },
-      "outputs": [],
       "source": [
         "#@title Upload to huggingface privately\n",
         "#@markdown You may also set it to public but I'd recommend waiting for my next ipynb that will create multiple quants and place them all into individual branches.\n",
@@ -213,23 +225,13 @@
         "create_repo(f\"{whoami().get('name', None)}/{model}-exl2-{BPW}bpw\", private=True)\n",
         "HfApi().upload_folder(folder_path=f\"{model}-exl2-{BPW}bpw\", repo_id=f\"{whoami().get('name', None)}/{model}-exl2-{BPW}bpw\", repo_type=\"model\", commit_message=\"Upload from Colab automation\")\n",
         "print(f\"uploaded to https://huggingface.co/{whoami().get('name', None)}/{model}-exl2-{BPW}bpw\")"
-      ]
-    }
-  ],
-  "metadata": {
-    "accelerator": "GPU",
-    "colab": {
-      "gpuType": "T4",
-      "provenance": []
-    },
-    "kernelspec": {
-      "display_name": "Python 3",
-      "name": "python3"
-    },
-    "language_info": {
-      "name": "python"
+      ],
+      "metadata": {
+        "cellView": "form",
+        "id": "XORLS2uPrbma"
+      },
+      "execution_count": null,
+      "outputs": []
     }
-  },
-  "nbformat": 4,
-  "nbformat_minor": 0
-}
+  ]
+}
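For reference, the "Upload to huggingface privately" cell at the end of the diff reduces to a short huggingface_hub flow. Here is a minimal standalone sketch of that step, assuming huggingface_hub is installed, login() has already been run with a WRITE token, and convert.py has already written the quantized folder; the model and BPW values are hypothetical placeholders, not values taken from the commit.

```python
# Minimal sketch of the notebook's upload step, assuming a prior
# huggingface_hub.login() with a WRITE token and an existing local
# quantized folder. `model` and `BPW` are hypothetical placeholders.
from huggingface_hub import HfApi, create_repo, whoami

model = "mistral-7b"  # placeholder: name of the downloaded model
BPW = 4.0             # placeholder: target bits per weight used by convert.py

user = whoami().get("name", None)          # account the token belongs to
repo_id = f"{user}/{model}-exl2-{BPW}bpw"

create_repo(repo_id, private=True)         # private repo, as the notebook recommends
HfApi().upload_folder(
    folder_path=f"{model}-exl2-{BPW}bpw",  # local folder written via convert.py's -cf flag
    repo_id=repo_id,
    repo_type="model",
    commit_message="Upload from Colab automation",
)
print(f"uploaded to https://huggingface.co/{repo_id}")
```

The quantization step that precedes it is exllamav2's convert.py invoked with -i (input model directory), -o (working directory), -cf (compiled output folder), and -b (target bits per weight), exactly as the quant string in the diff shows.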