magicsquares137 committed on
Commit f245600 · verified · 1 Parent(s): 23106c3

Update README.md

Files changed (1): README.md +77 -0
README.md CHANGED
@@ -54,6 +54,83 @@ Note some portions of conversation redacted for public README purposes
**You:** no tell me about it

**Sophie:** 😊 Sure thing! Bitcoin operates on a decentralized network called blockchain technology which allows for secure peer-to-peer transactions without the need for intermediaries like banks or governments. Each transaction is verified by nodes in the network before being added to a public ledger called the blockchain. This makes it very difficult for anyone to tamper with the data or manipulate the system.
 
## Hardware Requirements

The model has been tested with different precision settings to determine the minimum hardware requirements:

| Precision | VRAM Allocated | VRAM Reserved | Suitable GPUs | Notes |
|-----------|----------------|---------------|---------------|-------|
| FP32 (32-bit) | 14.24 GB | 14.24 GB | RTX 3090, 4090, A5000, A6000, etc. | Default loading mode |
| FP16 (16-bit) | 7.12 GB | 21.35 GB | RTX 3090, 4090, A5000, A6000, etc. | Recommended for most users |
| 8-bit Quantization | 5.68 GB | 6.14 GB | RTX 2060 12GB, 3060, 3070, etc. | Good balance of quality and efficiency |
| 4-bit Quantization | 2.27 GB | 2.30 GB | Most modern GPUs (GTX 1060+) | Lowest quality, runs on older hardware |
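The FP32 and FP16 rows line up with the usual back-of-the-envelope estimate of weight memory (parameter count × bytes per parameter). A quick sketch, with the parameter count inferred from the table's FP32 figure rather than taken from an official spec:

```python
# Parameter count inferred from the FP32 row: 14.24 GB * 1024**3 / 4 bytes ≈ 3.82B.
# This is an estimate for illustration, not an official figure.
PARAMS = 3.823e9

def weight_memory_gb(params: float, bytes_per_param: float) -> float:
    """Approximate weight memory in GB for a given precision."""
    return params * bytes_per_param / 1024**3

print(f"FP32: {weight_memory_gb(PARAMS, 4):.2f} GB")  # ≈ 14.24 GB, as in the table
print(f"FP16: {weight_memory_gb(PARAMS, 2):.2f} GB")  # ≈ 7.12 GB
```

The quantized rows sit somewhat above the naive estimate (0.5 bytes/param would predict well under 2.27 GB for 4-bit) because quantization schemes typically keep some layers and buffers in higher precision.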
### Recommended Loading Code

**For high-end GPUs (FP16):**
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load in half precision for the best balance of performance and quality
tokenizer = AutoTokenizer.from_pretrained("magicsquares137/phi3-uncensored-chat")
model = AutoModelForCausalLM.from_pretrained(
    "magicsquares137/phi3-uncensored-chat",
    torch_dtype=torch.float16,
    device_map="auto"
)
```

**For mid-range GPUs (8-bit):**
```python
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# 8-bit quantization config
quantization_config = BitsAndBytesConfig(
    load_in_8bit=True,
    llm_int8_threshold=6.0
)

# Load in 8-bit
tokenizer = AutoTokenizer.from_pretrained("magicsquares137/phi3-uncensored-chat")
model = AutoModelForCausalLM.from_pretrained(
    "magicsquares137/phi3-uncensored-chat",
    quantization_config=quantization_config,
    device_map="auto"
)
```

**For low-end GPUs (4-bit):**
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# 4-bit quantization config
quantization_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16
)

# Load in 4-bit
tokenizer = AutoTokenizer.from_pretrained("magicsquares137/phi3-uncensored-chat")
model = AutoModelForCausalLM.from_pretrained(
    "magicsquares137/phi3-uncensored-chat",
    quantization_config=quantization_config,
    device_map="auto"
)
```

**For CPU-only inference** (much slower but works on any system):
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("magicsquares137/phi3-uncensored-chat")
model = AutoModelForCausalLM.from_pretrained(
    "magicsquares137/phi3-uncensored-chat",
    device_map="cpu"
)
```

Note: Lower precision (8-bit and 4-bit) may result in slightly reduced output quality, but the difference is often minimal for most use cases.
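The tiers above collapse into a simple selection rule. A hypothetical helper (the name and cutoffs are illustrative, taken from the table's allocated figures):

```python
def pick_precision(vram_gb: float) -> str:
    """Map available VRAM (GB) to a loading mode from the table above."""
    if vram_gb >= 14.24:
        return "fp32"   # default loading mode
    if vram_gb >= 7.12:
        return "fp16"   # recommended for most users
    if vram_gb >= 5.68:
        return "8bit"   # good balance of quality and efficiency
    if vram_gb >= 2.27:
        return "4bit"   # lowest quality, runs on older hardware
    return "cpu"        # no suitable GPU: CPU-only inference

print(pick_precision(24.0))  # → fp32 (e.g. RTX 3090/4090)
print(pick_precision(6.0))   # → 8bit
```

In practice the reserved column matters too (FP16 reserves 21.35 GB here despite allocating 7.12 GB), so treat these cutoffs as a starting point rather than a guarantee.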

## Model Description