saytes committed on
Commit
fe6495f
·
verified ·
1 Parent(s): 9f7d332

Update README.md

  1. README.md +266 -1
README.md CHANGED
@@ -4,4 +4,269 @@ language:
4
  - en
5
  base_model:
6
  - distilbert/distilbert-base-uncased
7
- ---
7
+ ---
8
+
9
+ # SoT_DistilBERT: Paradigm Selection Model for Sketch-of-Thought
10
+
11
+ [![License](https://img.shields.io/badge/License-MIT-blue.svg)](LICENSE)
12
+ [![Python](https://img.shields.io/badge/Python-3.10+-blue.svg)](https://www.python.org/downloads/)
13
+ [![PyTorch](https://img.shields.io/badge/PyTorch-2.0+-orange.svg)](https://pytorch.org/)
14
+ [![GitHub](https://img.shields.io/badge/GitHub-Repository-green)](https://github.com/yourusername/sketch-of-thought)
15
+
16
+ ## Loading the Model
17
+
18
+ This repository contains the DistilBERT paradigm selection model for the Sketch-of-Thought (SoT) framework. You can load and use it directly with Hugging Face Transformers:
19
+
20
+ ```python
21
+ from transformers import DistilBertTokenizer, DistilBertForSequenceClassification
22
+ import torch
23
+ import json
24
+
25
+ # Load the model directly from Hugging Face
26
+ model = DistilBertForSequenceClassification.from_pretrained("saytes/SoT_DistilBERT")
27
+ tokenizer = DistilBertTokenizer.from_pretrained("saytes/SoT_DistilBERT")
28
+
29
+ # Define label mapping
30
+ label_mapping = {
31
+     "chunked_symbolism": 0,
32
+     "conceptual_chaining": 1,
33
+     "expert_lexicons": 2
34
+ }
35
+
36
+ # Function to classify questions
37
+ def classify_question(question):
38
+     inputs = tokenizer(question, return_tensors="pt", truncation=True, padding=True)
39
+     outputs = model(**inputs)
40
+     predicted_class = torch.argmax(outputs.logits, dim=1).item()
41
+
42
+     # Reverse mapping to get the paradigm name
43
+     label_mapping_reverse = {v: k for k, v in label_mapping.items()}
44
+     return label_mapping_reverse[predicted_class]
45
+
46
+ # Example usage
47
+ question = "Alice has 5 apples. She gives 3 apples to Bob. How many apples does Alice have?"
48
+ paradigm = classify_question(question)
49
+ print(f"Recommended paradigm: {paradigm}")  # e.g. "Recommended paradigm: chunked_symbolism"
50
+ ```
51
+
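The `argmax` in `classify_question` returns only the top label. When it helps to see how decisive a prediction was, the logits can be turned into probabilities with a softmax. The sketch below uses plain Python and made-up logit values; `paradigm_probabilities` is a hypothetical helper, not part of the model or the SoT package.

```python
import math

# Same label order as the label_mapping defined in the snippet above.
ID_TO_LABEL = {0: "chunked_symbolism", 1: "conceptual_chaining", 2: "expert_lexicons"}


def paradigm_probabilities(logits):
    """Convert raw classifier logits into a {paradigm: probability} dict."""
    # Subtract the max logit for numerical stability before exponentiating.
    shift = max(logits)
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return {ID_TO_LABEL[i]: e / total for i, e in enumerate(exps)}


# Illustrative logits only; real values would come from outputs.logits[0].tolist().
probs = paradigm_probabilities([2.0, 0.5, -1.0])
best = max(probs, key=probs.get)
print(best, round(probs[best], 3))
```

A low top probability can be a signal to fall back to a default paradigm rather than trust the classifier, though any threshold would need to be chosen empirically.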
52
+ For easier integration, we also provide a complete Python package implementation. See the [GitHub repository](https://github.com/yourusername/sketch-of-thought) or the "Complete Package" section below for details.
53
+
54
+ ## Model Description
55
+
56
+ The SoT_DistilBERT model is a fine-tuned DistilBERT classifier that selects which of the three Sketch-of-Thought reasoning paradigms best fits a given query.
57
+
58
+ ### Training Data
59
+ The model was trained on approximately 14,200 samples across various reasoning tasks, with each sample labeled using one of the three SoT paradigms. Labels were assigned using GPT-4o with a classification-specific prompt based on predefined heuristics.
60
+
61
+ ### Model Architecture
62
+ - **Base model**: DistilBERT (`distilbert/distilbert-base-uncased`)
63
+ - **Training**: 5 epochs, batch size 64, learning rate 2e-5
64
+ - **Loss**: Cross-entropy
65
+
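For a sense of scale, the hyperparameters above imply roughly the following number of optimizer steps, and the loss has a simple closed form for a single example. This is back-of-the-envelope arithmetic: it assumes all of the approximately 14,200 samples were used for training, with no held-out split and no gradient accumulation, neither of which is stated here. `cross_entropy` is an illustrative helper, not the training code.

```python
import math

# Figures taken from the sections above; the sample count is approximate.
num_samples = 14_200
batch_size = 64
epochs = 5

# The last batch of each epoch may be partial, hence the ceiling.
steps_per_epoch = math.ceil(num_samples / batch_size)
total_steps = steps_per_epoch * epochs


def cross_entropy(logits, target):
    """Cross-entropy for one example: -log(softmax(logits)[target])."""
    shift = max(logits)  # subtract the max for numerical stability
    log_sum_exp = shift + math.log(sum(math.exp(x - shift) for x in logits))
    return log_sum_exp - logits[target]


print(steps_per_epoch, total_steps)
# With equal logits over three classes the loss is log(3), the uninformed baseline.
print(cross_entropy([0.0, 0.0, 0.0], target=1))
```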
66
+ ## What is Sketch-of-Thought?
67
+
68
+ Sketch-of-Thought (SoT) is a novel prompting framework for efficient reasoning in language models. It combines cognitively inspired reasoning paradigms with linguistic constraints to minimize output token usage while preserving reasoning accuracy.
69
+
70
+ Unlike conventional Chain of Thought (CoT) approaches that produce verbose reasoning chains, SoT implements three distinct reasoning paradigms:
71
+
72
+ - **Conceptual Chaining**: Connects essential ideas in logical sequences through structured step links. Effective for commonsense reasoning, multi-hop inference, and fact-based recall tasks.
73
+
74
+ - **Chunked Symbolism**: Organizes numerical and symbolic reasoning into structured steps with equations, variables, and arithmetic operations. Excels in mathematical problems and technical calculations.
75
+
76
+ - **Expert Lexicons**: Leverages domain-specific shorthand, technical symbols, and jargon for precise and efficient communication. Suited for technical disciplines requiring maximum information density.
77
+
78
+ ## Complete Package
79
+
80
+ For a more streamlined experience, the SoT Python package handles paradigm selection, prompt management, and exemplar formatting:
81
+
82
+ ```python
83
+ from sketch_of_thought import SoT
84
+
85
+ # Initialize SoT
86
+ sot = SoT()
87
+
88
+ # Classify a question and get appropriate paradigm
89
+ question = "Alice has 5 apples. She gives 3 apples to Bob. How many apples does Alice have?"
90
+ paradigm = sot.classify_question(question) # Returns: 'chunked_symbolism'
91
+
92
+ # Get initialized context with exemplars for the selected paradigm
93
+ context = sot.get_initialized_context(
94
+     paradigm=paradigm,
95
+     question=question,
96
+     format="llm",
97
+     include_system_prompt=True
98
+ )
99
+
100
+ # Use with your LLM of choice
101
+ ```
102
+
103
+ ## Example with Qwen2.5-7B
104
+
105
+ Here's a complete example using Qwen2.5-7B-Instruct:
106
+
107
+ ```python
108
+ from transformers import AutoModelForCausalLM, AutoTokenizer
109
+ from sketch_of_thought import SoT
110
+
111
+ # Initialize SoT
112
+ sot = SoT()
113
+
114
+ # Load Qwen model
115
+ model_name = "Qwen/Qwen2.5-7B-Instruct"
116
+ model = AutoModelForCausalLM.from_pretrained(
117
+     model_name,
118
+     torch_dtype="auto",
119
+     device_map="auto"
120
+ )
121
+ tokenizer = AutoTokenizer.from_pretrained(model_name)
122
+
123
+ # Prepare the question
124
+ prompt = "Alice has 5 apples. She gives 3 apples to Bob. How many apples does Alice have?"
125
+
126
+ # Classify and get appropriate context
127
+ paradigm = sot.classify_question(prompt)
128
+ messages = sot.get_initialized_context(
129
+     paradigm,
130
+     prompt,
131
+     format="llm",
132
+     include_system_prompt=True
133
+ )
134
+
135
+ # Format for the model
136
+ text = tokenizer.apply_chat_template(
137
+     messages,
138
+     tokenize=False,
139
+     add_generation_prompt=True
140
+ )
141
+ model_inputs = tokenizer([text], return_tensors="pt").to(model.device)
142
+
143
+ # Generate response
144
+ generated_ids = model.generate(
145
+     **model_inputs,
146
+     max_new_tokens=512
147
+ )
148
+ generated_ids = [
149
+     output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
150
+ ]
151
+
152
+ # Decode response
153
+ response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
154
+ print(response)
155
+ ```
156
+
157
+ **Output:**
158
+
159
+ ```
160
+ <think>
161
+ A = 5
162
+ A -= 3
163
+ A = 2
164
+ </think>
165
+
166
+ \boxed{2}
167
+ ```
168
+
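The response above has two parts: the sketch inside `<think>` tags and the final answer inside `\boxed{}`. A small parser for that shape might look like the following. `parse_sot_response` is a hypothetical helper, not part of the SoT package, and since a model is not guaranteed to emit both parts, it returns `None` for whichever is missing.

```python
import re


def parse_sot_response(response):
    """Split a response into (reasoning, answer); either may be None if absent."""
    # DOTALL lets the reasoning span multiple lines.
    think = re.search(r"<think>(.*?)</think>", response, flags=re.DOTALL)
    # Matches a simple \boxed{...} with no nested braces.
    boxed = re.search(r"\\boxed\{([^{}]*)\}", response)
    reasoning = think.group(1).strip() if think else None
    answer = boxed.group(1).strip() if boxed else None
    return reasoning, answer


sample = "<think>\nA = 5\nA -= 3\nA = 2\n</think>\n\n\\boxed{2}"
reasoning, answer = parse_sot_response(sample)
print(answer)
print(reasoning.splitlines())
```

Answers that themselves contain braces (for example LaTeX fractions) would need a brace-aware parser instead of this regular expression.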
169
+ ## Supported Formats
170
+
171
+ The SoT package supports multiple output formats:
172
+
173
+ - `"llm"`: Standard chat format for text-only LLMs
174
+ - `"vlm"`: Multimodal format for vision-language models
175
+ - `"raw"`: Raw exemplars without formatting
176
+
177
+ <details>
178
+ <summary>What's the difference?</summary>
179
+
180
+ ### LLM Format
181
+
182
+ Standard `messages` format for Large Language Models.
183
+
184
+ ```python
185
+ [
186
+     {
187
+         "role": "system",
188
+         "content": "SYSTEM_PROMPT_HERE"
189
+     },
190
+     {
191
+         "role": "user",
192
+         "content": "EXAMPLE_QUESTION_HERE"
193
+     },
194
+     {
195
+         "role": "assistant",
196
+         "content": "EXAMPLE_ANSWER_HERE"
197
+     },
198
+     {
199
+         "role": "user",
200
+         "content": "USER_QUESTION_HERE"
201
+     }
202
+ ]
203
+ ```
204
+
205
+ ### VLM Format
206
+
207
+ Standard `messages` format for Large Vision-Language Models.
208
+
209
+ ```python
210
+ [
211
+     {
212
+         "role": "system",
213
+         "content": "SYSTEM_PROMPT_HERE"
214
+     },
215
+     {
216
+         "role": "user",
217
+         "content": [{"type": "text", "text": "EXAMPLE_QUESTION_HERE"}]
218
+     },
219
+     {
220
+         "role": "assistant",
221
+         "content": [{"type": "text", "text": "EXAMPLE_ANSWER_HERE"}]
222
+     },
223
+     {
224
+         "role": "user",
225
+         "content": [{"type": "text", "text": "USER_QUESTION_HERE"}]
226
+     }
227
+ ]
228
+ ```
229
+
230
+ ### Raw Format
231
+
232
+ Raw exemplar data. Apply your own format!
233
+
234
+ ```python
235
+ [
236
+     {
237
+         "question": "EXAMPLE_QUESTION_HERE",
238
+         "answer": "EXAMPLE_ANSWER_HERE"
239
+     },
240
+     {
241
+         "question": "EXAMPLE_QUESTION_HERE",
242
+         "answer": "EXAMPLE_ANSWER_HERE"
243
+     }
244
+ ]
245
+ ```
246
+ </details>
247
+
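Starting from the `"raw"` format, the `"llm"` layout described above can be assembled by hand. The following sketch assumes the raw exemplars are a list of `question`/`answer` dicts as in the Raw Format example; `build_messages` and its parameters are hypothetical names, not part of the SoT package.

```python
def build_messages(exemplars, user_question, system_prompt=None):
    """Turn raw question/answer exemplars into a chat-style messages list."""
    messages = []
    if system_prompt is not None:
        messages.append({"role": "system", "content": system_prompt})
    # Each exemplar becomes one user turn followed by one assistant turn.
    for ex in exemplars:
        messages.append({"role": "user", "content": ex["question"]})
        messages.append({"role": "assistant", "content": ex["answer"]})
    # The real question goes last so the model answers it next.
    messages.append({"role": "user", "content": user_question})
    return messages


raw = [{"question": "EXAMPLE_QUESTION_HERE", "answer": "EXAMPLE_ANSWER_HERE"}]
messages = build_messages(raw, "USER_QUESTION_HERE", system_prompt="SYSTEM_PROMPT_HERE")
print([m["role"] for m in messages])
```

The same loop could wrap each `content` value as `[{"type": "text", "text": ...}]` to produce the VLM layout instead.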
248
+ ## Multilingual Support
249
+
250
+ SoT supports multiple languages. System prompts and exemplars are automatically loaded in the requested language.
251
+
252
+ ## Limitations
253
+
254
+ - The model is trained to classify questions into one of three predefined paradigms and may not generalize to tasks outside the training distribution.
255
+ - Performance may vary depending on the complexity and domain of the question.
256
+
257
+ ## Citation
258
+
259
+ If you find our work helpful, please cite:
260
+
261
+ ```
262
+ @article{sot2025,
263
+   title={TITLE-HERE},
264
+   author={NAMES-HERE},
265
+   journal={arXiv preprint},
266
+   year={2025}
267
+ }
268
+ ```
269
+
270
+ ## License
271
+
272
+ This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.