---
base_model:
- distilbert/distilbert-base-uncased
datasets:
- openai/gsm8k
- ChilleD/SVAMP
- deepmind/aqua_rat
- ucinlp/drop
- allenai/openbookqa
- ChilleD/StrategyQA
- lucasmccabe/logiqa
- metaeval/reclor
- hotpotqa/hotpot_qa
- dgslibisey/MuSiQue
- allenai/qasc
- nguyen-brat/worldtree
- qiaojin/PubMedQA
language:
- en
library_name: transformers
license: mit
tags:
- text-classification
- sketch-of-thought
- efficient-inference
---

# SoT_DistilBERT: Paradigm Selection Model for Sketch-of-Thought

[![License](https://img.shields.io/badge/License-MIT-blue.svg)](LICENSE)
[![Python](https://img.shields.io/badge/Python-3.10+-blue.svg)](https://www.python.org/downloads/)
[![PyTorch](https://img.shields.io/badge/PyTorch-2.0+-orange.svg)](https://pytorch.org/)
[![GitHub](https://img.shields.io/badge/GitHub-Repository-green)](https://github.com/SimonAytes/SoT)

## What is Sketch-of-Thought?

Sketch-of-Thought (SoT) is a novel prompting framework for efficient reasoning in language models. It combines cognitive-inspired reasoning paradigms with linguistic constraints to minimize output token usage while preserving reasoning accuracy. Unlike conventional Chain-of-Thought (CoT) approaches, which produce verbose reasoning chains, SoT implements three distinct reasoning paradigms:

- **Conceptual Chaining**: Connects essential ideas in logical sequences through structured step links. Effective for commonsense reasoning, multi-hop inference, and fact-based recall tasks.
- **Chunked Symbolism**: Organizes numerical and symbolic reasoning into structured steps with equations, variables, and arithmetic operations. Excels in mathematical problems and technical calculations.
- **Expert Lexicons**: Leverages domain-specific shorthand, technical symbols, and jargon for precise and efficient communication. Suited for technical disciplines requiring maximum information density.

## Loading the Model

This repository contains the DistilBERT paradigm selection model for the Sketch-of-Thought (SoT) framework. You can load and use it directly with Hugging Face Transformers:

```python
from transformers import DistilBertTokenizer, DistilBertForSequenceClassification
import torch

# Load the model directly from Hugging Face
model = DistilBertForSequenceClassification.from_pretrained("saytes/SoT_DistilBERT")
tokenizer = DistilBertTokenizer.from_pretrained("saytes/SoT_DistilBERT")

# Define label mapping
label_mapping = {
    "chunked_symbolism": 0,
    "conceptual_chaining": 1,
    "expert_lexicons": 2
}

# Function to classify questions
def classify_question(question):
    inputs = tokenizer(question, return_tensors="pt", truncation=True, padding=True)
    with torch.no_grad():
        outputs = model(**inputs)
    predicted_class = torch.argmax(outputs.logits, dim=1).item()

    # Reverse mapping to get the paradigm name
    label_mapping_reverse = {v: k for k, v in label_mapping.items()}
    return label_mapping_reverse[predicted_class]

# Example usage
question = "Alice has 5 apples. She gives 3 apples to Bob. How many apples does Alice have?"
paradigm = classify_question(question)
print(f"Recommended paradigm: {paradigm}")  # Output: "chunked_symbolism"
```

For easier integration, we also provide a complete Python package. See the [GitHub repository](https://github.com/SimonAytes/SoT) or the "Complete Package" section below for details.

## Model Description

The SoT_DistilBERT model is a fine-tuned DistilBERT classifier trained to select the optimal reasoning paradigm for a given query under the Sketch-of-Thought framework.
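Because this is a standard sequence-classification head, you can also read out a confidence score alongside the predicted paradigm. Below is a minimal sketch that reuses `model`, `tokenizer`, and `label_mapping` from the loading example above; the softmax readout is our addition for illustration, not part of the released package:

```python
import torch
import torch.nn.functional as F

def classify_question_with_confidence(question):
    # Same preprocessing as classify_question above
    inputs = tokenizer(question, return_tensors="pt", truncation=True, padding=True)
    with torch.no_grad():
        logits = model(**inputs).logits
    # Softmax over the three paradigm logits gives class probabilities
    probs = F.softmax(logits, dim=-1).squeeze(0)
    predicted_class = int(torch.argmax(probs))
    label_mapping_reverse = {v: k for k, v in label_mapping.items()}
    return label_mapping_reverse[predicted_class], float(probs[predicted_class])

paradigm, confidence = classify_question_with_confidence(
    "Alice has 5 apples. She gives 3 apples to Bob. How many apples does Alice have?"
)
print(paradigm, round(confidence, 3))
```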
### Training Data

The model was trained on approximately 14,200 samples across various reasoning tasks, with each sample labeled using one of the three SoT paradigms. Labels were assigned using GPT-4o with a classification-specific prompt based on predefined heuristics.

### Model Architecture

- **Base model**: DistilBERT
- **Training**: 5 epochs, batch size 64, learning rate 2e-5
- **Loss**: Cross-entropy
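The released checkpoint is the result of that run. The exact training script is not included here, but a minimal fine-tuning sketch with the stated hyperparameters could look like the following; the `train.jsonl` file with `text`/`label` fields is a hypothetical stand-in for the labeled data:

```python
from datasets import load_dataset
from transformers import (
    DistilBertForSequenceClassification,
    DistilBertTokenizer,
    Trainer,
    TrainingArguments,
)

# Hypothetical labeled data: one JSON object per line with "text" and "label" (0-2) fields
dataset = load_dataset("json", data_files="train.jsonl", split="train")

tokenizer = DistilBertTokenizer.from_pretrained("distilbert/distilbert-base-uncased")
model = DistilBertForSequenceClassification.from_pretrained(
    "distilbert/distilbert-base-uncased", num_labels=3
)

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, padding="max_length")

dataset = dataset.map(tokenize, batched=True)

args = TrainingArguments(
    output_dir="sot_distilbert",
    num_train_epochs=5,
    per_device_train_batch_size=64,
    learning_rate=2e-5,
)

# Trainer applies cross-entropy by default for sequence classification
trainer = Trainer(model=model, args=args, train_dataset=dataset)
trainer.train()
```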
## Complete Package

For a more streamlined experience, we've developed the SoT Python package that handles paradigm selection, prompt management, and exemplar formatting:

```python
from sketch_of_thought import SoT

# Initialize SoT
sot = SoT()

# Classify a question and get the appropriate paradigm
question = "Alice has 5 apples. She gives 3 apples to Bob. How many apples does Alice have?"
paradigm = sot.classify_question(question)
# Returns: 'chunked_symbolism'

# Get an initialized context with exemplars for the selected paradigm
context = sot.get_initialized_context(
    paradigm=paradigm,
    question=question,
    format="llm",
    include_system_prompt=True
)

# Use with your LLM of choice
```

## Example with Qwen2.5-7B

Here's a complete example using Qwen2.5-7B-Instruct:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from sketch_of_thought import SoT

# Initialize SoT
sot = SoT()

# Load the Qwen model
model_name = "Qwen/Qwen2.5-7B-Instruct"
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Prepare the question
prompt = "Alice has 5 apples. She gives 3 apples to Bob. How many apples does Alice have?"

# Classify the question and get the appropriate context
paradigm = sot.classify_question(prompt)
messages = sot.get_initialized_context(
    paradigm,
    prompt,
    format="llm",
    include_system_prompt=True
)

# Format for the model
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

# Generate the response
generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=512
)

# Strip the prompt tokens from each generated sequence
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

# Decode the response
response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
```

**Output:**

```
A = 5
A -= 3
A = 2
\boxed{2}
```

## Supported Formats

The SoT package supports multiple output formats:

- `"llm"`: Standard chat format for text-only LLMs
- `"vlm"`: Multimodal format for vision-language models
- `"raw"`: Raw exemplars without formatting

What's the difference?

### LLM Format

Standard `messages` format for Large Language Models.

```python
[
    {
        "role": "system",
        "content": "SYSTEM_PROMPT_HERE"
    },
    {
        "role": "user",
        "content": "EXAMPLE_QUESTION_HERE"
    },
    {
        "role": "assistant",
        "content": "EXAMPLE_ANSWER_HERE"
    },
    {
        "role": "user",
        "content": "USER_QUESTION_HERE"
    }
]
```

### VLM Format

Standard `messages` format for Large Vision-Language Models.

```python
[
    {
        "role": "system",
        "content": "SYSTEM_PROMPT_HERE"
    },
    {
        "role": "user",
        "content": [{"type": "text", "text": "EXAMPLE_QUESTION_HERE"}]
    },
    {
        "role": "assistant",
        "content": [{"type": "text", "text": "EXAMPLE_ANSWER_HERE"}]
    },
    {
        "role": "user",
        "content": [{"type": "text", "text": "USER_QUESTION_HERE"}]
    }
]
```

### Raw Format

Raw exemplar data. Apply your own format!

```python
[
    {
        "question": "EXAMPLE_QUESTION_HERE",
        "answer": "EXAMPLE_ANSWER_HERE"
    },
    {
        "question": "EXAMPLE_QUESTION_HERE",
        "answer": "EXAMPLE_ANSWER_HERE"
    }
]
```
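With the `raw` format, the prompt layout is entirely up to you. One minimal way to fold the exemplars into a plain few-shot prompt string is sketched below; the `build_prompt` helper and its `Q:`/`A:` layout are illustrative, not part of the package:

```python
def build_prompt(exemplars, question):
    """Fold raw SoT exemplars into a plain few-shot prompt string."""
    # Each exemplar dict has "question" and "answer" keys, as shown above
    parts = [f"Q: {ex['question']}\nA: {ex['answer']}" for ex in exemplars]
    parts.append(f"Q: {question}\nA:")  # leave the final answer for the model
    return "\n\n".join(parts)

# Example usage with exemplars as returned by the raw format
exemplars = [
    {"question": "EXAMPLE_QUESTION_HERE", "answer": "EXAMPLE_ANSWER_HERE"},
]
print(build_prompt(exemplars, "Alice has 5 apples. She gives 3 to Bob. How many are left?"))
```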
## Multilingual Support

SoT supports multiple languages. System prompts and exemplars are automatically loaded in the requested language.

## Paradigm Selection Model

SoT includes a pretrained DistilBERT model for automatic paradigm selection based on the question. The model is available on Hugging Face: [saytes/SoT_DistilBERT](https://huggingface.co/saytes/SoT_DistilBERT)

## Datasets

The SoT_DistilBERT model was evaluated on the following datasets:

| Dataset | HF ID | Subset | Split | Evaluation Type |
|---------|-------|--------|-------|-----------------|
| GSM8K | [gsm8k](https://huggingface.co/datasets/gsm8k) | main | test | numerical |
| SVAMP | [ChilleD/SVAMP](https://huggingface.co/datasets/ChilleD/SVAMP) | - | test | numerical |
| AQUA-RAT | [aqua_rat](https://huggingface.co/datasets/aqua_rat) | - | test | multiple_choice |
| DROP | [drop](https://huggingface.co/datasets/drop) | - | validation | open |
| OpenbookQA | [openbookqa](https://huggingface.co/datasets/openbookqa) | - | test | multiple_choice |
| StrategyQA | [ChilleD/StrategyQA](https://huggingface.co/datasets/ChilleD/StrategyQA) | - | test | yesno |
| LogiQA | [lucasmccabe/logiqa](https://huggingface.co/datasets/lucasmccabe/logiqa) | default | test | multiple_choice |
| ReClor | [metaeval/reclor](https://huggingface.co/datasets/metaeval/reclor) | - | validation | multiple_choice |
| HotPotQA | [hotpot_qa](https://huggingface.co/datasets/hotpot_qa) | distractor | validation | open |
| MuSiQue-Ans | [dgslibisey/MuSiQue](https://huggingface.co/datasets/dgslibisey/MuSiQue) | - | validation | open |
| QASC | [allenai/qasc](https://huggingface.co/datasets/allenai/qasc) | - | validation | multiple_choice |
| Worldtree | [nguyen-brat/worldtree](https://huggingface.co/datasets/nguyen-brat/worldtree) | - | train | multiple_choice |
| PubMedQA | [qiaojin/PubMedQA](https://huggingface.co/datasets/qiaojin/PubMedQA) | pqa_labeled | train | yesno |
| MedQA | [bigbio/med_qa](https://huggingface.co/datasets/bigbio/med_qa) | med_qa_en_source | validation | multiple_choice |

## Limitations

- The model is trained to classify questions into one of three predefined paradigms and may not generalize to tasks outside the training distribution.
- Performance may vary with the complexity and domain of the question.

## Citation

If you find our work helpful, please cite:

```
@misc{aytes2025sot,
      title={Sketch-of-Thought: Efficient LLM Reasoning with Adaptive Cognitive-Inspired Sketching},
      author={Simon A. Aytes and Jinheon Baek and Sung Ju Hwang},
      year={2025},
      eprint={2503.05179},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://hf.co/papers/2503.05179},
}
```

## License

This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.