Add paper abstract and link to model card (#1)
- Add paper abstract and link to model card (e1b5c5fb2792904f3ae5487ec6acd59a2aee5da0)
Co-authored-by: Niels Rogge <[email protected]>
README.md
CHANGED
@@ -1,7 +1,4 @@
 ---
-license: mit
-language:
-- en
 base_model:
 - distilbert/distilbert-base-uncased
 datasets:
@@ -18,7 +15,10 @@ datasets:
 - allenai/qasc
 - nguyen-brat/worldtree
 - qiaojin/PubMedQA
+language:
+- en
 library_name: transformers
+license: mit
 tags:
 - text-classification
 - sketch-of-thought
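The two hunks above only reorder the YAML front matter, so for reference this is how the resulting metadata block reads when the pieces are put back together; the dataset entries on the lines between `datasets:` and `- allenai/qasc` are not shown in this diff and are elided here:

```yaml
---
base_model:
- distilbert/distilbert-base-uncased
datasets:
# ... dataset entries not shown in this diff ...
- allenai/qasc
- nguyen-brat/worldtree
- qiaojin/PubMedQA
language:
- en
library_name: transformers
license: mit
tags:
- text-classification
- sketch-of-thought
---
```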
@@ -44,6 +44,7 @@ Unlike conventional Chain of Thought (CoT) approaches that produce verbose reaso
 
 - **Expert Lexicons**: Leverages domain-specific shorthand, technical symbols, and jargon for precise and efficient communication. Suited for technical disciplines requiring maximum information density.
 
+
 ## Loading the Model
 
 This repository contains the DistilBERT paradigm selection model for the Sketch-of-Thought (SoT) framework. You can load and use it directly with Hugging Face Transformers:
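The hunk above ends exactly where the README's own loading snippet begins, which the diff does not show. As a point of reference, a minimal sketch using the standard Transformers Auto classes (not the model card's verbatim example) would look like this:

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Paradigm selection classifier for the Sketch-of-Thought (SoT) framework
model_id = "saytes/SoT_DistilBERT"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
```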
@@ -193,6 +194,8 @@ The SoT package supports multiple output formats:
 - `"vlm"`: Multimodal format for vision-language models
 - `"raw"`: Raw exemplars without formatting
 
+
+
 <details>
 <summary>What's the difference?</summary>
 
@@ -268,6 +271,31 @@ The SoT package supports multiple output formats:
 
 SoT supports multiple languages. System prompts and exemplars are automatically loaded in the requested language.
 
+## Paradigm Selection Model
+
+SoT includes a pretrained DistilBERT model for automatic paradigm selection based on the question. The model is available on Hugging Face: [saytes/SoT_DistilBERT](https://huggingface.co/saytes/SoT_DistilBERT)
+
+## Datasets
+
+The SoT_DistilBERT model was evaluated on the following datasets:
+
+| Dataset | HF ID | Subset | Split | Evaluation Type |
+|---------|-------|--------|-------|-----------------|
+| GSM8K | [gsm8k](https://huggingface.co/datasets/gsm8k) | main | test | numerical |
+| SVAMP | [ChilleD/SVAMP](https://huggingface.co/datasets/ChilleD/SVAMP) | - | test | numerical |
+| AQUA-RAT | [aqua_rat](https://huggingface.co/datasets/aqua_rat) | - | test | multiple_choice |
+| DROP | [drop](https://huggingface.co/datasets/drop) | - | validation | open |
+| OpenbookQA | [openbookqa](https://huggingface.co/datasets/openbookqa) | - | test | multiple_choice |
+| StrategyQA | [ChilleD/StrategyQA](https://huggingface.co/datasets/ChilleD/StrategyQA) | - | test | yesno |
+| LogiQA | [lucasmccabe/logiqa](https://huggingface.co/datasets/lucasmccabe/logiqa) | default | test | multiple_choice |
+| Reclor | [metaeval/reclor](https://huggingface.co/datasets/metaeval/reclor) | - | validation | multiple_choice |
+| HotPotQA | [hotpot_qa](https://huggingface.co/datasets/hotpot_qa) | distractor | validation | open |
+| MuSiQue-Ans | [dgslibisey/MuSiQue](https://huggingface.co/datasets/dgslibisey/MuSiQue) | - | validation | open |
+| QASC | [allenai/qasc](https://huggingface.co/datasets/allenai/qasc) | - | validation | multiple_choice |
+| Worldtree | [nguyen-brat/worldtree](https://huggingface.co/datasets/nguyen-brat/worldtree) | - | train | multiple_choice |
+| PubMedQA | [qiaojin/PubMedQA](https://huggingface.co/datasets/qiaojin/PubMedQA) | pqa_labeled | train | yesno |
+| MedQA | [bigbio/med_qa](https://huggingface.co/datasets/bigbio/med_qa) | med_qa_en_source | validation | multiple_choice |
+
 ## Limitations
 
 - The model is trained to classify questions into one of three predefined paradigms and may not generalize to tasks outside the training distribution.
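The `+` lines above add the paradigm selection section to the README. To make the routing step concrete, here is a hedged usage sketch built only on standard Transformers calls; the mapping from class ids to the three SoT paradigms is assumed to live in the model config's `id2label`, so check the model card if it only exposes generic `LABEL_*` names:

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "saytes/SoT_DistilBERT"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

question = "A train travels 60 km in 45 minutes. What is its average speed in km/h?"
inputs = tokenizer(question, return_tensors="pt", truncation=True)

with torch.no_grad():
    logits = model(**inputs).logits

# Pick the highest-scoring paradigm class for this question.
pred_id = int(logits.argmax(dim=-1))
print(model.config.id2label[pred_id])  # assumed to name one of the three SoT paradigms
```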
@@ -285,7 +313,7 @@ If you find our work helpful, please cite:
 eprint={2503.05179},
 archivePrefix={arXiv},
 primaryClass={cs.CL},
-url={https://
+url={https://hf.co/papers/2503.05179},
 }
 ```
 