---
library_name: transformers
tags:
- merge
- sliced
- minimalist
license: apache-2.0
metrics:
- accuracy
- bleu
---

# Model Card for Model ID

## Model Details

### Model Description

This is the model card of a 🤗 transformers model that has been pushed on the Hub. This model card has been automatically generated.

- **Developed by:** Tatman Electric
- **Funded by [optional]:** Spare Pocket Lint
- **Shared by [optional]:** TRL
- **Model type:** Sliced Layered
- **Language(s) (NLP):** Mixed
- **License:** Pythia @ EleutherAI
- **Finetuned from model [optional]:** EleutherAI/pythia-2.8b-deduped

### Model Sources [optional]

- **Repository:** [More Information Needed]
- **Paper [optional]:** [More Information Needed]
- **Demo [optional]:** [More Information Needed]

## Uses

Before there were merged models, there were slices of shards of... stuff. Those slices have meaning. Those slices are real slices too.

### Direct Use

Part of a series of slice and dice mods.

##### Single Hidden Layer Pythia

What does a single hidden layer preserve from a 12 layer base model?

[More Information Needed]

### Downstream Use [optional]

[More Information Needed]

### Out-of-Scope Use

[More Information Needed]

## Bias, Risks, and Limitations

[More Information Needed]

### Recommendations

Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.

## How to Get Started with the Model

Use the code below to get started with the model.
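
A minimal sketch of loading and prompting the model, assuming the repository ships a standard `transformers` config, weights, and tokenizer. The card does not state the repository ID, so `repo_id` must be supplied by the reader. The helper names `load_sliced_model` and `generate_text` are illustrative and not part of any library.

```python
def load_sliced_model(repo_id: str):
    """Load the tokenizer and causal-LM weights for `repo_id`.

    `repo_id` is a Hub repository ID or a local directory. The card does not
    give one, so the caller has to supply it (assumption).
    """
    # Imported lazily so the helpers stay importable without transformers.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    model = AutoModelForCausalLM.from_pretrained(repo_id)
    model.eval()  # inference mode: disables dropout
    return tokenizer, model


def generate_text(tokenizer, model, prompt: str, max_new_tokens: int = 32) -> str:
    """Greedy-decode a continuation of `prompt` and return it as text."""
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        do_sample=False,  # greedy decoding for repeatable output
        # Reuse EOS as padding in case the tokenizer defines no pad token.
        pad_token_id=tokenizer.eos_token_id,
    )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


# Example call (replace the placeholder with the real repository ID):
#   tokenizer, model = load_sliced_model("<namespace>/<model-id>")
#   print(generate_text(tokenizer, model, "Once upon a time"))
```

Given the near-chance benchmark scores reported under Evaluation, the output of a single-layer slice is likely to be of limited coherence; the sketch is mainly useful for inspecting what the slice retains.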
[More Information Needed]

## Training Details

### Training Data

[More Information Needed]

### Training Procedure

#### Preprocessing [optional]

[More Information Needed]

#### Training Hyperparameters

- **Training regime:** [More Information Needed]

#### Speeds, Sizes, Times [optional]

[More Information Needed]

## Evaluation

|       Groups       |Version|     Filter     |n-shot|  Metric   | Value |   |Stderr|
|--------------------|-------|----------------|-----:|-----------|------:|---|-----:|
|Open LLM Leaderboard|N/A    |none            |     5|rouge1_max |36.3550|±  |0.9462|
|                    |       |flexible-extract|     5|exact_match| 0.0220|±  |0.0066|
| - arc_challenge    |      1|none            |    25|acc        | 0.1760|±  |0.0170|
|                    |       |none            |    25|acc_norm   | 0.2320|±  |0.0189|
| - gsm8k            |      3|strict-match    |     5|exact_match| 0.0060|±  |0.0035|
|                    |       |flexible-extract|     5|exact_match| 0.0220|±  |0.0066|
| - hellaswag        |      1|none            |    10|acc        | 0.3520|±  |0.0214|
|                    |       |none            |    10|acc_norm   | 0.4040|±  |0.0220|
| - winogrande       |      1|none            |     5|acc        | 0.5120|±  |0.0224|
|                    |       |none            |     5|bleu_diff  |-0.6500|±  |0.6421|
|                    |       |none            |     5|rouge1_acc | 0.3700|±  |0.0216|
|                    |       |none            |     5|rouge1_diff|-1.5564|±  |1.0223|
|                    |       |none            |     5|acc        | 0.2664|±  |0.0036|
|                    |       |none            |     5|rougeL_max |33.8798|±  |0.9367|
|                    |       |none            |     5|rouge2_diff|-3.3178|±  |0.9477|
|                    |       |none            |     5|bleu_max   |15.2292|±  |0.6714|
|                    |       |none            |     5|bleu_acc   | 0.4360|±  |0.0222|
|                    |       |none            |     5|rouge2_max |16.4873|±  |1.0172|
|                    |       |none            |     5|acc_norm   | 0.3180|±  |0.0145|
|                    |       |strict-match    |     5|exact_match| 0.0060|±  |0.0035|
|                    |       |none            |     5|rougeL_diff|-0.7765|±  |1.0034|
|                    |       |none            |     5|rougeL_acc | 0.3860|±  |0.0218|
|                    |       |none            |     5|rouge2_acc | 0.1920|±  |0.0176|
| - mmlu             |N/A    |none            |     0|acc        | 0.2533|±  |0.0039|
| - humanities       |N/A    |none            |     5|acc        | 0.2408|±  |0.0075|
| - other            |N/A    |none            |     5|acc        | 0.2443|±  |0.0080|
| - social_sciences  |N/A    |none            |     5|acc        | 0.2538|±  |0.0081|
| - stem             |N/A    |none            |     5|acc        | 0.2740|±  |0.0079|
| - truthfulqa       |N/A    |none            |     0|rouge1_max |36.3550|±  |0.9462|
|                    |       |none            |     0|bleu_diff  |-0.6500|±  |0.6421|
|                    |       |none            |     0|rouge1_acc | 0.3700|±  |0.0216|
|                    |       |none            |     0|rouge1_diff|-1.5564|±  |1.0223|
|                    |       |none            |     0|acc        | 0.3435|±  |0.0137|
|                    |       |none            |     0|rougeL_max |33.8798|±  |0.9367|
|                    |       |none            |     0|bleu_max   |15.2292|±  |0.6714|
|                    |       |none            |     0|bleu_acc   | 0.4360|±  |0.0222|
|                    |       |none            |     0|rouge2_max |16.4873|±  |1.0172|
|                    |       |none            |     0|rougeL_acc | 0.3860|±  |0.0218|
|                    |       |none            |     0|rougeL_diff|-0.7765|±  |1.0034|
|                    |       |none            |     0|rouge2_acc | 0.1920|±  |0.0176|
|                    |       |none            |     0|rouge2_diff|-3.3178|±  |0.9477|

### Testing Data, Factors & Metrics

#### Testing Data

[More Information Needed]

#### Factors

[More Information Needed]

#### Metrics

[More Information Needed]

### Results

[More Information Needed]

#### Summary

## Model Examination [optional]

[More Information Needed]

## Environmental Impact

Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).

- **Hardware Type:** OldAsDirt
- **Hours used:** 5
- **Cloud Provider:** YourMomsBasement
- **Compute Region:** Siberia
- **Carbon Emitted:** 8ppm

No yaks were harmed in the making of this model.

## Technical Specifications [optional]

### Model Architecture and Objective

[More Information Needed]

### Compute Infrastructure

[More Information Needed]

#### Hardware

[More Information Needed]

#### Software

[More Information Needed]

## Citation [optional]

**BibTeX:**

[More Information Needed]

**APA:**

[More Information Needed]

## Glossary [optional]

[More Information Needed]

## More Information [optional]

[More Information Needed]

## Model Card Authors [optional]

[More Information Needed]

## Model Card Contact

[More Information Needed]