EssentialAI/eai-distill-0.5b


Research-EAI committed on Jun 16

Commit 1150f1a · verified · 1 parent: e5c15c1

Update README.md

Files changed (1)

README.md +4 -4


@@ -1,11 +1,11 @@
 ---
 license: apache-2.0
 ---
-# 🏷️ EAI-Taxonomy-0.5b
 ## 📋 Model Description
-EAI-Taxonomy-0.5b is a fine-tuned version of Qwen2.5-0.5B-Instruct designed for document classification across 12 taxonomic categories. This model is optimized for high-throughput classification of web documents and produces structured metadata for large-scale dataset curation.
 The model classifies documents across the following dimensions:
 - **📚 Free Decimal Correspondence (FDC)**: Subject matter classification based on the Dewey Decimal System
@@ -35,8 +35,8 @@ from transformers import AutoTokenizer, AutoModelForCausalLM
 import random
 # Load model and tokenizer
-tokenizer = AutoTokenizer.from_pretrained("your-org/EAI-Taxonomy-0.5b", trust_remote_code=True)
-model = AutoModelForCausalLM.from_pretrained("your-org/EAI-Taxonomy-0.5b")
 def chunk_text(text, max_char_per_doc=30000):
     if len(text) <= max_char_per_doc:

 ---
 license: apache-2.0
 ---
+# 🏷️ EAI-Distill-0.5b
 ## 📋 Model Description
+EAI-Distill-0.5b is a fine-tuned version of Qwen2.5-0.5B-Instruct designed for document classification across 12 taxonomic categories. This model is optimized for high-throughput classification of web documents and produces structured metadata for large-scale dataset curation.
 The model classifies documents across the following dimensions:
 - **📚 Free Decimal Correspondence (FDC)**: Subject matter classification based on the Dewey Decimal System
 import random
 # Load model and tokenizer
+tokenizer = AutoTokenizer.from_pretrained("EssentialAI/EAI-Distill-0.5b", trust_remote_code=True)
+model = AutoModelForCausalLM.from_pretrained("EssentialAI/EAI-Distill-0.5b")
 def chunk_text(text, max_char_per_doc=30000):
     if len(text) <= max_char_per_doc:
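
The hunk ends right after the length check in `chunk_text`, so the function body is not visible in this diff. The sketch below is an illustration only, not the README's implementation. It shows one way a helper with this signature could cap document length: return short text unchanged, otherwise keep the beginning, a randomly placed middle window, and the end. The name `chunk_text_sketch`, the `rng` parameter, and the three-way split are all assumptions.

```python
import random


def chunk_text_sketch(text, max_char_per_doc=30000, rng=random):
    """Hypothetical length cap; not taken from the README."""
    # Short documents pass through unchanged, matching the visible check.
    if len(text) <= max_char_per_doc:
        return text

    # Assumption: split the character budget into three equal parts.
    chunk_size = max_char_per_doc // 3
    start = text[:chunk_size]
    end = text[len(text) - chunk_size:]

    # Choose a middle window that overlaps neither the start nor the end.
    lowest = chunk_size
    highest = len(text) - 2 * chunk_size
    mid_start = rng.randint(lowest, highest)
    middle = text[mid_start:mid_start + chunk_size]

    return start + middle + end
```

Passing a seeded `random.Random` instance as `rng` makes the middle window reproducible, which may help when comparing classifications across runs.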