Triangle104 committed · verified · Commit b6b6161 · 1 parent: 66eb2ce

Update README.md

Files changed (1): README.md (+74, -0)

README.md CHANGED
@@ -17,6 +17,80 @@
This model was converted to GGUF format from [`Spestly/Athena-1-1.5B`](https://huggingface.co/Spestly/Athena-1-1.5B) using llama.cpp via ggml.ai's [GGUF-my-repo](https://huggingface.co/spaces/ggml-org/gguf-my-repo) space.
Refer to the [original model card](https://huggingface.co/Spestly/Athena-1-1.5B) for more details on the model.

---
## Model details

Athena-1 1.5B is a fine-tuned, instruction-following large language model derived from Qwen/Qwen2.5-1.5B-Instruct. Designed for efficiency and high-quality text generation, Athena-1 1.5B maintains a compact size, making it ideal for real-world applications where performance and resource efficiency are critical, such as lightweight applications, conversational AI, and structured data tasks.

### Key Features

#### ⚡ Lightweight and Efficient

- Compact Size: At just 1.5 billion parameters, Athena-1 1.5B offers excellent performance with reduced computational requirements.
- Instruction Following: Fine-tuned for precise and reliable adherence to user prompts.
- Coding and Mathematics: Proficient in solving coding challenges and handling mathematical tasks.

#### 📖 Long-Context Understanding

- Context Length: Supports up to 32,768 tokens, enabling the processing of moderately lengthy documents or conversations.
- Token Generation: Can generate up to 8K tokens of output (see the sketch below).
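
A minimal sketch of working within these limits with the transformers pipeline; the prompt, the placeholder document, and the max_new_tokens value are illustrative choices, not values prescribed by the card:

```python
# Illustrative: long-input summarization with a capped output length.
from transformers import pipeline

pipe = pipeline("text-generation", model="Spestly/Athena-1-1.5B")

long_document = "..."  # an input of up to ~32K tokens fits in the context window

messages = [{"role": "user", "content": f"Summarize this document:\n{long_document}"}]
# Keep generation well under the stated ~8K-token output limit.
result = pipe(messages, max_new_tokens=1024)
print(result[0]["generated_text"][-1]["content"])  # the assistant's reply
```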

#### 🌍 Multilingual Support

- Supports 29+ languages, including English, Chinese, French, Spanish, Portuguese, German, Italian, Russian, Japanese, Korean, Vietnamese, Thai, Arabic, and more.

#### 📊 Structured Data & Outputs

- Structured Data Interpretation: Processes structured formats like tables and JSON.
- Structured Output Generation: Generates well-formatted outputs, including JSON and other structured formats (see the sketch below).
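
A hedged sketch of the structured-output use: prompt the model to answer in strict JSON and parse defensively. The prompt, the key names, and the max_new_tokens value are illustrative, not from the card:

```python
import json

from transformers import pipeline

pipe = pipeline("text-generation", model="Spestly/Athena-1-1.5B")

# Illustrative prompt asking for a strict JSON answer.
messages = [{
    "role": "user",
    "content": (
        'Extract the city and population as JSON with keys "city" and '
        '"population": Berlin has about 3.7 million residents.'
    ),
}]
reply = pipe(messages, max_new_tokens=128)[0]["generated_text"][-1]["content"]

# A small model will not always emit valid JSON, so parse defensively.
try:
    print(json.loads(reply))
except json.JSONDecodeError:
    print("Not valid JSON:", reply)
```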

### Model Details

- Base Model: Qwen/Qwen2.5-1.5B-Instruct
- Architecture: Transformer with RoPE, SwiGLU, RMSNorm, attention QKV bias, and tied word embeddings.
- Parameters: ~1.54B total (~1.31B non-embedding, per the Qwen2.5-1.5B base model).
- Layers: 28 (per the base model).
- Attention Heads: 12 for queries, 2 for keys/values (grouped-query attention, per the base model).
- Context Length: Up to 32,768 tokens.
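
Because the parameter, layer, and head counts above are taken from the Qwen2.5-1.5B base model rather than stated in the original card, they can be double-checked against the repo's own config (a sketch using the standard transformers config attributes for Qwen2-family models):

```python
from transformers import AutoConfig

# Read the architecture details straight from the model's config.json.
cfg = AutoConfig.from_pretrained("Spestly/Athena-1-1.5B")
print("layers:", cfg.num_hidden_layers)
print("query heads:", cfg.num_attention_heads)
print("key/value heads:", cfg.num_key_value_heads)  # grouped-query attention
print("max context:", cfg.max_position_embeddings)
print("tied embeddings:", cfg.tie_word_embeddings)
```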

### Applications

Athena-1 1.5B is designed for a variety of real-world applications:

- Conversational AI: Build fast, responsive, and lightweight chatbots.
- Code Generation: Generate, debug, or explain code snippets.
- Mathematical Problem Solving: Assist with calculations and reasoning.
- Document Processing: Summarize and analyze moderately large documents.
- Multilingual Applications: Support global use cases with diverse language requirements.
- Structured Data: Process and generate structured data, such as tables and JSON.

### Quickstart

Here’s how you can use Athena-1 1.5B for quick text generation:

```python
# Use a pipeline as a high-level helper
from transformers import pipeline

messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe = pipeline("text-generation", model="Spestly/Athena-1-1.5B")
print(pipe(messages))

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("Spestly/Athena-1-1.5B")
model = AutoModelForCausalLM.from_pretrained("Spestly/Athena-1-1.5B")
```
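
The direct-load snippet stops after loading; here is a minimal continuation that actually generates a reply via the standard chat-template flow (the max_new_tokens value is an arbitrary choice):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Spestly/Athena-1-1.5B")
model = AutoModelForCausalLM.from_pretrained("Spestly/Athena-1-1.5B")

messages = [{"role": "user", "content": "Who are you?"}]

# Apply the chat template, generate, and decode only the newly generated tokens.
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)
outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```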

---
## Use with llama.cpp
Install llama.cpp through brew (works on Mac and Linux)
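
Alternatively, the GGUF can be loaded from Python via the llama-cpp-python bindings. This is a hedged sketch, not part of this card's own instructions, and the repo id and quant filename below are hypothetical placeholders:

```python
# Hedged sketch using llama-cpp-python (pip install llama-cpp-python).
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="Triangle104/Athena-1-1.5B-GGUF",  # hypothetical placeholder repo id
    filename="*q8_0.gguf",                     # hypothetical quant file pattern
    n_ctx=32768,                               # match the model's context window
)
out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Who are you?"}]
)
print(out["choices"][0]["message"]["content"])
```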