Commit c5f5bdd (verified) by aberrio · 1 Parent(s): d26eb22

Create README.md

  1. README.md +66 -0
README.md ADDED
@@ -0,0 +1,66 @@
+ ---
+ license: llama3.2
+ license_link: https://huggingface.co/meta-llama/Llama-3.2-3B-Instruct/blob/main/LICENSE.txt
+ library: llama.cpp
+ library_link: https://github.com/ggerganov/llama.cpp
+ base_model:
+ - meta-llama/Llama-3.2-1B-Instruct
+ language:
+ - en
+ - de
+ - fr
+ - it
+ - pt
+ - hi
+ - es
+ - th
+ pipeline_tag: text-generation
+ tags:
+ - nlp
+ - code
+ - gguf
+ ---
+
+ ## LLaMA 3.2 1B Instruct
+
+ LLaMA 3.2 3B Instruct is a multilingual instruction-tuned language model with 3.21 billion parameters. Designed for diverse multilingual dialogue and summarization tasks, it offers effective performance on a range of NLP benchmarks.
+
+ ### Model Information
+ - **Name**: LLaMA 3.2 3B Instruct
+ - **Parameter Size**: 3B (3.21B)
+ - **Model Family**: LLaMA 3.2
+ - **Architecture**: Auto-regressive Transformer with Grouped-Query Attention (GQA)
+ - **Purpose**: Multilingual dialogue generation, text generation, and summarization.
+ - **Training Data**: A mix of publicly available multilingual data, covering up to 9T tokens.
+ - **Supported Languages**: English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai.
+ - **Release Date**: September 25, 2024
+ - **Context Length**: 128k tokens
+ - **Knowledge Cutoff**: December 2023
+
+ ### Quantized Model Files
+ - **Available Formats**:
+   - **ggml-model-q8_0.gguf**: 8-bit quantization for resource efficiency and good performance.
+   - **ggml-model-f16.gguf**: Half-precision (16-bit) floating-point format for enhanced precision.
+ - **Quantization Library**: llama.cpp
+ - **Use Cases**: Multilingual dialogue, summarization, and text generation.
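To make the trade-off between the two files concrete, here is a rough size estimate. It is a sketch under stated assumptions, not a measurement of the files in this repository: it takes the 3.21B parameter count from the Model Information list, assumes every weight is stored in the file's nominal format, and uses the llama.cpp `q8_0` block layout (32 int8 weights plus one 16-bit scale, i.e. 34 bytes per block). The function name `estimate_size_gb` is made up for illustration.

```python
# Rough on-disk size estimate for the q8_0 and f16 GGUF files.
# Assumption: all weights use the file's nominal format. Real GGUF files keep
# some small tensors at higher precision and add metadata, so actual sizes differ.

Q8_0_BLOCK_WEIGHTS = 32  # weights per q8_0 block in llama.cpp
Q8_0_BLOCK_BYTES = 34    # 32 int8 values + one 16-bit scale per block

BITS_PER_WEIGHT = {
    "f16": 16.0,
    "q8_0": Q8_0_BLOCK_BYTES * 8 / Q8_0_BLOCK_WEIGHTS,  # 8.5 bits per weight
}


def estimate_size_gb(n_params: float, fmt: str) -> float:
    """Approximate file size in gigabytes (10**9 bytes). Hypothetical helper."""
    return n_params * BITS_PER_WEIGHT[fmt] / 8 / 1e9


if __name__ == "__main__":
    n_params = 3.21e9  # parameter count as stated in the model card
    for fmt in ("q8_0", "f16"):
        print(f"{fmt}: ~{estimate_size_gb(n_params, fmt):.2f} GB")
```

Under these assumptions the `q8_0` file is a little over half the size of the `f16` file, since 8.5 bits per weight replaces 16.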
+
+ ### Core Library
+ LLaMA 3.2 1B Instruct can be deployed using `llama.cpp` or `transformers`, with a focus on streamlined integration into the Hugging Face ecosystem.
+
+ - **Primary Framework**: `llama.cpp`
+ - **Alternate Frameworks**:
+   - `transformers` for Hugging Face model support.
+   - `vLLM` for optimized inference and low-latency deployments.
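As an illustration of the primary `llama.cpp` path, the sketch below assembles a `llama-cli` invocation for one of the GGUF files. It assumes a recent llama.cpp build whose command-line binary is named `llama-cli` and is on `PATH`, and that the model file sits in the working directory. The helper name `build_llama_cli_command`, the prompt, and the default token and context values are illustrative choices, not settings taken from this model card.

```python
import os
import shlex
import shutil
import subprocess


def build_llama_cli_command(model_path, prompt, max_tokens=128, ctx_size=4096):
    """Return an argv list for llama.cpp's llama-cli (hypothetical helper)."""
    return [
        "llama-cli",
        "-m", model_path,       # path to the GGUF model file
        "-p", prompt,           # prompt text
        "-n", str(max_tokens),  # maximum number of tokens to generate
        "-c", str(ctx_size),    # context window size in tokens
    ]


if __name__ == "__main__":
    cmd = build_llama_cli_command(
        "ggml-model-q8_0.gguf",
        "Translate 'good morning' to German.",
    )
    print(shlex.join(cmd))
    # Run only when both the binary and the model file are actually present.
    if shutil.which(cmd[0]) and os.path.exists(cmd[2]):
        subprocess.run(cmd, check=True)
```

Passing the arguments as a list rather than a shell string avoids quoting problems with prompts that contain spaces or quotes.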
+
+ **Library and Model Links**:
+ - **Model Base**: [meta-llama/Llama-3.2-3B-Instruct](https://huggingface.co/meta-llama/Llama-3.2-3B-Instruct)
+ - **Models**: [meta-llama/llama-stack](https://github.com/meta-llama/llama-stack)
+ - **Inference Support**: [meta-llama/llama](https://github.com/meta-llama/llama)
+ - **Quantization**: [ggerganov/llama.cpp](https://github.com/ggerganov/llama.cpp)
+
+ ### Safety and Responsible Use
+ LLaMA 3.2 3B has been designed with safety in mind but may produce biased, harmful, or unpredictable outputs, especially for less-covered languages or specific prompts.
+
+ - **Testing and Risk Assessment**: Initial testing has primarily focused on English; coverage for other languages is ongoing.
+ - **Limitations**: LLaMA 3.2 may not fully adhere to user instructions or safety guidelines, and may exhibit unexpected behaviors.
+ - **Responsible Use Guidelines**: Refer to the [Responsible Use Guide](https://ai.meta.com/llama/responsible-use-guide/) for more details.