teleprint-me/llama-3.2-3b-instruct

Text Generation

aberrio committed on Sep 30, 2024

Commit c5f5bdd · verified · 1 Parent(s): d26eb22

Create README.md

Files changed (1)

README.md +66 -0

	@@ -0,0 +1,66 @@

+---
+license: llama3.2
+license_link: https://huggingface.co/meta-llama/Llama-3.2-3B-Instruct/blob/main/LICENSE.txt
+library: llama.cpp
+library_link: https://github.com/ggerganov/llama.cpp
+base_model:
+  - meta-llama/Llama-3.2-3B-Instruct
+language:
+  - en
+  - de
+  - fr
+  - it
+  - pt
+  - hi
+  - es
+  - th
+pipeline_tag: text-generation
+tags:
+  - nlp
+  - code
+  - gguf
+---
+## LLaMA 3.2 3B Instruct
+LLaMA 3.2 3B Instruct is a multilingual, instruction-tuned language model with 3.21 billion parameters. It is designed for dialogue and summarization tasks across its supported languages and performs well on a range of NLP benchmarks.
+### Model Information
+- **Name**: LLaMA 3.2 3B Instruct
+- **Parameter Size**: 3B (3.21B)
+- **Model Family**: LLaMA 3.2
+- **Architecture**: Auto-regressive Transformer with Grouped-Query Attention (GQA)
+- **Purpose**: Multilingual dialogue generation, text generation, and summarization.
+- **Training Data**: A mix of publicly available multilingual data, covering up to 9T tokens.
+- **Supported Languages**: English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai.
+- **Release Date**: September 25, 2024
+- **Context Length**: 128k tokens
+- **Knowledge Cutoff**: December 2023
+### Quantized Model Files
+- **Available Formats**:
+  - **ggml-model-q8_0.gguf**: 8-bit quantization; roughly half the size of the f16 file, with little loss in output quality.
+  - **ggml-model-f16.gguf**: Half-precision (16-bit) floating-point weights, unquantized; the more precise of the two files.
+- **Quantization Library**: llama.cpp
+- **Use Cases**: Multilingual dialogue, summarization, and text generation.
+### Core Library
+LLaMA 3.2 3B Instruct can be deployed using `llama.cpp` or `transformers`, with a focus on streamlined integration into the Hugging Face ecosystem.
+- **Primary Framework**: `llama.cpp`
+- **Alternate Frameworks**:
+  - `transformers` for Hugging Face model support.
+  - `vLLM` for optimized inference and low-latency deployments.
+**Library and Model Links**:
+- **Model Base**: [meta-llama/Llama-3.2-3B-Instruct](https://huggingface.co/meta-llama/Llama-3.2-3B-Instruct)
+- **Models**: [meta-llama/llama-stack](https://github.com/meta-llama/llama-stack)
+- **Inference Support**: [meta-llama/llama](https://github.com/meta-llama/llama)
+- **Quantization**: [ggerganov/llama.cpp](https://github.com/ggerganov/llama.cpp)
+### Safety and Responsible Use
+LLaMA 3.2 3B has been designed with safety in mind but may produce biased, harmful, or unpredictable outputs, especially for less-covered languages or specific prompts.
+- **Testing and Risk Assessment**: Initial testing has primarily focused on English; coverage for other languages is ongoing.
+- **Limitations**: LLaMA 3.2 may not fully adhere to user instructions or safety guidelines, and may exhibit unexpected behaviors.
+- **Responsible Use Guidelines**: Refer to the [Responsible Use Guide](https://ai.meta.com/llama/responsible-use-guide/) for more details.
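
As a rough sanity check on the two files listed under Quantized Model Files, the sketch below estimates their sizes from the stated 3.21B parameter count. It assumes llama.cpp's `q8_0` layout (blocks of 32 int8 weights plus one 16-bit scale, about 8.5 bits per weight) and counts weights only. The constant and function names are illustrative, and real files will differ slightly because of metadata and tensors stored at other precisions.

```python
# Rough size estimate for the two GGUF files listed in the README.
# Assumption: only weight tensors are counted; metadata, the tokenizer,
# and any tensors kept at a different precision are ignored.

PARAMS = 3.21e9  # parameter count stated in the README

# f16 stores each weight in 2 bytes.
F16_BITS_PER_WEIGHT = 16.0

# Assumed q8_0 block layout: 32 int8 values plus one 16-bit scale,
# i.e. 34 bytes for every 32 weights.
Q8_0_BLOCK_WEIGHTS = 32
Q8_0_BLOCK_BYTES = 32 + 2
Q8_0_BITS_PER_WEIGHT = Q8_0_BLOCK_BYTES * 8 / Q8_0_BLOCK_WEIGHTS


def estimate_gb(params: float, bits_per_weight: float) -> float:
    """Return the approximate weight size in gigabytes (10**9 bytes)."""
    return params * bits_per_weight / 8 / 1e9


if __name__ == "__main__":
    for name, bits in [("f16", F16_BITS_PER_WEIGHT), ("q8_0", Q8_0_BITS_PER_WEIGHT)]:
        print(f"{name}: {bits:.1f} bits/weight, ~{estimate_gb(PARAMS, bits):.2f} GB")
```

Under these assumptions the q8_0 file comes out at roughly half the size of the f16 file, which is the main reason to prefer it on memory-constrained hardware.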