# TestLlama 3.2 Test Model
This model has no pretrained weights. It will not generate meaningful outputs.
## Model Description

This is a lobotomized version of the Llama 3.2 architecture, created specifically for testing and development purposes. It maintains the architectural structure of Llama 3.2 but with dramatically reduced dimensions, yielding an extremely lightweight model for debugging pipelines against a close-to-real model.
## Intended Use
- Software testing: API integration testing, pipeline validation
- Development environments: Testing code without heavy hardware requirements
- CI/CD pipelines: Automated testing with minimal resource requirements (see the pytest sketch after this list)
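To illustrate the CI/CD use case, a smoke test might look like the following sketch. The test and fixture names are hypothetical; only `pytest` and `transformers` are assumed to be installed.

```python
import pytest
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "vaughankraska/TestLlama3.2ish"

@pytest.fixture(scope="session")
def model_and_tokenizer():
    # Loading is fast because the model is only ~72M parameters.
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)
    return model, tokenizer

def test_generate_smoke(model_and_tokenizer):
    model, tokenizer = model_and_tokenizer
    inputs = tokenizer("smoke test", return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=5)
    # Outputs are random tokens; assert only that generation extended the input.
    assert outputs.shape[1] > inputs.input_ids.shape[1]
```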
## Model Details
- Framework: Hugging Face Transformers
- Architecture: Llama 3.2 (scaled down)
- Parameter count: ~72M (see the sanity check below)
- Architecture configuration:
  - `hidden_size`: 512 (reduced from 2048)
  - `intermediate_size`: 1024 (reduced from 8192)
  - `num_hidden_layers`: 2 (reduced from 16)
  - `num_attention_heads`: 8 (reduced from 32)
  - `num_key_value_heads`: 2 (reduced from 8)
  - `vocab_size`: 128256 (maintained from the original)
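As a sanity check on the stated parameter count, the published configuration can be loaded and a randomly initialized model built from it. This sketch assumes the config ships with the repository used in the usage example below; `from_config` builds the model without downloading any checkpoint.

```python
from transformers import AutoConfig, AutoModelForCausalLM

config = AutoConfig.from_pretrained("vaughankraska/TestLlama3.2ish")
model = AutoModelForCausalLM.from_config(config)  # random weights, no checkpoint download

n_params = sum(p.numel() for p in model.parameters())
print(f"{n_params / 1e6:.1f}M parameters")  # expected to land near ~72M
```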
## Important Limitations
- Not for production use: This model contains random weights and is not trained
- No meaningful outputs: The model will produce random token sequences
- Architectural test only: This is purely for testing software compatibility
- Not for benchmarking: Performance metrics derived from this model are not representative
## Usage Notes
This model is intentionally created with random weights and a minimized architecture. It will not produce coherent or meaningful text. It is specifically designed for:
- Testing inference pipelines
- Validating model loading/saving (see the round-trip sketch after this list)
- Testing quantization workflows
- Architectural compatibility testing
- Software development with minimal resource requirements
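For example, a loading/saving validation can be a simple round-trip through a temporary directory. Because the weights are random, assertions should target tensor shapes rather than generated text; this is a sketch, not part of the model's own test suite.

```python
import tempfile

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "vaughankraska/TestLlama3.2ish"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Round-trip: save to a temporary directory, then reload from disk.
with tempfile.TemporaryDirectory() as tmp:
    model.save_pretrained(tmp)
    tokenizer.save_pretrained(tmp)
    reloaded = AutoModelForCausalLM.from_pretrained(tmp)

inputs = tokenizer("ping", return_tensors="pt")
with torch.no_grad():
    logits = reloaded(**inputs).logits
# Shapes are deterministic even though the values are random.
assert logits.shape == (1, inputs.input_ids.shape[1], reloaded.config.vocab_size)
```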
## Example Usage
```python
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the model and tokenizer
model_id = "vaughankraska/TestLlama3.2ish"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Test generation (outputs will be random); passing **inputs also
# forwards the attention mask, which avoids a generation warning.
inputs = tokenizer("Hello world", return_tensors="pt")
outputs = model.generate(**inputs, max_length=20)
print(tokenizer.decode(outputs[0]))
```
## Creation Method

This model was created by the following steps (sketched in code after the list):
- Defining a minimal LlamaConfig with dramatically reduced dimensions
- Initializing a model with random weights
- Preserving architectural patterns (like GQA, RoPE settings)
- Using the authentic tokenizer from Llama 3.2
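A rough sketch of those steps follows. The field values beyond the list under Model Details (e.g. `tie_word_embeddings`, RoPE settings) and the tokenizer source repo are assumptions, not a verbatim reproduction of the creation script.

```python
from transformers import AutoTokenizer, LlamaConfig, LlamaForCausalLM

# Minimal config mirroring the dimensions listed under Model Details.
config = LlamaConfig(
    hidden_size=512,
    intermediate_size=1024,
    num_hidden_layers=2,
    num_attention_heads=8,
    num_key_value_heads=2,     # fewer KV heads than attention heads -> GQA preserved
    vocab_size=128256,
    tie_word_embeddings=True,  # assumed, to stay near the stated ~72M total
)

# Random initialization: no pretrained weights are ever loaded.
model = LlamaForCausalLM(config)

# The authentic Llama 3.2 tokenizer (assumed source; the Meta repo is gated).
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.2-1B")

model.save_pretrained("TestLlama3.2ish")
tokenizer.save_pretrained("TestLlama3.2ish")
```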
## License
MIT. It contains no trained weights from Meta's Llama 3.2 models.