AnySecret Assistant - Multi-Model Collection

A specialized AI assistant collection for AnySecret configuration management, available in multiple sizes and formats optimized for different use cases and deployment scenarios.

πŸš€ Available Models

Model | Base Model              | Parameters | Format       | Best For                         | Memory
------|-------------------------|------------|--------------|----------------------------------|--------
3B    | Llama-3.2-3B-Instruct   | 3B         | PyTorch/GGUF | Fast responses, edge deployment  | 4-6GB
7B    | CodeLlama-7B-Instruct   | 7B         | PyTorch/GGUF | Balanced performance, code focus | 8-12GB
13B   | CodeLlama-13B-Instruct  | 13B        | PyTorch/GGUF | Highest quality, complex queries | 16-24GB

Model Variants

PyTorch Models (LoRA Adapters)

  • anysecret-io/anysecret-assistant/3B/ - Llama-3.2-3B base
  • anysecret-io/anysecret-assistant/7B/ - CodeLlama-7B base
  • anysecret-io/anysecret-assistant/13B/ - CodeLlama-13B base

GGUF Models (Quantized)

  • anysecret-io/anysecret-assistant/3B-GGUF/ - Q4_K_M, Q8_0 formats
  • anysecret-io/anysecret-assistant/7B-GGUF/ - Q4_K_M, Q8_0 formats
  • anysecret-io/anysecret-assistant/13B-GGUF/ - Q4_K_M, Q8_0 formats

🎯 Model Description

These models are fine-tuned specifically to assist with AnySecret configuration management across AWS, GCP, Azure, and Kubernetes environments. Each model can help with CLI commands, configuration setup, CI/CD integration, and Python SDK usage.

  • Developed by: anysecret-io
  • Model type: Causal Language Model (LoRA Adapters + GGUF)
  • Language(s): English
  • License: MIT
  • Specialized for: Multi-cloud secrets and configuration management

πŸ“¦ Quick Start

Option 1: Using Ollama (Recommended for GGUF)

# 7B model (balanced performance)
ollama pull anysecret-io/anysecret-assistant/7B-GGUF
ollama run anysecret-io/anysecret-assistant/7B-GGUF

# 13B model (best quality)
ollama pull anysecret-io/anysecret-assistant/13B-GGUF
ollama run anysecret-io/anysecret-assistant/13B-GGUF
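
You can also call the pulled model from Python. A minimal sketch using the ollama Python package (pip install ollama); it assumes the local model tag matches the one pulled above:

import ollama

# Ask the locally pulled 7B GGUF model a question via the Ollama chat API.
response = ollama.chat(
    model="anysecret-io/anysecret-assistant/7B-GGUF",
    messages=[{"role": "user", "content": "How do I configure AnySecret for AWS?"}],
)
print(response["message"]["content"])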

Option 2: Using Transformers (PyTorch)

from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Choose your model size (3B/7B/13B)
model_size = "7B"  # or "3B", "13B"
base_models = {
    "3B": "meta-llama/Llama-3.2-3B-Instruct",
    "7B": "codellama/CodeLlama-7b-Instruct-hf",
    "13B": "codellama/CodeLlama-13b-Instruct-hf"
}

base_model_name = base_models[model_size]
adapter_path = f"anysecret-io/anysecret-assistant/{model_size}"

# Load model
base_model = AutoModelForCausalLM.from_pretrained(
    base_model_name,
    torch_dtype=torch.float16,
    device_map="auto"
)
model = PeftModel.from_pretrained(base_model, adapter_path)
tokenizer = AutoTokenizer.from_pretrained(base_model_name)

# Generate response
def ask_anysecret(question):
    prompt = f"### Instruction:\n{question}\n\n### Response:\n"
    # Move inputs to the model's device; enable sampling so temperature applies
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.1)
    response = tokenizer.decode(outputs[0], skip_special_tokens=True)
    return response.split("### Response:\n")[-1].strip()

# Example usage
print(ask_anysecret("How do I configure AnySecret for AWS?"))
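
If you plan to serve many requests, you can optionally merge the adapter into the base weights so inference no longer goes through PEFT's adapter indirection:

# Optional: fold the LoRA weights into the base model for faster inference.
model = model.merge_and_unload()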

Option 3: Using llama.cpp (GGUF)

# Download GGUF model
wget https://huggingface.co/anysecret-io/anysecret-assistant/resolve/main/7B-GGUF/anysecret-7b-q4_k_m.gguf

# Run with llama.cpp
./llama-server -m anysecret-7b-q4_k_m.gguf --port 8080
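
Once llama-server is running, it can be queried over HTTP. A minimal sketch against llama.cpp's native /completion endpoint (recent builds also expose an OpenAI-compatible /v1/chat/completions route); the prompt mirrors the instruction template used in the Transformers example:

import requests

question = "How do I configure AnySecret for AWS?"
payload = {
    "prompt": f"### Instruction:\n{question}\n\n### Response:\n",
    "n_predict": 256,      # max tokens to generate
    "temperature": 0.1,
}
r = requests.post("http://localhost:8080/completion", json=payload, timeout=120)
r.raise_for_status()
print(r.json()["content"])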

🎯 Use Cases

Direct Use

All models are designed to provide expert assistance with:

  • AnySecret CLI - Commands, usage patterns, troubleshooting
  • Multi-cloud Configuration - AWS Secrets Manager, GCP Secret Manager, Azure Key Vault
  • Kubernetes Integration - Secrets, ConfigMaps, operators
  • CI/CD Pipelines - GitHub Actions, Jenkins, GitLab CI
  • Python SDK - Implementation guidance, best practices
  • Security Patterns - Secret rotation, access controls, compliance

Example Queries

"How do I set up AnySecret with AWS Secrets Manager?"
"Show me how to use anysecret in a GitHub Actions workflow"
"How do I rotate secrets across multiple cloud providers?"
"What's the difference between storing secrets vs parameters?"
"How do I configure AnySecret for a Kubernetes deployment?"

πŸ—οΈ Training Details

Training Data

Models were trained on 150+ curated examples across 7 categories (a sketch of the likely record format follows the list):

  • CLI Commands (25 examples) - Command usage and patterns
  • AWS Configuration (25 examples) - Secrets Manager integration
  • GCP Configuration (25 examples) - Secret Manager setup
  • Azure Configuration (25 examples) - Key Vault integration
  • Kubernetes (25 examples) - Secrets and ConfigMaps
  • CI/CD Integration (15 examples) - Pipeline workflows
  • Python Integration (10 examples) - SDK usage patterns
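
The exact dataset schema is not published here, but given the inference prompt template, each record plausibly follows an Alpaca-style instruction/response pair; a hypothetical example:

# Hypothetical training record, inferred from the "### Instruction / ### Response"
# prompt template; the actual dataset schema may differ.
record = {
    "instruction": "How do I set up AnySecret with AWS Secrets Manager?",
    "response": "First configure AWS credentials, then ...",
}
text = f"### Instruction:\n{record['instruction']}\n\n### Response:\n{record['response']}"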

Training Configuration

Hyperparameters

  • LoRA Rank: 16
  • LoRA Alpha: 32
  • Learning Rate: 2e-4
  • Batch Size: 1 (with gradient accumulation)
  • Epochs: 2-3
  • Precision: fp16 mixed precision with 4-bit quantization

Target Modules

  • Llama-3.2 (3B): q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
  • CodeLlama (7B/13B): q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
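
Together, these settings map onto a PEFT + bitsandbytes setup roughly as follows. This is a sketch under stated assumptions, not the exact training script; gradient-accumulation steps and the output path are not listed above and are assumptions:

import torch
from peft import LoraConfig
from transformers import BitsAndBytesConfig, TrainingArguments

# 4-bit quantized base model with fp16 compute, per the precision settings above.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
)

# LoRA settings as listed: rank 16, alpha 32, and the stated target modules.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    task_type="CAUSAL_LM",
)

training_args = TrainingArguments(
    learning_rate=2e-4,
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,   # assumption; the card states only "gradient accumulation"
    num_train_epochs=3,              # the card lists 2-3 epochs
    fp16=True,
    output_dir="anysecret-assistant-lora",  # assumption
)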

πŸ”§ Model Selection Guide

Choose 3B if you need:

  • βœ… Fast inference (< 1 second)
  • βœ… Low memory usage (4-6GB)
  • βœ… Edge deployment
  • βœ… Basic AnySecret queries

Choose 7B if you need:

  • βœ… Balanced performance/speed
  • βœ… Better code understanding
  • βœ… Moderate memory (8-12GB)
  • βœ… Complex configuration queries

Choose 13B if you need:

  • βœ… Highest quality responses
  • βœ… Complex multi-step guidance
  • βœ… Advanced troubleshooting
  • βœ… Production deployments
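
To automate the choice, a trivial, hypothetical helper that maps available memory headroom to a model size, using the rough ranges quoted above:

# Hypothetical helper: pick a model size from available memory (GB),
# using the ranges in the selection guide above.
def pick_model_size(available_gb: float) -> str:
    if available_gb >= 16:
        return "13B"  # highest quality, 16-24GB
    if available_gb >= 8:
        return "7B"   # balanced, 8-12GB
    return "3B"       # fast / edge, 4-6GB

print(pick_model_size(12))  # -> 7B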

πŸš€ Deployment Options

Local Development

  • GGUF + Ollama: Easiest setup, good performance
  • PyTorch + GPU: Best quality, requires CUDA

Production Deployment

  • Docker + llama.cpp: Scalable, CPU/GPU support
  • Kubernetes: Auto-scaling, load balancing
  • Cloud APIs: Serverless, pay-per-use

Memory Requirements

Model | GGUF Q4_K_M | GGUF Q8_0 | PyTorch FP16
------|-------------|-----------|-------------
3B    | 2.3GB       | 3.2GB     | 6GB
7B    | 4.1GB       | 7.2GB     | 14GB
13B   | 7.8GB       | 13.8GB    | 26GB

πŸ” Framework Versions

  • PEFT: 0.17.1+
  • Transformers: 4.35.0+
  • PyTorch: 2.0.0+
  • llama.cpp: Latest
  • Ollama: 0.1.0+

πŸ“Š Performance Benchmarks

Model | Tokens/sec | Quality Score | Memory (GGUF Q4_K_M)
------|------------|---------------|---------------------
3B    | ~45        | 7.2/10        | 2.3GB
7B    | ~25        | 8.5/10        | 4.1GB
13B   | ~15        | 9.1/10        | 7.8GB

Benchmarks run on RTX 3090 with GGUF Q4_K_M quantization

βš–οΈ License

MIT License - See individual model folders for specific license details.


For support, visit our GitHub Issues or Documentation.
