metadata

base_model: Spestly/Atlas-Pro-7B-Preview-1M
datasets:
  - prithivMLmods/PyCodeZone
  - bespokelabs/Bespoke-Stratos-17k
  - openai/gsm8k
  - rubenroy/GammaCorpus-v1-50k-UNFILTERED
language:
  - en
  - zh
  - fr
  - es
  - pt
  - de
  - it
  - ru
  - ja
  - ko
  - vi
  - th
  - ar
  - fa
  - he
  - tr
  - cs
  - pl
  - hi
  - bn
  - ur
  - id
  - ms
  - lo
  - my
  - ceb
  - km
  - tl
  - nl
library_name: transformers
license: mit
quantized_by: mradermacher
tags:
  - text-generation-inference
  - transformers
  - unsloth
  - qwen2
  - trl
extra_gated_prompt: >-
  By accessing this model, you agree to comply with ethical usage guidelines and
  accept full responsibility for its applications. You will not use this model
  for harmful, malicious, or illegal activities, and you understand that the
  model's use is subject to ongoing monitoring for misuse. This model is
  provided 'AS IS' and agreeing to this means that you are responsible for all
  the outputs generated by you
extra_gated_fields:
  Name: text
  Organization: text
  Country: country
  Date of Birth: date_picker
  Intended Use:
    type: select
    options:
      - Research
      - Education
      - Personal Development
      - Commercial Use
      - label: Other
        value: other
  I agree to use this model in accordance with all applicable laws and ethical guidelines: checkbox
  I agree to use this model under the MIT licence: checkbox

Spestly/Atlas-Pro-7B-Preview-1M

Model Overview

Atlas-Pro-7B-Preview-1M is a fine-tuned version of the Qwen2.5-7B-Instruct-1M model, tuned for general-purpose question answering and reasoning tasks. It focuses on delivering clear, concise answers while maintaining a natural, conversational tone. Subtle grammatical imperfections are incorporated deliberately to make interactions feel more relatable and human-like.

Key Features:

Enhanced Reasoning Capabilities: Fine-tuning has improved the model's ability to handle reasoning-focused questions with better accuracy and depth.
Humanized Interaction: Subtle grammar imperfections are included intentionally to emulate a more human-like conversational experience.
Improved QA Performance: Extensive training has refined the model's ability to respond to questions accurately and contextually.

Model Details

Base Model: Qwen/Qwen2.5-7B-Instruct-1M
Fine-Tuning Dataset: A carefully curated mix of instructional and conversational data, designed to improve reasoning and question-answering performance.
Parameter Count: 7 billion (7B)
Architecture: Transformer-based, leveraging the Qwen2.5 architecture for high efficiency and accuracy.
Context Window: 1 Million Tokens

Training Procedure

The model was fine-tuned using the following strategies:

Dataset Quality: A diverse dataset drawn from both public and private sources was selected, with a focus on improving reasoning and conversational understanding.
Humanization: Data augmentation techniques were employed to add slight grammar imperfections, mimicking human language patterns.
Optimization: Training was conducted using mixed-precision techniques to ensure efficiency without compromising performance.

Limitations

While the model excels in reasoning and answering questions, it:

May produce occasional inaccuracies if provided with ambiguous or incomplete queries.
Does not specialize in niche technical domains or highly specific knowledge areas outside its training data.
May occasionally show its intentional grammatical imperfections in unintended contexts.

Usage

The model can be used for:

Interactive chatbots with a humanized tone.
General-purpose reasoning and question-answering tasks.
Personal assistant tools designed for natural communication.

Example Usage

Basic: Ollama + LM Studio

I recommend using LM Studio. Alternatively, you can run the model via Ollama with this command:

ollama run hf.co/Spestly/Atlas-Pro-7B-Preview-1M-GGUF:IQ4_XS

Remember to replace the tag at the end with the quant you want to use.
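As a small illustration of swapping the tag, the sketch below builds the Ollama command for a given quant. The helper name `ollama_command` is hypothetical, the tag list is copied from the Provided Quants table in this README, and whether every tag actually resolves through Ollama has not been verified here.

```python
# Repository reference as used in the Ollama command above.
REPO = "hf.co/Spestly/Atlas-Pro-7B-Preview-1M-GGUF"

# Quant tags copied from the "Provided Quants" table in this README.
AVAILABLE_QUANTS = (
    "Q2_K", "Q3_K_S", "Q3_K_M", "Q3_K_L", "IQ4_XS", "Q4_K_S",
    "Q4_K_M", "Q5_K_S", "Q5_K_M", "Q6_K", "Q8_0", "f16",
)


def ollama_command(quant: str) -> str:
    """Return the `ollama run` command for one of the listed quants.

    Hypothetical helper for illustration; it only formats a string and
    does not check that the tag exists on the Hub.
    """
    if quant not in AVAILABLE_QUANTS:
        raise ValueError(f"Unknown quant {quant!r}; expected one of {AVAILABLE_QUANTS}")
    return f"ollama run {REPO}:{quant}"
```

For example, `ollama_command("Q4_K_M")` yields the same command as above with `Q4_K_M` in place of `IQ4_XS`.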

Advanced: TGI (Text Generation Inference):

WARNING: Only use this method if you have experience using TGI.

First, start the TGI server with this command (make sure you have Docker installed):

# Deploy with docker on Linux:
docker run --gpus all \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    -e HF_TOKEN="<secret>" \
    -p 8000:80 \
    ghcr.io/huggingface/text-generation-inference:latest \
    --model-id Spestly/Atlas-Pro-7B-Preview-1M-GGUF

You can now call the server you just deployed:

# Call the server using curl:
curl -X POST "http://localhost:8000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "Spestly/Atlas-Pro-7B-Preview-1M-GGUF",
        "messages": [
            {"role": "user", "content": "What is the capital of France?"}
        ]
    }'
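The same request can also be sent from Python. The following is a minimal sketch using only the standard library, assuming the server from the Docker command is listening on `localhost:8000` and returns an OpenAI-style chat completion body (`choices[0].message.content`). The function names `build_chat_request` and `ask` are hypothetical, not part of any TGI client.

```python
import json
import urllib.request

# Endpoint and model ID taken from the curl example above.
TGI_URL = "http://localhost:8000/v1/chat/completions"
MODEL_ID = "Spestly/Atlas-Pro-7B-Preview-1M-GGUF"


def build_chat_request(prompt: str, url: str = TGI_URL) -> urllib.request.Request:
    """Build the POST request that mirrors the curl call (no network access)."""
    payload = {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt: str) -> str:
    """Send the prompt to the running server and return the reply text.

    Assumes an OpenAI-style response body; adjust the indexing if your
    server returns a different shape.
    """
    with urllib.request.urlopen(build_chat_request(prompt), timeout=60) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]
```

With the server running, `print(ask("What is the capital of France?"))` sends the same question as the curl example.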

Provided Quants

(sorted by size, not necessarily quality. IQ-quants are often preferable over similarly sized non-IQ quants)

Link	Type	Size/GB	Notes
GGUF	Q2_K	3.1
GGUF	Q3_K_S	3.6
GGUF	Q3_K_M	3.9	lower quality
GGUF	Q3_K_L	4.2
GGUF	IQ4_XS	4.4
GGUF	Q4_K_S	4.6	fast, recommended
GGUF	Q4_K_M	4.8	fast, recommended
GGUF	Q5_K_S	5.4
GGUF	Q5_K_M	5.5
GGUF	Q6_K	6.4	very good quality
GGUF	Q8_0	8.2	fast, best quality
GGUF	f16	15.3	16 bpw, overkill
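One way to use the table is to pick the largest quant whose file fits a given size budget. The sketch below does only that: the sizes are copied from the table, the helper name `largest_quant_within` is hypothetical, and file size is treated as a rough lower bound, since actual memory use also depends on context length and runtime overhead. As the note above the table says, size order is not necessarily quality order.

```python
from typing import Optional

# File sizes in GB, copied from the "Provided Quants" table.
QUANT_SIZES_GB = {
    "Q2_K": 3.1, "Q3_K_S": 3.6, "Q3_K_M": 3.9, "Q3_K_L": 4.2,
    "IQ4_XS": 4.4, "Q4_K_S": 4.6, "Q4_K_M": 4.8, "Q5_K_S": 5.4,
    "Q5_K_M": 5.5, "Q6_K": 6.4, "Q8_0": 8.2, "f16": 15.3,
}


def largest_quant_within(budget_gb: float) -> Optional[str]:
    """Return the largest listed quant whose file size fits the budget.

    Returns None when even the smallest quant is too large. This compares
    file sizes only; it does not estimate runtime memory.
    """
    fitting = {name: size for name, size in QUANT_SIZES_GB.items() if size <= budget_gb}
    if not fitting:
        return None
    return max(fitting, key=fitting.get)
```

Leaving some headroom in the budget is sensible, for example passing 5.0 for a device with roughly 6 GB free.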

Community

We encourage feedback and contributions from the community. Please report any issues or suggest improvements via the model’s Hugging Face page.

License: MIT

Contact: For questions or collaboration opportunities, please reach out via Hugging Face.