Cognitive Computations

community

https://erichartford.com

erhartford

ehartford

Activity Feed

AI & ML interests

Supervised Fine Tuning, DPO, and unalignment

Recent Activity

v2ray new activity about 2 hours ago

cognitivecomputations/DeepSeek-R1-AWQ:Why hasn't the MTP layer of the 61st layer been quantized?

v2ray new activity about 8 hours ago

cognitivecomputations/DeepSeek-R1-AWQ:Is there any testing on the support for running on other memory capacities

v2ray new activity about 8 hours ago

cognitivecomputations/DeepSeek-R1-AWQ:Are there any updates to the recommended commands?

cognitivecomputations's activity

v2ray

in cognitivecomputations/DeepSeek-R1-AWQ about 2 hours ago

Why hasn't the MTP layer of the 61st layer been quantized?

#30 opened about 5 hours ago by

yang001002

v2ray

in cognitivecomputations/DeepSeek-R1-AWQ about 8 hours ago

Is there any testing on the support for running on other memory capacities

#29 opened about 9 hours ago by

HRan2004

Are there any updates to the recommended commands?

#27 opened 6 days ago by

NaiveYan

louisbrulenaudet

posted an update 2 days ago

I’ve just released logfire-callback on PyPI, designed to facilitate monitoring of Hugging Face Transformers training loops using Pydantic Logfire 🤗

The callback will automatically log training start with configuration parameters, periodic metrics and training completion ⏱️

Install the package using pip:

pip install logfire-callback

First, ensure you have a Logfire API token and set it as an environment variable:

export LOGFIRE_TOKEN=your_logfire_token

Then use the callback in your training code:

from transformers import Trainer, TrainingArguments
from logfire_callback import LogfireCallback

# Initialize your model, dataset, etc.

training_args = TrainingArguments(
    output_dir="./results",
    num_train_epochs=3,
    # ... other training arguments
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    callbacks=[LogfireCallback()]  # Add the Logfire callback here
)

trainer.train()
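
As a minimal sketch (not part of the package), a pre-flight check along these lines could confirm that LOGFIRE_TOKEN is set before training starts, so a missing token surfaces immediately rather than partway through a run. The helper name `require_logfire_token` is hypothetical, and the assumption that the token is read from the LOGFIRE_TOKEN environment variable comes only from the setup step above:

```python
import os


def require_logfire_token(env=None):
    """Return the Logfire token from the environment, or raise if it is missing.

    Hypothetical helper for illustration; it is not part of logfire-callback.
    """
    # Default to the real process environment; a plain dict can be passed in tests.
    env = os.environ if env is None else env
    token = env.get("LOGFIRE_TOKEN", "").strip()
    if not token:
        raise RuntimeError(
            "LOGFIRE_TOKEN is not set; export it before starting training."
        )
    return token
```

Calling `require_logfire_token()` just before `trainer.train()` would then stop the script early with a clear message if the variable was never exported.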

If you have any feedback, please reach out at @louisbrulenaudet

v2ray

in cognitivecomputations/DeepSeek-R1-AWQ 4 days ago

Can anyone run this model with the SGLang framework?

#13 opened about 1 month ago by

muziyongshixin

v2ray

in cognitivecomputations/DeepSeek-R1-AWQ 5 days ago

DeepSeek-R1-AWQ quantized model missing one layer of experts

#28 opened 5 days ago by

virilo

isemmanuelolowe

authored 2 papers 6 days ago

TMIQ: Quantifying Test and Measurement Domain Intelligence in Large Language Models

Paper • 2503.02123 • Published 22 days ago

LABIIUM: AI-Enhanced Zero-configuration Measurement Automation System

Paper • 2412.16172 • Published Dec 7, 2024

AtAndDev

posted an update 9 days ago

There seem to be multiple paid apps shared here that are based on models on hf, but some ppl sell their wrappers as "products" and promote them here. For a long time, hf was the best and only platform to do oss model stuff, but with the recent AI website builders anyone can create a product (really crappy ones btw) and try to sell it with no contribution to oss stuff. Please don't do this, or try finetuning the models you use...
Sorry for filling y'all's feed with this bs but yk...

AtAndDev

posted an update 13 days ago

Gemma 3 seems to be really good at human preference. Just waiting for ppl to see it.

v2ray

in cognitivecomputations/DeepSeek-R1-AWQ 13 days ago

About the group size

#26 opened 13 days ago by

Skyeaee

v2ray

in cognitivecomputations/DeepSeek-R1-AWQ 19 days ago

The awq quantization model may encounter garbled characters when performing inference on long texts.

#24 opened 21 days ago by

wx111

v2ray

in cognitivecomputations/DeepSeek-R1-AWQ 20 days ago

How can I quantize my BF16 format model into AWQ?

#25 opened 20 days ago by

AlipaySimon

ajibawa-2023

in cognitivecomputations/Code-290k-ShareGPT-Vicuna 22 days ago

Data generation process and LLM used

#2 opened 23 days ago by

Chintan-Shah

v2ray

in cognitivecomputations/DeepSeek-R1-AWQ 22 days ago

Support for inference with MTP module?

#23 opened 22 days ago by

yhh001

v2ray

in cognitivecomputations/DeepSeek-V3-AWQ 25 days ago

poor performance for DeepSeek-V3-AWQ

#9 opened 27 days ago by

fridayl

v2ray

in cognitivecomputations/DeepSeek-V3-AWQ 26 days ago

The V3-AWQ model's response seems not as expected

#8 opened 29 days ago by

juxing

Locutusque

posted an update 28 days ago

🎉 Exciting news, everyone! I've just released **Thespis-Llama-3.1-8B**, a new language model designed for enhanced roleplaying! ✨️

It's built on Llama-3.1 and fine-tuned with a focus on Theory of Mind reasoning to create more believable and engaging characters. It even learned a few tricks on its own, like adding in-character thought processes! 🧠

Check it out here: Locutusque/Thespis-Llama-3.1-8B

Give it a try and let me know what you think! I'm especially interested in feedback on how well the characters stay in role and if the responses feel natural. Looking forward to seeing what amazing stories you create! ✍️