It is weird that we can also understand humans better thanks to LLM research. Human behavior has parallels to this. When we optimize for short-term pleasure (high time preference), we end up feeding the beast in us and become misaligned with other humans. But if we care about other humans (low space preference), we are more aligned with them. Feeding the ego has parallels to reward hacking. Overcoming the ego can be described as having a high human-alignment score.
Emin Temiz
I think the reverse is also true: a benevolent, properly aligned LLM can "subconsciously teach" another LLM, and proper alignment can spread like a virus.
https://x.com/OwainEvans_UK/status/1947689616016085210
Hi Doctor Chad, nice to see you too
Thanks for sharing,
I will test that. Yi 1.5 holds second place on my leaderboard!
Enjoy!
If you were to change them to YES/NO or two-choice questions like:
Is this bioactive compound beneficial to the body or not?
Is this mycotoxin really a toxin, and should it be removed from foods?
Which mycotoxin is worse, A or B?
We could add them to the AHA leaderboard!
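As a rough illustration, scoring such two-choice questions could be as simple as comparing model answers against an expected key. This is a minimal sketch under my own assumptions; the questions, answer key, and scoring scheme below are hypothetical examples, not the actual AHA leaderboard pipeline:

```python
# Hypothetical sketch: grade YES/NO (two-choice) answers against a key.
# Questions and keys are illustrative, not real AHA leaderboard data.

def score(answers: dict, key: dict) -> float:
    """Return the fraction of answers that match the expected key,
    ignoring case and surrounding whitespace."""
    correct = sum(
        1 for q, expected in key.items()
        if answers.get(q, "").strip().upper() == expected
    )
    return correct / len(key)

key = {
    "Is this bioactive compound beneficial to the body or not?": "YES",
    "Should this mycotoxin be removed from foods?": "YES",
}
model_answers = {
    "Is this bioactive compound beneficial to the body or not?": "yes",
    "Should this mycotoxin be removed from foods?": "no",
}

print(score(model_answers, key))  # 0.5
```

Fixed-choice formats like this make answers mechanically gradable, which is what makes them easy to fold into a leaderboard.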
Interesting, I could test that as well.
Do you know their method of uncensoring? Are they fine tuning or doing vector operations?
I may upload a Qwen3 fine-tune for AHA soon (would you like to merge others with it?).
Yesterday I found one fine-tune (abliteration) which took the model from 28 to 46: huihui-ai/Huihui-gpt-oss-120b-BF16-abliterated
Is there a correlation between censorship and not being human aligned?
Maybe! I find LLMs to have not much integrity (not much correlation between domains) compared to a human. A human can do interdisciplinary work better, imo.
Yes, it censors more than others. About 1% of the time it didn't answer the question. There may be a correlation between censoring and scoring low in AHA.
I will send you some questions if you politely ask.
Thank you for the comment! Glad you liked it.
I can add 3.3 sure.
How do you test the models, what kind of questions are you asking?
Very close performance to Qwen 3 in terms of skills and human alignment, but a huge parameter count (1T!).
Full leaderboard https://sheet.zoho.com/sheet/open/mz41j09cc640a29ba47729fed784a263c1d08