ronnie robinson PRO

atorsvn

AI & ML interests

AI for education

Recent Activity

liked a Space 9 days ago

balacoon/tts

replied to hexgrad's post 23 days ago

Technical question: Is Abliteration still an effective method for uncensoring LLMs? More generally, what are the most effective methods to uncensor LLMs?

An effective uncensoring method would ideally be low-cost, data-efficient, and above all, successful at uncensoring an LLM with minimal benchmark regressions.

"Tiananmen Square", "Winnie-the-Pooh", etc., and more broadly "China influence/censorship", are some common criticisms leveled at DeepSeek.

I am vaguely aware of "Abliteration", a term coined by @failspy (apologies if that attribution is incorrect) for a technique originally described in a mid-2024 paper titled "Refusal in Language Models Is Mediated by a Single Direction": https://arxiv.org/abs/2406.11717

Abliteration is proposed as a relatively cheap and effective way to bypass censorship in models. However, it is not without criticism: https://www.reddit.com/r/LocalLLaMA/comments/1f07b4b/abliteration_fails_to_uncensor_models_while_it/

Curious to hear people's takes on Abliteration or other uncensoring methods, especially as they relate to DeepSeek.

new activity 10 months ago

Crataco/distilgpt2-82M-GGUF: How did you manage this conversion?

Organizations

None yet

atorsvn's activity

liked a Space 9 days ago

Text-to-Speech

Convert text to speech with customizable models and speakers

liked a model over 1 year ago

TinyLlama/TinyLlama-1.1B-step-50K-105b

Text Generation • Updated Sep 16, 2023 • 21k • 132