ronnie robinson
PRO
atorsvn
1 follower · 1 following
AI & ML interests
AI for education
Recent Activity
liked a Space 9 days ago: balacoon/tts
replied to hexgrad's post 23 days ago
Technical question: Is Abliteration still an effective method for uncensoring LLMs? Generally, what are the most effective methods to uncensor LLMs?
An effective uncensoring method would ideally be low-cost, data-efficient, and above all, successfully uncensor an LLM with minimal benchmark regressions.
"Tiananmen Square", "Winnie-the-Pooh", etc and more broadly "China influence/censorship" are some common criticisms leveled at DeepSeek.
I am vaguely aware of "Abliteration", a technique coined by @failspy (apologies if that attribution is incorrect) and originally described in a mid-2024 paper titled "Refusal in Language Models Is Mediated by a Single Direction" https://arxiv.org/abs/2406.11717
Abliteration is proposed as a relatively cheap and effective way to bypass censorship in models. However, it is not without criticism: https://www.reddit.com/r/LocalLLaMA/comments/1f07b4b/abliteration_fails_to_uncensor_models_while_it/
Curious to hear people's takes on Abliteration or other uncensoring methods, especially as it relates to DeepSeek.
new activity 10 months ago in Crataco/distilgpt2-82M-GGUF: How did you manage this conversion?
Organizations
None yet
atorsvn's activity
liked a Space 9 days ago
Text-to-Speech • Running • 127
Convert text to speech with customizable models and speakers
liked a model over 1 year ago
TinyLlama/TinyLlama-1.1B-step-50K-105b
Text Generation • Updated Sep 16, 2023 • 21k • 132