Adzuki
melt-adzuki
0 followers · 3 following
AI & ML interests
None yet
Recent Activity
liked a dataset about 1 month ago: open-thoughts/OpenThoughts-114k
liked a Space about 1 month ago: autotrain-projects/autotrain-advanced
reacted to Elizezen's post with 🤯 about 1 month ago
It turns out that the following simple method is effective when you want to increase the appearance probability of only one token or a very limited number of tokens.

```
one_token = "♡"  # Token whose appearance probability should increase
value = 1000000  # Number of times the token is repeated in the file
token = one_token * value

with open("one-token.txt", "w", encoding="utf-8") as f:
    f.write(token)
```

By training a LoRA with unsloth on the .txt file generated by the code above, you can increase the appearance probability of specific tokens while largely maintaining the model's performance. However, it is better to stop training before the train loss reaches 0.0; otherwise the model will start spamming the token once it appears even once. In general, you can stop training at a very early stage and it will still work.

It is also possible to reduce the appearance probability of specific tokens. Create an over-trained LoRA on the tokens you want to reduce, merge it into the model, extract only the difference using the chat vector method, and subtract that difference from an arbitrary model. In this case, it is better to scale the chat vector by a ratio of about five. Apart from the specific tokens, this has very little effect on overall performance.

```
new_v = v - (5.0 * chat_vector[i].to(v.device))
```
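The extract-and-subtract procedure described in the post can be sketched roughly as follows. This is a minimal illustration, not the poster's actual script. The function names `extract_chat_vector` and `subtract_chat_vector` are hypothetical, and so is the assumption that weights are held in dictionaries keyed by parameter name (the post indexes `chat_vector` by `i` instead). The arithmetic works on any values that support subtraction and scalar multiplication, so plain floats stand in for weight tensors here. With real tensors you would also move the chat vector to the target weight's device, as the line in the post does with `.to(v.device)`.

```python
def extract_chat_vector(base_state, merged_state):
    """Per-parameter difference: (base model with LoRA merged) minus (base model).

    Both arguments are assumed to be dicts mapping parameter names to weights.
    """
    return {name: merged_state[name] - base_state[name] for name in base_state}


def subtract_chat_vector(target_state, chat_vector, ratio=5.0):
    """Subtract the scaled difference from a target model's weights.

    The default ratio of 5.0 follows the post's suggestion. Parameters with
    no entry in chat_vector are copied through unchanged.
    """
    new_state = {}
    for name, v in target_state.items():
        if name in chat_vector:
            new_state[name] = v - ratio * chat_vector[name]
        else:
            new_state[name] = v
    return new_state


if __name__ == "__main__":
    # Toy scalar "weights" standing in for tensors (hypothetical values).
    base = {"w": 1.0}
    merged = {"w": 1.25}  # base after merging the over-trained LoRA
    chat_vector = extract_chat_vector(base, merged)
    print(subtract_chat_vector({"w": 2.0}, chat_vector))
```

Keeping `ratio` as a parameter makes it easy to try values around the suggested five and check how strongly the target tokens are suppressed.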
Organizations
models
3
melt-adzuki/DeepSeek-R1-Distill-Qwen-14B-Japanese-Q8-mlx • Text Generation • Updated Jan 30 • 27
melt-adzuki/DeepSeek-R1-Distill-Qwen-14B-Japanese-Q4-mlx • Text Generation • Updated Jan 30 • 30
melt-adzuki/DeepSeek-R1-Distill-Qwen-14B-Japanese-Q6-mlx • Text Generation • Updated Jan 30 • 22
datasets
None public yet