The models of the paper "X-Boundary: Establishing Exact Safety Boundary to Shield LLMs from Multi-Turn Jailbreaks without Compromising Usability".
Xiaoya Lu
Ursulalala
AI & ML interests
None yet
Recent Activity
upvoted a paper 16 days ago
SSRL: Self-Search Reinforcement Learning
upvoted a paper about 1 month ago
Frontier AI Risk Management Framework in Practice: A Risk Analysis Technical Report
upvoted a paper about 1 month ago
The Devil behind the mask: An emergent safety vulnerability of Diffusion LLMs