Layer-Level Self-Exposure and Patch: Affirmative Token Mitigation for Jailbreak Attack Defense Paper • 2501.02629 • Published Jan 5
Min-K%++: Improved Baseline for Detecting Pre-Training Data from Large Language Models Paper • 2404.02936 • Published Apr 3, 2024 • 3