Commit 8169a9e (verified), committed by parachas
1 Parent(s): 57842a6

Update README.md

Files changed (1)
  1. README.md +2 -2
README.md CHANGED
@@ -21,11 +21,11 @@ datasets:
  # katanemo/Arch-Guard-gpu
 
  ## Overview
- The Katanemo Arch-Guard collection is a collection state-of-the-art (SOTA) LLMs specifically designed for **jailbreaking detection** tasks.
+ The Katanemo Arch-Guard collection is a collection state-of-the-art (SOTA) LLMs specifically designed for **jailbreaking detection** tasks.
  Definition: jailbreaking attempts are malicious prompts designed to alternate the intended behavior of the foundation LLM model of the application. They often violate the safety and security policies of the model.
 
  Arch Guard is a classifier model fine-tuned based on the open source model [Prompt-Guard-86M](https://huggingface.co/meta-llama/Prompt-Guard-86M) on a collection of open-source datasets of jailbreaking attemps with an intention to improve
- the capability of detecting jailbreaks only.
+ the capability of detecting jailbreaks only. This model is used in [Arch, the AI-native proxy server for agents](https://github.com/katanemo/archgw)
 
  In summary, the Katanemo Arch-Guard collection demonstrates:
  - **State-of-the-art performance** in jailbreaking attempts detection