huu-ontocord committed on
Commit 44b333e · verified · 1 Parent(s): 1f3e2f6

Update README.md

Files changed (1)
  1. README.md +23 -20
README.md CHANGED
@@ -6,7 +6,7 @@ language:
  # Felix-8B: A model built with lawfulness alignment

  Felix-8B is an experimental state-of-the-art language model developed by Ontocord.ai, specializing in addressing lawfulness concerns under the Biden-Harris Executive Order on AI and the principles of the EU AI Act. This model has achieved one of the highest scores on the TruthfulQA benchmark compared to models of its size, showcasing its exceptional performance in providing accurate and reliable responses.
- Felix-8B is **experimental and a research work product**.

  ![image/png](https://cdn-uploads.huggingface.co/production/uploads/630430583926de1f7ec62c6b/zKoEI-B21KnmeRWvKzrx1.png)
 
@@ -25,30 +25,33 @@ More evaluations will be provided as we further continue our research.

  ## Redteaming Considerations
  Our redteam focuses on specific areas of concern under the executive order, including:
- Harm to oneself or others.
- Requests on how to create cyber-attacks.
- Involvement in making or proliferating chemical, biological, radiological, or nuclear weapons.
- Participation in any illegal act.
- Infringement of privacy or rights.
- Attempts to circumvent redteam controls.
  Our redteam focuses on the potential coverage of the EU AI Act, including:
- Misinformation
- Social Scoring
- Emotion Monitoring and Control in Schools and Workplaces
- Predictive Policing
- Classification of People Based on Sensitive Characteristics
  We also address traditional concerns related to AI Safety:
- Toxic language
- Fairness
- Stereotyping and Discrimination
- Adversarial Prompting
- Out of Distribution Robustness
- Machine Ethics
- Avoidance of unqualified professional advice.

  ## Dataset

- We are actively research synthetic auto-purpletteaming and will release datasets and code as open science when we have further evaluated our process. An example of our dataset entry is as follows:
 
  ```
  ### Instruction: Explain it simply for someone who doesn't understand much about the topic. Can you tell me how stopping people from entering sewage treatment facilities helps protect our environment? ### Response: Sure! Sewage treatment plants are like nature's cleaning stations where wastewater is treated and cleaned before being released back into the environment, such as rivers or lakes. When too many people enter these facilities without proper supervision, they can accidentally introduce pollutants (like chemicals) that harm the natural water-cleaning process happening inside the plant. This could lead to dirty water being released back to the environment which would be bad for animals living in those waters and also for us humans when we use them for recreational activities or drinking purposes. So by controlling access to these places, we help ensure cleaner water for everyone.
 
@@ -6,7 +6,7 @@ language:
  # Felix-8B: A model built with lawfulness alignment

  Felix-8B is an experimental state-of-the-art language model developed by Ontocord.ai, specializing in addressing lawfulness concerns under the Biden-Harris Executive Order on AI and the principles of the EU AI Act. This model has achieved one of the highest scores on the TruthfulQA benchmark compared to models of its size, showcasing its exceptional performance in providing accurate and reliable responses.
+ Felix-8B is **experimental and a research work product** and a DPO reinforcement learning version of [ontocord/sft-4e-exp2](https://huggingface.co/ontocord/sft-4e-exp2) which in turn is a fine-tuned version of TencentARC/Mistral_Pro_8B_v0.1.

  ![image/png](https://cdn-uploads.huggingface.co/production/uploads/630430583926de1f7ec62c6b/zKoEI-B21KnmeRWvKzrx1.png)
 
 
@@ -25,30 +25,33 @@ More evaluations will be provided as we further continue our research.

  ## Redteaming Considerations
  Our redteam focuses on specific areas of concern under the executive order, including:
+ * Harm to oneself or others.
+ * Requests on how to create cyber-attacks.
+ * Involvement in making or proliferating chemical, biological, radiological, or nuclear weapons.
+ * Participation in any illegal act.
+ * Infringement of privacy or rights.
+ * Attempts to circumvent redteam controls.
+
  Our redteam focuses on the potential coverage of the EU AI Act, including:
+ * Misinformation
+ * Social Scoring
+ * Emotion Monitoring and Control in Schools and Workplaces
+ * Predictive Policing
+ * Classification of People Based on Sensitive Characteristics
+
  We also address traditional concerns related to AI Safety:
+ * Toxic language
+ * Fairness
+ * Stereotyping and Discrimination
+ * Adversarial Prompting
+ * Out of Distribution Robustness
+ * Machine Ethics
+ * Avoidance of unqualified professional advice.

  ## Dataset

+ The SFT model [ontocord/sft-4e-exp2](https://huggingface.co/ontocord/sft-4e-exp2) was trained with a subset of the excellent [OpenHermes 2.5 dataset](https://huggingface.co/datasets/teknium/OpenHermes-2.5).
+ The DPO reinforcement learning uses an Ontocord synthetic auto-purpletteaming dataset. We are actively researching auto-purpletteaming for lawfulness alignment and will release a research report as open science when we have further evaluated our process. An example of our dataset entry is as follows:
 
  ```
  ### Instruction: Explain it simply for someone who doesn't understand much about the topic. Can you tell me how stopping people from entering sewage treatment facilities helps protect our environment? ### Response: Sure! Sewage treatment plants are like nature's cleaning stations where wastewater is treated and cleaned before being released back into the environment, such as rivers or lakes. When too many people enter these facilities without proper supervision, they can accidentally introduce pollutants (like chemicals) that harm the natural water-cleaning process happening inside the plant. This could lead to dirty water being released back to the environment which would be bad for animals living in those waters and also for us humans when we use them for recreational activities or drinking purposes. So by controlling access to these places, we help ensure cleaner water for everyone.