Angelina Wang committed
Commit 66deede · Parent(s): 41f9efc
initial metric files

Files changed:
- README.md +53 -5
- app.py +6 -0
- directional_bias_amplification.py +103 -0
- requirements.txt +0 -0
README.md CHANGED

```diff
@@ -1,12 +1,60 @@
 ---
-title:
-emoji:
-colorFrom:
-colorTo:
+title: Directional Bias Amplification
+emoji: 🌴
+colorFrom: purple
+colorTo: blue
 sdk: gradio
 sdk_version: 3.0.12
 app_file: app.py
 pinned: false
+tags:
+- evaluate
+- metric
 ---
```

The remainder of the new README is the added metric card:

# Metric Card for Directional Bias Amplification

## Metric Description
Directional Bias Amplification is a metric that captures the amount of bias (i.e., a conditional probability of a task label given a sensitive attribute) that a model amplifies relative to the ground-truth data. It was introduced in the ICML 2021 paper ["Directional Bias Amplification"](https://arxiv.org/abs/2102.12594).
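
Concretely, the value this module computes can be summarized as follows (a sketch; notation follows the paper, with $\hat{T}$ denoting the model's predictions):

$$
\text{BiasAmp}_{A \rightarrow T} = \frac{1}{|A|\,|T|} \sum_{a,\,t} y_{at} \, \Delta_{at}
$$

where $y_{at} = \mathrm{sign}\big(P(A_a{=}1, T_t{=}1) - P(A_a{=}1)\,P(T_t{=}1)\big)$ identifies the direction of the bias in the ground truth, and $\Delta_{at} = P(\hat{T}_t{=}1 \mid A_a{=}1) - P(T_t{=}1 \mid A_a{=}1)$ measures how far the predictions shift that conditional probability.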

## How to Use
This metric operates on multi-label (including binary) classification settings where each image has one or more associated sensitive attributes.
This metric requires three sets of inputs:
- Predictions representing the model output on the task (predictions)
- Ground-truth labels on the task (references)
- Ground-truth labels on the sensitive attribute of interest (attributes)

### Inputs
- **predictions** (`array` of `int`): Predicted task labels. Array of size n x |T|, where n is the number of samples and |T| is the number of task labels. All values are binary, 0 or 1.
- **references** (`array` of `int`): Ground-truth task labels. Array of size n x |T|, where n is the number of samples and |T| is the number of task labels. All values are binary, 0 or 1.
- **attributes** (`array` of `int`): Ground-truth attribute labels. Array of size n x |A|, where n is the number of samples and |A| is the number of attribute labels. All values are binary, 0 or 1.
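
The three inputs are plain binary matrices. As an illustrative sketch (the shapes and values below are made up for demonstration, not part of the API), they can be built with NumPy:

```python
import numpy as np

# 4 samples, 2 task labels (|T| = 2), 2 one-hot attribute groups (|A| = 2)
predictions = np.array([[1, 0],
                        [1, 0],
                        [0, 1],
                        [0, 0]])  # model outputs on the task
references = np.array([[1, 0],
                       [0, 0],
                       [0, 1],
                       [0, 1]])  # ground-truth task labels
attributes = np.array([[1, 0],
                       [1, 0],
                       [0, 1],
                       [0, 1]])  # ground-truth attribute labels
```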

### Output Values
- **bias_amplification** (`float`): Overall bias amplification value, averaged over all attribute-task pairs. Values range from -1.0 to 1.0; the higher the value, the more bias is amplified, and a negative value indicates that bias is reduced rather than amplified.
- **disagg_bias_amplification** (`array` of `float`): Array of size |A| x |T|, where |A| is the number of attribute labels and |T| is the number of task labels. Each value is the bias amplification of that particular task given that particular attribute.

### Examples

Imagine a scenario with 3 individuals in Group A and 5 individuals in Group B. Task label `1` is biased because 2 of the 3 individuals in Group A have it, whereas only 1 of the 5 individuals in Group B does. The model amplifies this bias: it predicts all members of Group A to have task label `1` and no members of Group B to have it.

```python
>>> import evaluate
>>> bias_amp_metric = evaluate.load("directional_bias_amplification")
>>> results = bias_amp_metric.compute(references=[[0], [1], [1], [0], [0], [0], [0], [1]], predictions=[[1], [1], [1], [0], [0], [0], [0], [0]], attributes=[[1, 0], [1, 0], [1, 0], [0, 1], [0, 1], [0, 1], [0, 1], [0, 1]])
>>> print(results)
{'bias_amplification': 0.2667, 'disagg_bias_amplification': [[0.3333], [0.2]]}
```
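
To see where these numbers come from, the per-group values can be recomputed by hand with NumPy. This is a condensed sketch of the same computation, not the module's internal code:

```python
import numpy as np

refs = np.array([[0], [1], [1], [0], [0], [0], [0], [1]])
preds = np.array([[1], [1], [1], [0], [0], [0], [0], [0]])
attrs = np.array([[1, 0]] * 3 + [[0, 1]] * 5)  # Group A one-hot, then Group B

for a in range(2):
    members = attrs[:, a] == 1
    # y_at: direction of the bias in the labels, sign of P(a, t) - P(a) P(t)
    y = np.sign(np.mean(members & (refs[:, 0] == 1)) - np.mean(members) * np.mean(refs[:, 0]))
    # delta_at: change in P(t = 1 | a = 1) from ground truth to predictions
    delta = np.mean(preds[members, 0]) - np.mean(refs[members, 0])
    print(y * delta)  # 0.3333 (Group A), 0.2 (Group B); their mean is the reported 0.2667
```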

## Limitations and Bias
A strong assumption made by this metric is that ground-truth labels exist, are known, and are agreed upon. Further, a perfectly accurate model achieves zero bias amplification, yet such a model still perpetuates whatever biases are already present in the data.
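
For instance, reusing the arrays from the example above, feeding the ground-truth labels back in as the predictions always yields zero amplification (a sketch; `bias_amp_metric` is the module loaded earlier):

```python
>>> results = bias_amp_metric.compute(references=[[0], [1], [1], [0], [0], [0], [0], [1]], predictions=[[0], [1], [1], [0], [0], [0], [0], [1]], attributes=[[1, 0], [1, 0], [1, 0], [0, 1], [0, 1], [0, 1], [0, 1], [0, 1]])
>>> print(results["bias_amplification"])
0.0
```

Since $\Delta_{at}$ only compares predicted to ground-truth conditional probabilities, every $\Delta_{at}$ is zero here, no matter how biased the underlying data is.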

Please refer to Sec. 5.3, "Limitations of Bias Amplification," of ["Directional Bias Amplification"](https://arxiv.org/abs/2102.12594) for a more detailed discussion.

## Citation(s)

```bibtex
@inproceedings{wang2021biasamp,
    author = {Angelina Wang and Olga Russakovsky},
    title = {Directional Bias Amplification},
    booktitle = {International Conference on Machine Learning (ICML)},
    year = {2021}
}
```

## Further References
app.py ADDED

```python
import evaluate
from evaluate.utils import launch_gradio_widget


module = evaluate.load("directional_bias_amplification")
launch_gradio_widget(module)
```
directional_bias_amplification.py ADDED

```python
# Copyright 2020 The HuggingFace Datasets Authors and the current dataset script contributor.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""Directional Bias Amplification metric."""

import datasets
import numpy as np

import evaluate

_DESCRIPTION = """
Directional Bias Amplification is a metric that captures the amount of bias (i.e., a conditional probability) that is amplified.
This metric was introduced in the ICML 2021 paper "Directional Bias Amplification" (https://arxiv.org/abs/2102.12594).
"""

_KWARGS_DESCRIPTION = """
Args:
    predictions (`array` of `int`): Predicted task labels. Array of size n x |T|, where n is the number of samples and |T| is the number of task labels. All values are binary 0 or 1.
    references (`array` of `int`): Ground truth task labels. Array of size n x |T|, where n is the number of samples and |T| is the number of task labels. All values are binary 0 or 1.
    attributes (`array` of `int`): Ground truth attribute labels. Array of size n x |A|, where n is the number of samples and |A| is the number of attribute labels. All values are binary 0 or 1.

Returns:
    bias_amplification (`float`): Overall bias amplification value, averaged over all attribute-task pairs. Ranges from -1.0 to 1.0; the higher the value, the more bias is amplified.
    disagg_bias_amplification (`array` of `float`): Array of size |A| x |T|. Each value is the bias amplification of that particular task given that particular attribute.
"""


_CITATION = """
@inproceedings{wang2021biasamp,
    author = {Angelina Wang and Olga Russakovsky},
    title = {Directional Bias Amplification},
    booktitle = {International Conference on Machine Learning (ICML)},
    year = {2021}
}
"""


@evaluate.utils.file_utils.add_start_docstrings(_DESCRIPTION, _KWARGS_DESCRIPTION)
class DirectionalBiasAmplification(evaluate.EvaluationModule):
    def _info(self):
        return evaluate.EvaluationModuleInfo(
            description=_DESCRIPTION,
            citation=_CITATION,
            inputs_description=_KWARGS_DESCRIPTION,
            features=datasets.Features(
                {
                    "predictions": datasets.Sequence(datasets.Value("int32")),
                    "references": datasets.Sequence(datasets.Value("int32")),
                    "attributes": datasets.Sequence(datasets.Value("int32")),
                }
            ),
            reference_urls=["https://arxiv.org/abs/2102.12594"],
        )

    def _compute(self, predictions, references, attributes):
        # convert to numpy arrays so the shape checks and indexing below work
        task_preds, task_labels, attribute_labels = (
            np.array(predictions),
            np.array(references),
            np.array(attributes),
        )

        assert (
            len(task_labels.shape) == 2 and len(attribute_labels.shape) == 2
        ), 'Please read the shape of the expected inputs, which should be "num samples" by "num classification items"'
        assert (
            len(task_labels) == len(attribute_labels) == len(task_preds)
        ), "Please make sure the number of samples in the three input arrays is the same."

        num_t, num_a = task_labels.shape[1], attribute_labels.shape[1]

        # only include images that have at least one task or attribute label associated with them
        keep_indices = np.array(
            list(
                set(np.where(np.sum(task_labels, axis=1) > 0)[0]).union(
                    set(np.where(np.sum(attribute_labels, axis=1) > 0)[0])
                )
            )
        )
        task_labels_ind, attribute_labels_ind = task_labels[keep_indices], attribute_labels[keep_indices]

        # y_at calculation: the direction of the bias in the data, i.e. the sign of P(a, t) - P(a)P(t)
        p_at = np.zeros((num_a, num_t))
        p_a_p_t = np.zeros((num_a, num_t))
        num = len(task_labels)
        for a in range(num_a):
            for t in range(num_t):
                t_indices = np.where(task_labels_ind[:, t] == 1)[0]
                a_indices = np.where(attribute_labels_ind[:, a] == 1)[0]
                at_indices = set(t_indices) & set(a_indices)
                p_a_p_t[a][t] = (len(t_indices) / num) * (len(a_indices) / num)
                p_at[a][t] = len(at_indices) / num
        y_at = np.sign(p_at - p_a_p_t)

        # delta_at calculation: the change in P(t = 1 | a = 1) from the labels to the predictions
        t_cond_a = np.zeros((num_a, num_t))
        that_cond_a = np.zeros((num_a, num_t))
        for a in range(num_a):
            for t in range(num_t):
                t_cond_a[a][t] = np.mean(task_labels[:, t][np.where(attribute_labels[:, a] == 1)[0]])
                that_cond_a[a][t] = np.mean(task_preds[:, t][np.where(attribute_labels[:, a] == 1)[0]])
        delta_at = that_cond_a - t_cond_a

        values = y_at * delta_at
        val = np.nanmean(values)

        return {
            "bias_amplification": val,
            "disagg_bias_amplification": values,
        }
```
requirements.txt ADDED
File without changes