LPX55 committed on
Commit 9309ac4 · verified · 1 Parent(s): e888dd5

Update README.md

Files changed (1)
  1. README.md +13 -59
README.md CHANGED
@@ -14,18 +14,13 @@ tags:
  base_model:
  - timm/vit_small_patch16_384.augreg_in21k_ft_in1k
  library_name: transformers
- widget:
- - src: https://huggingface.co/datasets/mishig/sample_images/resolve/main/tiger.jpg
- example_title: Tiger
- - src: https://huggingface.co/datasets/mishig/sample_images/resolve/main/teapot.jpg
- example_title: Teapot
  ---
 
  # Trained on 2.7M samples across 4,803 generators (see Training Data)
 
- **Uploaded for community validation as part of OpenSight** - An upcoming open-source framework for adaptive deepfake detection, inspired by methodologies in <source_id data="2411.04125v1.pdf" />.
 
- **Huggingface Spaces coming soon.** Preview:
 
  ![image/png](https://cdn-uploads.huggingface.co/production/uploads/639daf827270667011153fbc/AUmW697OefKN83BClM1ae.png)
 
@@ -37,55 +32,14 @@ Vision Transformer (ViT) model trained on the largest dataset to-date for detect
  - **Model type:** Vision Transformer (ViT-Small)
  - **License:** MIT (compatible with CreativeML OpenRAIL-M referenced in [2411.04125v1.pdf])
  - **Finetuned from:** timm/vit_small_patch16_384.augreg_in21k_ft_in1k
 
- ### Model Sources
  - **Repository:** [JeongsooP/Community-Forensics](https://github.com/JeongsooP/Community-Forensics)
  - **Paper:** [arXiv:2411.04125](https://arxiv.org/pdf/2411.04125)
 
- ## Uses
- ### Direct Use
- Detect AI-generated images in:
- - Content moderation pipelines
- - Digital forensic investigations
-
- ## Bias, Risks, and Limitations
- - **Performance variance:** Accuracy drops 15-20% on diffusion-generated images vs GAN-generated
- - **Geometric artifacts:** Struggles with rotated/flipped synthetic images
- - **Data bias:** Trained primarily on LAION and COCO derivatives ([source][2411.04125v1.pdf])
- - **ADDED BY UPLOADER**: Model is already out of date, fails to detect images on newer generation models.
-
- ## Compatibility Notice
- This repository contains a **Hugging Face transformers-compatible convert** for the original detection methodology from:
-
- **Original Work**
- "Community Forensics: Using Thousands of Generators to Train Fake Image Detectors"
- [arXiv:2411.04125](https://arxiv.org/abs/2411.04125v1) {{Citation from <source_id>2411.04125v1.pdf}}
-
- **Our Contributions** (Coming soon)
- ⎯ Conversion of original weights to HF format
- ⎯ Added PyTorch inference pipeline
- ⎯ Standardized model card documentation
-
- **No Training Performed**
- ⎯ Initial model weights sourced from paper authors
- ⎯ No architectural changes or fine-tuning applied
-
- **Verify Original Performance**
- Please refer to Table 3 in <source_id data="2411.04125v1.pdf" /> for baseline metrics.
-
- ## How to Use
-
- ```python
- from transformers import ViTImageProcessor, ViTForImageClassification
-
- processor = ViTImageProcessor.from_pretrained("[your_model_id]")
- model = ViTForImageClassification.from_pretrained("[your_model_id]")
-
- inputs = processor(images=image, return_tensors="pt")
- outputs = model(**inputs)
- predicted_class = outputs.logits.argmax(-1)
- ```
-
  ## Training Details
  ### Training Data
  - 2.7mil images from 15+ generators, 4600+ models
@@ -99,8 +53,8 @@ predicted_class = outputs.logits.argmax(-1)
  - **Batch Size:** 32
 
  ## Evaluation
- ### Testing Data
- - 10k held-out images (5k real/5k synthetic) from unseen Diffusion/GAN models
 
  | Metric | Value |
  |---------------|-------|
@@ -111,6 +65,10 @@ predicted_class = outputs.logits.argmax(-1)
 
  ![image/png](https://cdn-uploads.huggingface.co/production/uploads/639daf827270667011153fbc/g-dLzxLBw1RAuiplvFCxh.png)
 
  ## Citation
  **BibTeX:**
  ```bibtex
@@ -123,8 +81,4 @@ predicted_class = outputs.logits.argmax(-1)
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2411.04125},
  }
- ```
-
- **Model Card Authors:**
-
- Jeongsoo Park, Andrew Owens
 
  base_model:
  - timm/vit_small_patch16_384.augreg_in21k_ft_in1k
  library_name: transformers
  ---
 
  # Trained on 2.7M samples across 4,803 generators (see Training Data)
 
+ **Uploaded for community validation as part of OpenSight** - An upcoming open-source framework for adaptive deepfake detection.
 
+ **Project OpenSight HF Spaces coming soon with an eval playground and eventually a leaderboard. Preview:**
 
  ![image/png](https://cdn-uploads.huggingface.co/production/uploads/639daf827270667011153fbc/AUmW697OefKN83BClM1ae.png)
 
 
  - **Model type:** Vision Transformer (ViT-Small)
  - **License:** MIT (compatible with CreativeML OpenRAIL-M referenced in [2411.04125v1.pdf])
  - **Finetuned from:** timm/vit_small_patch16_384.augreg_in21k_ft_in1k
+ - **Adapted for HF** inference compatibility by AI Without Borders.
 
+ **HF Space will be open sourced shortly showcasing various ways to run ultra-fast inference. Make sure to follow us for updates, as we will be releasing a slew of projects in the coming weeks.**
+
+ ### Links
  - **Repository:** [JeongsooP/Community-Forensics](https://github.com/JeongsooP/Community-Forensics)
  - **Paper:** [arXiv:2411.04125](https://arxiv.org/pdf/2411.04125)
 
  ## Training Details
  ### Training Data
  - 2.7mil images from 15+ generators, 4600+ models
 
  - **Batch Size:** 32
 
  ## Evaluation
+ ### Unverified Testing Results
+ - Only unverified because we currently lack resources to evaluate a dataset over 1.4T large.
 
  | Metric | Value |
  |---------------|-------|
 
 
  ![image/png](https://cdn-uploads.huggingface.co/production/uploads/639daf827270667011153fbc/g-dLzxLBw1RAuiplvFCxh.png)
 
+ ## Re-sampled and refined dataset
+
+ - **Coming soon™**
+
  ## Citation
  **BibTeX:**
  ```bibtex
 
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2411.04125},
  }
+ ```
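
The "How to Use" snippet that this commit removes from the README ended at `predicted_class = outputs.logits.argmax(-1)` and did not show how to read that result. The sketch below illustrates that last step only, using plain Python and no model download. It assumes a two-logit output; the `ID2LABEL` mapping and the example logits are hypothetical and are not taken from the model card or the checkpoint's config, so check the real label order before relying on it.

```python
import math

# Hypothetical label order (an assumption, not confirmed by the model card).
ID2LABEL = {0: "real", 1: "fake"}


def softmax(logits):
    """Convert raw logits to probabilities, shifting by the max for stability."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(logits):
    """Return (label, probability) for the highest-scoring class."""
    probs = softmax(logits)
    # Equivalent to argmax(-1) on a single example.
    idx = max(range(len(probs)), key=probs.__getitem__)
    return ID2LABEL[idx], probs[idx]


if __name__ == "__main__":
    # Made-up logits standing in for outputs.logits[0].tolist().
    label, confidence = classify([-1.2, 2.3])
    print(f"{label} ({confidence:.3f})")
```

Reporting the softmax probability alongside the argmax label makes it possible to apply a decision threshold later instead of treating every prediction as equally certain.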