Upload 30 files
- LICENSE +21 -0
- README.md +80 -14
- attack.sh +85 -0
- baselines.py +137 -0
- custom_datasets.py +96 -0
- data_builder.py +276 -0
- data_truncator.py +97 -0
- detect_gpt.py +295 -0
- detect_llm.py +128 -0
- detector.py +11 -0
- dna_gpt.py +211 -0
- fast_detect_gpt.py +162 -0
- gpt3to4.sh +116 -0
- gptzero.py +84 -0
- index.html +106 -0
- local_infer.py +94 -0
- main.sh +97 -0
- main_ext.sh +89 -0
- metrics.py +26 -0
- model.py +79 -0
- paraphrasing.py +106 -0
- report_results.py +490 -0
- requirements.txt +8 -3
- setup.sh +1 -0
- show_result.py +51 -0
- supervised.py +78 -0
- supervised.sh +56 -0
- temperature.sh +88 -0
- topk.sh +88 -0
- topp.sh +88 -0
LICENSE
ADDED
@@ -0,0 +1,21 @@
MIT License

Copyright (c) 2023 Bao Guangsheng

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
README.md
CHANGED
@@ -1,14 +1,80 @@
# Fast-DetectGPT
**This code is for the ICLR 2024 paper "Fast-DetectGPT: Efficient Zero-Shot Detection of Machine-Generated Text via Conditional Probability Curvature"**, where we borrow or extend some code from [DetectGPT](https://github.com/eric-mitchell/detect-gpt).

[Paper](https://arxiv.org/abs/2310.05130)
| [LocalDemo](#local-demo)
| [OnlineDemo](http://region-9.autodl.pro:21504/)
| [OpenReview](https://openreview.net/forum?id=Bpcgcr8E8Z)


## Brief Intro
<table class="tg" style="padding-left: 30px;">
  <tr>
    <th class="tg-0pky">Method</th>
    <th class="tg-0pky">5-Model Generations ↑</th>
    <th class="tg-0pky">ChatGPT/GPT-4 Generations ↑</th>
    <th class="tg-0pky">Speedup ↑</th>
  </tr>
  <tr>
    <td class="tg-0pky">DetectGPT</td>
    <td class="tg-0pky">0.9554</td>
    <td class="tg-0pky">0.7225</td>
    <td class="tg-0pky">1x</td>
  </tr>
  <tr>
    <td class="tg-0pky">Fast-DetectGPT</td>
    <td class="tg-0pky">0.9887 (relative↑ <b>74.7%</b>)</td>
    <td class="tg-0pky">0.9338 (relative↑ <b>76.1%</b>)</td>
    <td class="tg-0pky"><b>340x</b></td>
  </tr>
</table>
The table shows detection accuracy (measured in AUROC) and computational speedup for machine-generated text detection. The <b>white-box setting</b> (directly using the source model) is used for detecting generations produced by five source models (5-model), whereas the <b>black-box setting</b> (utilizing surrogate models) targets ChatGPT and GPT-4 generations. AUROC results are averaged across various datasets and source models. Speedup assessments were conducted on a Tesla A100 GPU.


## Environment
* Python 3.8
* PyTorch 1.10.0
* Setup the environment:
  ```bash setup.sh```

(Note: our experiments were run on a single Tesla A100 GPU with 80GB memory.)

## Local Demo
Please run the following command locally for an interactive demo:
```
python scripts/local_infer.py
```
where the default reference and sampling models are both gpt-neo-2.7B.

We can use gpt-j-6B as the reference model to obtain more accurate detections:
```
python scripts/local_infer.py --reference_model_name gpt-j-6B
```

An example (using gpt-j-6B as the reference model) looks like:
```
Please enter your text: (Press Enter twice to start processing)
Disguised as police, they broke through a fence on Monday evening and broke into the cargo of a Swiss-bound plane to take the valuable items. The audacious heist occurred at an airport in a small European country, leaving authorities baffled and airline officials in shock.

Fast-DetectGPT criterion is 1.9299, suggesting that the text has a probability of 87% to be machine-generated.
```

## Workspace
The following folders are created for our experiments:
* ./exp_main -> experiments for 5-model generations (main.sh).
* ./exp_gpt3to4 -> experiments for GPT-3, ChatGPT, and GPT-4 generations (gpt3to4.sh).

(Note: we share <b>generations from GPT-3, ChatGPT, and GPT-4</b> in exp_gpt3to4/data for convenient reproduction.)

### Citation
If you find this work useful, you can cite it with the following BibTeX entry:

    @inproceedings{bao2023fast,
      title={Fast-DetectGPT: Efficient Zero-Shot Detection of Machine-Generated Text via Conditional Probability Curvature},
      author={Bao, Guangsheng and Zhao, Yanbin and Teng, Zhiyang and Yang, Linyi and Zhang, Yue},
      booktitle={The Twelfth International Conference on Learning Representations},
      year={2023}
    }
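The README describes the conditional probability curvature criterion only in prose. As a rough illustration (not the repository's implementation, which lives in fast_detect_gpt.py and local_infer.py and may normalize differently), the analytic sampling-discrepancy score for the simplest case where the reference and scoring model coincide can be sketched as:

```python
import torch.nn.functional as F

def curvature_score(logits, labels):
    # logits: [1, seq_len, vocab] from the scoring model; labels: [1, seq_len] next tokens.
    # Compare the observed token log likelihood against the analytic mean/variance of the
    # log likelihood of tokens resampled from the model's own distribution.
    log_probs = F.log_softmax(logits, dim=-1)
    ll = log_probs.gather(-1, labels.unsqueeze(-1)).squeeze(-1)         # observed log likelihood
    probs = log_probs.exp()
    mean_ref = (probs * log_probs).sum(-1)                              # E[log p] under sampling
    var_ref = (probs * log_probs.square()).sum(-1) - mean_ref.square()  # Var[log p] under sampling
    # larger values mean the text scores above the model's own sampling average,
    # which is characteristic of machine-generated text
    return ((ll.sum(-1) - mean_ref.sum(-1)) / var_ref.sum(-1).sqrt()).item()
```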
attack.sh
ADDED
@@ -0,0 +1,85 @@
#!/usr/bin/env bash
# Copyright (c) Guangsheng Bao.
#
# This source code is licensed under the MIT license found in the
# LICENSE file in the root directory of this source tree.

# setup the environment
echo `date`, Setup the environment ...
set -e  # exit if error

# prepare folders
para=t5  # "t5" for paraphrasing attack, or "random" for decoherence attack
exp_path=exp_attack
data_path=$exp_path/data
res_path=$exp_path/results
mkdir -p $exp_path $data_path $res_path

src_path=exp_gpt3to4
src_data_path=$src_path/data

datasets="xsum writing pubmed"
source_models="gpt-3.5-turbo"

# preparing dataset
for D in $datasets; do
  for M in $source_models; do
    echo `date`, Preparing dataset ${D}_${M} by paraphrasing ${src_data_path}/${D}_${M} ...
    python scripts/paraphrasing.py --dataset $D --dataset_file $src_data_path/${D}_${M} \
                                   --paraphraser $para --output_file $data_path/${D}_${M}
  done
done

# evaluate Fast-DetectGPT in the black-box setting
settings="gpt-j-6B:gpt2-xl gpt-j-6B:gpt-neo-2.7B gpt-j-6B:gpt-j-6B"
for D in $datasets; do
  for M in $source_models; do
    for S in $settings; do
      IFS=':' read -r -a S <<< $S && M1=${S[0]} && M2=${S[1]}
      echo `date`, Evaluating Fast-DetectGPT on ${D}_${M}.${M1}_${M2} ...
      python scripts/fast_detect_gpt.py --reference_model_name $M1 --scoring_model_name $M2 --discrepancy_analytic \
                                        --dataset $D --dataset_file $data_path/${D}_${M} --output_file $res_path/${D}_${M}.${M1}_${M2}
    done
  done
done

# evaluate supervised detectors
supervised_models="roberta-base-openai-detector roberta-large-openai-detector"
for D in $datasets; do
  for M in $source_models; do
    for SM in $supervised_models; do
      echo `date`, Evaluating ${SM} on ${D}_${M} ...
      python scripts/supervised.py --model_name $SM --dataset $D \
                                   --dataset_file $data_path/${D}_${M} --output_file $res_path/${D}_${M}
    done
  done
done

# evaluate fast baselines
scoring_models="gpt-neo-2.7B"
for D in $datasets; do
  for M in $source_models; do
    for M2 in $scoring_models; do
      echo `date`, Evaluating baseline methods on ${D}_${M}.${M2} ...
      python scripts/baselines.py --scoring_model_name ${M2} --dataset $D \
                                  --dataset_file $data_path/${D}_${M} --output_file $res_path/${D}_${M}.${M2}
    done
  done
done

# evaluate DetectGPT and DetectLLM
scoring_models="gpt2-xl gpt-neo-2.7B gpt-j-6B"
for D in $datasets; do
  for M in $source_models; do
    M1=t5-11b  # perturbation model
    for M2 in $scoring_models; do
      echo `date`, Evaluating DetectGPT on ${D}_${M}.${M1}_${M2} ...
      python scripts/detect_gpt.py --mask_filling_model_name ${M1} --scoring_model_name ${M2} --n_perturbations 100 --dataset $D \
                                   --dataset_file $data_path/${D}_${M} --output_file $res_path/${D}_${M}.${M1}_${M2}
      # we leverage DetectGPT to generate the perturbations
      echo `date`, Evaluating DetectLLM methods on ${D}_${M}.${M1}_${M2} ...
      python scripts/detect_llm.py --scoring_model_name ${M2} --dataset $D \
                                   --dataset_file $data_path/${D}_${M}.${M1}.perturbation_100 --output_file $res_path/${D}_${M}.${M1}_${M2}
    done
  done
done
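As the comment on `para` indicates, the same script covers both attacks and only the variable changes. A minimal way to run the two variants from the repository root (assuming setup.sh has been run and exp_gpt3to4/data is in place) might be:

```bash
# paraphrasing attack (default, para=t5)
bash attack.sh

# decoherence attack: switch para to "random" first, e.g.
sed -i 's/^para=t5.*/para=random/' attack.sh
bash attack.sh
```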
baselines.py
ADDED
@@ -0,0 +1,137 @@
# Copyright (c) Guangsheng Bao.
#
# This source code is licensed under the MIT license found in the
# LICENSE file in the root directory of this source tree.

import numpy as np
import torch
import torch.nn.functional as F
import tqdm
import argparse
import json
from data_builder import load_data
from model import load_tokenizer, load_model
from metrics import get_roc_metrics, get_precision_recall_metrics

def get_likelihood(logits, labels):
    assert logits.shape[0] == 1
    assert labels.shape[0] == 1

    logits = logits.view(-1, logits.shape[-1])
    labels = labels.view(-1)
    log_probs = torch.nn.functional.log_softmax(logits, dim=-1)
    log_likelihood = log_probs.gather(dim=-1, index=labels.unsqueeze(-1)).squeeze(-1)
    return log_likelihood.mean().item()

def get_rank(logits, labels):
    assert logits.shape[0] == 1
    assert labels.shape[0] == 1

    # get rank of each label token in the model's likelihood ordering
    matches = (logits.argsort(-1, descending=True) == labels.unsqueeze(-1)).nonzero()
    assert matches.shape[1] == 3, f"Expected 3 dimensions in matches tensor, got {matches.shape}"

    ranks, timesteps = matches[:, -1], matches[:, -2]

    # make sure we got exactly one match for each timestep in the sequence
    assert (timesteps == torch.arange(len(timesteps)).to(timesteps.device)).all(), "Expected one match per timestep"

    ranks = ranks.float() + 1  # convert to 1-indexed rank
    return -ranks.mean().item()

def get_logrank(logits, labels):
    assert logits.shape[0] == 1
    assert labels.shape[0] == 1

    # get rank of each label token in the model's likelihood ordering
    matches = (logits.argsort(-1, descending=True) == labels.unsqueeze(-1)).nonzero()
    assert matches.shape[1] == 3, f"Expected 3 dimensions in matches tensor, got {matches.shape}"

    ranks, timesteps = matches[:, -1], matches[:, -2]

    # make sure we got exactly one match for each timestep in the sequence
    assert (timesteps == torch.arange(len(timesteps)).to(timesteps.device)).all(), "Expected one match per timestep"

    ranks = ranks.float() + 1  # convert to 1-indexed rank
    ranks = torch.log(ranks)
    return -ranks.mean().item()

def get_entropy(logits, labels):
    assert logits.shape[0] == 1
    assert labels.shape[0] == 1

    entropy = F.softmax(logits, dim=-1) * F.log_softmax(logits, dim=-1)
    entropy = -entropy.sum(-1)
    return entropy.mean().item()


def experiment(args):
    # load model
    scoring_tokenizer = load_tokenizer(args.scoring_model_name, args.dataset, args.cache_dir)
    scoring_model = load_model(args.scoring_model_name, args.device, args.cache_dir)
    scoring_model.eval()
    # load data
    data = load_data(args.dataset_file)
    n_samples = len(data["sampled"])
    # eval criterions
    criterion_fns = {'likelihood': get_likelihood,
                     'rank': get_rank,
                     'logrank': get_logrank,
                     'entropy': get_entropy}
    for name in criterion_fns:
        criterion_fn = criterion_fns[name]
        torch.manual_seed(args.seed)
        np.random.seed(args.seed)
        eval_results = []
        for idx in tqdm.tqdm(range(n_samples), desc=f"Computing {name} criterion"):
            original_text = data["original"][idx]
            sampled_text = data["sampled"][idx]
            # original text
            tokenized = scoring_tokenizer(original_text, return_tensors="pt", padding=True, return_token_type_ids=False).to(args.device)
            labels = tokenized.input_ids[:, 1:]
            with torch.no_grad():
                logits = scoring_model(**tokenized).logits[:, :-1]
                original_crit = criterion_fn(logits, labels)
            # sampled text
            tokenized = scoring_tokenizer(sampled_text, return_tensors="pt", padding=True, return_token_type_ids=False).to(args.device)
            labels = tokenized.input_ids[:, 1:]
            with torch.no_grad():
                logits = scoring_model(**tokenized).logits[:, :-1]
                sampled_crit = criterion_fn(logits, labels)
            # result
            eval_results.append({"original": original_text,
                                 "original_crit": original_crit,
                                 "sampled": sampled_text,
                                 "sampled_crit": sampled_crit})

        # compute prediction scores for real/sampled passages
        predictions = {'real': [x["original_crit"] for x in eval_results],
                       'samples': [x["sampled_crit"] for x in eval_results]}
        fpr, tpr, roc_auc = get_roc_metrics(predictions['real'], predictions['samples'])
        p, r, pr_auc = get_precision_recall_metrics(predictions['real'], predictions['samples'])
        print(f"Criterion {name}_threshold ROC AUC: {roc_auc:.4f}, PR AUC: {pr_auc:.4f}")
        # log results
        results_file = f'{args.output_file}.{name}.json'
        results = {'name': f'{name}_threshold',
                   'info': {'n_samples': n_samples},
                   'predictions': predictions,
                   'raw_results': eval_results,
                   'metrics': {'roc_auc': roc_auc, 'fpr': fpr, 'tpr': tpr},
                   'pr_metrics': {'pr_auc': pr_auc, 'precision': p, 'recall': r},
                   'loss': 1 - pr_auc}
        with open(results_file, 'w') as fout:
            json.dump(results, fout)
            print(f'Results written into {results_file}')

if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument('--output_file', type=str, default="./exp_test/results/xsum_gpt2")
    parser.add_argument('--dataset', type=str, default="xsum")
    parser.add_argument('--dataset_file', type=str, default="./exp_test/data/xsum_gpt2")
    parser.add_argument('--scoring_model_name', type=str, default="gpt2")
    parser.add_argument('--seed', type=int, default=0)
    parser.add_argument('--device', type=str, default="cuda")
    parser.add_argument('--cache_dir', type=str, default="../cache")
    args = parser.parse_args()

    experiment(args)
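For reference, attack.sh drives this script with the flags defined in the argument parser above; one unrolled invocation looks like this (paths follow the attack.sh layout), and it writes one `<output_file>.<criterion>.json` per criterion (likelihood, rank, logrank, entropy):

```bash
python scripts/baselines.py --scoring_model_name gpt-neo-2.7B --dataset xsum \
    --dataset_file ./exp_attack/data/xsum_gpt-3.5-turbo \
    --output_file ./exp_attack/results/xsum_gpt-3.5-turbo.gpt-neo-2.7B
```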
custom_datasets.py
ADDED
@@ -0,0 +1,96 @@
import os.path
import random
import datasets

SEPARATOR = '<<<SEP>>>'


DATASETS = ['writing', 'english', 'german', 'pubmed']

def load_dataset(path, name=None, split=None, cache_dir=None):
    # use local model if it exists
    local_path = os.path.join(cache_dir, f'local.{path}_{name}_{split}')
    if os.path.exists(local_path):
        return datasets.load_from_disk(local_path)
    return datasets.load_dataset(path, name, split=split, cache_dir=cache_dir)

def load_pubmed(cache_dir):
    data = load_dataset('pubmed_qa', 'pqa_labeled', split='train', cache_dir=cache_dir)

    # combine question and long_answer
    data = [f'Question: {q} Answer:{SEPARATOR}{a}' for q, a in zip(data['question'], data['long_answer'])]

    return data


def process_prompt(prompt):
    return prompt.replace('[ WP ]', '').replace('[ OT ]', '')


def process_spaces(story):
    return story.replace(
        ' ,', ',').replace(
        ' .', '.').replace(
        ' ?', '?').replace(
        ' !', '!').replace(
        ' ;', ';').replace(
        ' \'', '\'').replace(
        ' ’ ', '\'').replace(
        ' :', ':').replace(
        '<newline>', '\n').replace(
        '`` ', '"').replace(
        ' \'\'', '"').replace(
        '\'\'', '"').replace(
        '.. ', '... ').replace(
        ' )', ')').replace(
        '( ', '(').replace(
        ' n\'t', 'n\'t').replace(
        ' i ', ' I ').replace(
        ' i\'', ' I\'').replace(
        '\\\'', '\'').replace(
        '\n ', '\n').strip()


def load_writing(cache_dir=None):
    writing_path = 'data/writingPrompts'

    with open(f'{writing_path}/valid.wp_source', 'r') as f:
        prompts = f.readlines()
    with open(f'{writing_path}/valid.wp_target', 'r') as f:
        stories = f.readlines()

    prompts = [process_prompt(prompt) for prompt in prompts]
    joined = [process_spaces(prompt + " " + story) for prompt, story in zip(prompts, stories)]
    filtered = [story for story in joined if 'nsfw' not in story and 'NSFW' not in story]

    random.seed(0)
    random.shuffle(filtered)

    return filtered


def load_language(language, cache_dir):
    # load either the english or german portion of the wmt16 dataset
    assert language in ['en', 'de']
    d = load_dataset('wmt16', 'de-en', split='train', cache_dir=cache_dir)
    docs = d['translation']
    desired_language_docs = [d[language] for d in docs]
    lens = [len(d.split()) for d in desired_language_docs]
    sub = [d for d, l in zip(desired_language_docs, lens) if l > 100 and l < 150]
    return sub


def load_german(cache_dir):
    return load_language('de', cache_dir)


def load_english(cache_dir):
    return load_language('en', cache_dir)


def load(name, cache_dir, **kwargs):
    if name in DATASETS:
        load_fn = globals()[f'load_{name}']
        return load_fn(cache_dir=cache_dir, **kwargs)
    else:
        raise ValueError(f'Unknown dataset {name}')
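A quick illustration of the `load` dispatcher at the bottom of the file; the cache directory here just mirrors the `--cache_dir` default used by the other scripts:

```python
import custom_datasets

# 'pubmed' routes to load_pubmed, which joins each question and answer around SEPARATOR
texts = custom_datasets.load('pubmed', cache_dir='../cache')
question, answer = texts[0].split(custom_datasets.SEPARATOR)
print(question)  # the "Question: ... Answer:" prefix used as the generation prompt
```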
data_builder.py
ADDED
@@ -0,0 +1,276 @@
# Copyright (c) Guangsheng Bao.
#
# This source code is licensed under the MIT license found in the
# LICENSE file in the root directory of this source tree.
import time

import numpy as np
import datasets
import torch
import random
import argparse
import os
import json
import custom_datasets
from model import load_tokenizer, load_model


def save_data(output_file, args, data):
    # write args to file
    args_file = f"{output_file}.args.json"
    with open(args_file, "w") as fout:
        json.dump(args.__dict__, fout, indent=4)
        print(f"Args written into {args_file}")

    # write the data to a json file in the save folder
    data_file = f"{output_file}.raw_data.json"
    with open(data_file, "w") as fout:
        json.dump(data, fout, indent=4)
        print(f"Raw data written into {data_file}")


def load_data(input_file):
    data_file = f"{input_file}.raw_data.json"
    with open(data_file, "r") as fin:
        data = json.load(fin)
        print(f"Raw data loaded from {data_file}")
    return data


class DataBuilder:
    def __init__(self, args):
        self.args = args
        self.base_tokenizer = load_tokenizer(args.base_model_name, args.dataset, args.cache_dir)
        self.base_model = None if args.openai_model else load_model(args.base_model_name, args.device, args.cache_dir)

    def _openai_sample(self, prefix):
        def _drop_last_word(text):
            return ' '.join(text.split(' ')[:-1])

        import openai
        assert self.args.openai_key is not None, "Must provide OpenAI API key as --openai_key"
        openai.api_key = self.args.openai_key
        if self.args.openai_base is not None:
            openai.api_base = self.args.openai_base

        if self.args.dataset != 'pubmed':  # keep Answer: prefix for pubmed
            prefix = _drop_last_word(prefix)

        # sample from the openai model
        kwargs = {"max_tokens": 200}
        if self.args.do_top_p:
            kwargs['top_p'] = self.args.top_p
        elif self.args.do_top_k:
            kwargs['top_k'] = self.args.top_k
        elif self.args.do_temperature:
            kwargs['temperature'] = self.args.temperature

        if self.args.openai_model == 'davinci':
            kwargs["engine"] = self.args.openai_model
            response = openai.Completion.create(prompt=f"{prefix}", **kwargs)
            return prefix + response['choices'][0]['text']

        elif self.args.openai_model in ['gpt-3.5-turbo', 'gpt-4']:
            roles = {'xsum': 'You are a News writer.',
                     'writing': 'You are a Fiction writer.',
                     'pubmed': 'You are a Technical writer.'}
            prompts = {'xsum': 'Please write an article with about 150 words starting exactly with:',
                       'writing': 'Please write an article with about 150 words starting exactly with:',
                       'pubmed': 'Please answer the question in about 50 words.'}
            messages = [
                {'role': 'system', 'content': roles[self.args.dataset]},
                {'role': 'user', 'content': f'{prompts[self.args.dataset]} {prefix}'},
            ]
            kwargs["model"] = self.args.openai_model
            kwargs["messages"] = messages
            response = openai.ChatCompletion.create(**kwargs)
            response = response['choices'][0]['message']['content']
            # ChatGPT may repeat the prefix
            if response.startswith(prefix[:20]):
                return response
            return prefix + ' ' + response

        else:
            raise NotImplementedError

    # sample from base_model using ****only**** the first 30 tokens in each example as context
    def _sample_from_model(self, texts, min_words=55, prompt_tokens=30):
        # encode each text as a list of token ids
        if self.args.dataset == 'pubmed':
            texts = [t[:t.index(custom_datasets.SEPARATOR)] for t in texts]
            all_encoded = self.base_tokenizer(texts, return_tensors="pt", padding=True, return_token_type_ids=False).to(self.args.device)
        else:
            all_encoded = self.base_tokenizer(texts, return_tensors="pt", padding=True, return_token_type_ids=False).to(self.args.device)
            all_encoded = {key: value[:, :prompt_tokens] for key, value in all_encoded.items()}

        if self.args.openai_model:
            # decode the prefixes back into text
            prefixes = self.base_tokenizer.batch_decode(all_encoded['input_ids'], skip_special_tokens=True)

            decoded = []
            for idx, prefix in enumerate(prefixes):
                while idx >= len(decoded):
                    try:
                        decoded.append(self._openai_sample(prefix))
                    except Exception as ex:
                        print(ex)
                        print('Wait 10 minutes before retry ...')
                        time.sleep(600)

        else:
            self.base_model.eval()
            decoded = ['' for _ in range(len(texts))]

            # sample from the model until we get a sample with at least min_words words for each example
            # this is an inefficient way to do this (since we regenerate for all inputs if just one is too short), but it works
            tries = 0
            m = 0
            while m < min_words:
                if tries != 0:
                    print()
                    print(f"min words: {m}, needed {min_words}, regenerating (try {tries})")
                    prefixes = self.base_tokenizer.batch_decode(all_encoded['input_ids'], skip_special_tokens=True)
                    for prefix, x in zip(prefixes, decoded):
                        if len(x.split()) == m:
                            print(prefix, '=>', x)

                sampling_kwargs = {}
                if self.args.do_top_p:
                    sampling_kwargs['top_p'] = self.args.top_p
                elif self.args.do_top_k:
                    sampling_kwargs['top_k'] = self.args.top_k
                elif self.args.do_temperature:
                    sampling_kwargs['temperature'] = self.args.temperature
                min_length = 50 if self.args.dataset in ['pubmed'] else 150
                outputs = self.base_model.generate(**all_encoded, min_length=min_length, max_length=200, do_sample=True,
                                                   **sampling_kwargs, pad_token_id=self.base_tokenizer.eos_token_id,
                                                   eos_token_id=self.base_tokenizer.eos_token_id)
                decoded = self.base_tokenizer.batch_decode(outputs, skip_special_tokens=True)
                m = min(len(x.split()) for x in decoded)
                tries += 1

        return decoded

    def generate_samples(self, raw_data, batch_size):
        # trim to shorter length
        def _trim_to_shorter_length(texta, textb):
            # truncate to shorter of o and s
            shorter_length = min(len(texta.split(' ')), len(textb.split(' ')))
            texta = ' '.join(texta.split(' ')[:shorter_length])
            textb = ' '.join(textb.split(' ')[:shorter_length])
            return texta, textb

        def _truncate_to_substring(text, substring, idx_occurrence):
            # truncate everything after the idx_occurrence occurrence of substring
            assert idx_occurrence > 0, 'idx_occurrence must be > 0'
            idx = -1
            for _ in range(idx_occurrence):
                idx = text.find(substring, idx + 1)
                if idx == -1:
                    return text
            return text[:idx]

        data = {
            "original": [],
            "sampled": [],
        }

        for batch in range(len(raw_data) // batch_size):
            print('Generating samples for batch', batch, 'of', len(raw_data) // batch_size)
            original_text = raw_data[batch * batch_size:(batch + 1) * batch_size]
            sampled_text = self._sample_from_model(original_text, min_words=30 if self.args.dataset in ['pubmed'] else 55)

            for o, s in zip(original_text, sampled_text):
                if self.args.dataset == 'pubmed':
                    s = _truncate_to_substring(s, 'Question:', 2)
                    o = o.replace(custom_datasets.SEPARATOR, ' ')

                o, s = _trim_to_shorter_length(o, s)

                # add to the data
                data["original"].append(o)
                data["sampled"].append(s)

        return data

def generate_data(args, dataset, key):
    # strip newlines from each example; replace one or more newlines with a single space
    def _strip_newlines(text):
        return ' '.join(text.split())

    # load data
    if dataset in custom_datasets.DATASETS:
        data = custom_datasets.load(dataset, args.cache_dir)
    else:
        data = custom_datasets.load_dataset(dataset, split='train', cache_dir=args.cache_dir)[key]

    # get unique examples, strip whitespace, and remove newlines
    # then take just the long examples, shuffle, take the first 5,000 to tokenize to save time
    # then take just the examples that are <= 512 tokens (for the base model)
    # then generate n_samples samples

    # remove duplicates from the data
    data = list(dict.fromkeys(data))  # deterministic, as opposed to set()

    # strip whitespace around each example
    data = [x.strip() for x in data]

    # remove newlines from each example
    data = [_strip_newlines(x) for x in data]

    # try to keep only examples with > 250 words
    if dataset in ['writing', 'squad', 'xsum']:
        long_data = [x for x in data if len(x.split()) > 250]
        if len(long_data) > 0:
            data = long_data

    random.shuffle(data)
    data = data[:5_000]

    # keep only examples with <= 512 tokens according to base_tokenizer
    # this step has the extra effect of removing examples with low-quality/garbage content
    data_builder = DataBuilder(args)
    tokenized_data = data_builder.base_tokenizer(data)
    data = [x for x, y in zip(data, tokenized_data["input_ids"]) if len(y) <= 512]

    # print stats about remaining data
    print(f"Total number of samples: {len(data)}")
    print(f"Average number of words: {np.mean([len(x.split()) for x in data])}")

    return data_builder.generate_samples(data[:args.n_samples], batch_size=args.batch_size)

if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument('--output_file', type=str, default="./exp_gpt3/data/xsum_gpt2")
    parser.add_argument('--dataset', type=str, default="xsum")
    parser.add_argument('--n_samples', type=int, default=200)
    parser.add_argument('--openai_base', type=str, default=None)
    parser.add_argument('--openai_key', type=str, default=None)
    parser.add_argument('--openai_model', type=str, default=None)  # davinci, gpt-3.5-turbo, gpt-4
    parser.add_argument('--base_model_name', type=str, default="gpt2")
    parser.add_argument('--batch_size', type=int, default=50)
    parser.add_argument('--do_top_k', action='store_true')
    parser.add_argument('--top_k', type=int, default=40)
    parser.add_argument('--do_top_p', action='store_true')
    parser.add_argument('--top_p', type=float, default=0.96)
    parser.add_argument('--do_temperature', action='store_true')
    parser.add_argument('--temperature', type=float, default=0.8)
    parser.add_argument('--seed', type=int, default=0)
    parser.add_argument('--device', type=str, default="cuda")
    parser.add_argument('--cache_dir', type=str, default="../cache")
    args = parser.parse_args()

    os.environ["XDG_CACHE_HOME"] = args.cache_dir
    if not os.path.exists(args.cache_dir):
        os.makedirs(args.cache_dir)
    print(f"Using cache dir {args.cache_dir}")

    random.seed(args.seed)
    torch.manual_seed(args.seed)
    np.random.seed(args.seed)

    print(f'Loading dataset {args.dataset}...')
    dataset_keys = {'xsum': 'document', 'squad': 'context', 'writing': 'document'}
    data = generate_data(args, args.dataset, dataset_keys[args.dataset] if args.dataset in dataset_keys else None)

    save_data(args.output_file, args, data)
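As a usage sketch (the sample count and paths are illustrative, not a prescribed configuration), building ChatGPT continuations for XSum with nucleus sampling would look roughly like:

```bash
python scripts/data_builder.py --dataset xsum --n_samples 200 \
    --base_model_name gpt2 --openai_model gpt-3.5-turbo --openai_key $OPENAI_API_KEY \
    --do_top_p --top_p 0.96 \
    --output_file ./exp_gpt3to4/data/xsum_gpt-3.5-turbo
```

Note that a base tokenizer (gpt2 here) is still required even when an OpenAI model does the generation, because the first 30 tokens of each original passage are decoded back into the text prefix used as the prompt.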
data_truncator.py
ADDED
@@ -0,0 +1,97 @@
# Copyright (c) Guangsheng Bao.
#
# This source code is licensed under the MIT license found in the
# LICENSE file in the root directory of this source tree.
import time

import numpy as np
import datasets
import torch
import random
import argparse
import os
import json
import custom_datasets
from model import load_tokenizer, load_model

def stats_str(data):
    if type(data) == dict:
        mean_orig = np.mean([len(v.split()) for v in data['original']])
        mean_samp = np.mean([len(v.split()) for v in data['sampled']])
        return f'{mean_orig:.0f} words (original), {mean_samp:.0f} words (sampled).'
    else:
        mean_orig = np.mean([len(v['original'].split()) for v in data])
        mean_samp = np.mean([len(v['sampled'].split()) for v in data])
        mean_perturb_orig = np.mean([np.mean([len(p.split()) for p in v['perturbed_original']]) for v in data])
        mean_perturb_samp = np.mean([np.mean([len(p.split()) for p in v['perturbed_sampled']]) for v in data])
        return f'{mean_orig:.0f} words (original), {mean_samp:.0f} words (sampled), {mean_perturb_orig:.0f} words (perturb original), {mean_perturb_samp:.0f} words (perturb sampled).'

def save_data(output_file, args, data):
    # write args to file
    args_file = f"{output_file}.args.json"
    with open(args_file, "w") as fout:
        json.dump(args, fout, indent=4)
        print(f"Args written into {args_file}")

    # write the data to a json file in the save folder
    data_file = f"{output_file}.raw_data.json"
    with open(data_file, "w") as fout:
        json.dump(data, fout, indent=4)
        print(f"Raw data written into {data_file}: {stats_str(data)}")


def load_data(input_file):
    # load args from file
    args_file = f"{input_file}.args.json"
    with open(args_file, "r") as fin:
        args = json.load(fin)
        print(f"Args loaded from {args_file}")

    # load the data from file
    data_file = f"{input_file}.raw_data.json"
    with open(data_file, "r") as fin:
        data = json.load(fin)
        print(f"Raw data loaded from {data_file}: {stats_str(data)}")

    return args, data

def convert_data(input_file, output_file, max_words):
    def _reduce(text):
        lines = []
        nwords = 0
        for line in text.split('\n'):
            if nwords >= max_words:
                break
            words = line.split()
            words = words[:max_words - nwords]
            lines.append(' '.join(words))
            nwords += len(words)
        return '\n'.join(lines)

    args, data = load_data(input_file)
    if type(data) == dict:
        data['original'] = [_reduce(x) for x in data['original']]
        data['sampled'] = [_reduce(x) for x in data['sampled']]
    else:
        for item in data:
            item['original'] = _reduce(item['original'])
            item['sampled'] = _reduce(item['sampled'])
            item['perturbed_original'] = [_reduce(x) for x in item['perturbed_original']]
            item['perturbed_sampled'] = [_reduce(x) for x in item['perturbed_sampled']]

    save_data(output_file, args, data)

if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument('--input_path', type=str, default="./exp_gpt3to4/data/")
    parser.add_argument('--output_path', type=str, default="./exp_maxlen150/data/")
    parser.add_argument('--max_words', type=int, default=150)
    args = parser.parse_args()

    import glob
    import os.path as path

    for file_name in glob.glob(f'{args.input_path}/*.raw_data.json'):
        print(file_name)
        file_name = path.basename(file_name).replace('.raw_data.json', '')
        convert_data(path.join(args.input_path, file_name), path.join(args.output_path, file_name), args.max_words)
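The __main__ block already walks every *.raw_data.json file in the input folder; an explicit invocation matching the argument defaults above, truncating each passage to at most 150 words, is:

```bash
python scripts/data_truncator.py --input_path ./exp_gpt3to4/data/ \
    --output_path ./exp_maxlen150/data/ --max_words 150
```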
detect_gpt.py
ADDED
|
@@ -0,0 +1,295 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
# Copyright (c) Guangsheng Bao.
|
| 2 |
+
#
|
| 3 |
+
# This source code is licensed under the MIT license found in the
|
| 4 |
+
# LICENSE file in the root directory of this source tree.
|
| 5 |
+
import os.path
|
| 6 |
+
|
| 7 |
+
import numpy as np
|
| 8 |
+
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer
|
| 9 |
+
import re
|
| 10 |
+
import torch
|
| 11 |
+
import tqdm
|
| 12 |
+
import argparse
|
| 13 |
+
import json
|
| 14 |
+
from data_builder import load_data, save_data
|
| 15 |
+
from metrics import get_roc_metrics, get_precision_recall_metrics
|
| 16 |
+
from model import load_tokenizer, load_model, get_model_fullname, from_pretrained
|
| 17 |
+
|
| 18 |
+
# define regex to match all <extra_id_*> tokens, where * is an integer
|
| 19 |
+
pattern = re.compile(r"<extra_id_\d+>")
|
| 20 |
+
|
| 21 |
+
def load_mask_model(model_name, device, cache_dir):
|
| 22 |
+
model_name = get_model_fullname(model_name)
|
| 23 |
+
# mask filling t5 model
|
| 24 |
+
print(f'Loading mask filling model {model_name}...')
|
| 25 |
+
mask_model = from_pretrained(AutoModelForSeq2SeqLM, model_name, {}, cache_dir)
|
| 26 |
+
mask_model = mask_model.to(device)
|
| 27 |
+
return mask_model
|
| 28 |
+
|
| 29 |
+
def load_mask_tokenizer(model_name, max_length, cache_dir):
|
| 30 |
+
model_name = get_model_fullname(model_name)
|
| 31 |
+
tokenizer = from_pretrained(AutoTokenizer, model_name, {'model_max_length': max_length}, cache_dir)
|
| 32 |
+
return tokenizer
|
| 33 |
+
|
| 34 |
+
def tokenize_and_mask(text, span_length, pct, ceil_pct=False):
|
| 35 |
+
buffer_size = 1
|
| 36 |
+
tokens = text.split(' ')
|
| 37 |
+
mask_string = '<<<mask>>>'
|
| 38 |
+
|
| 39 |
+
n_spans = pct * len(tokens) / (span_length + buffer_size * 2)
|
| 40 |
+
if ceil_pct:
|
| 41 |
+
n_spans = np.ceil(n_spans)
|
| 42 |
+
n_spans = int(n_spans)
|
| 43 |
+
|
| 44 |
+
n_masks = 0
|
| 45 |
+
while n_masks < n_spans:
|
| 46 |
+
start = np.random.randint(0, len(tokens) - span_length)
|
| 47 |
+
end = start + span_length
|
| 48 |
+
search_start = max(0, start - buffer_size)
|
| 49 |
+
search_end = min(len(tokens), end + buffer_size)
|
| 50 |
+
if mask_string not in tokens[search_start:search_end]:
|
| 51 |
+
tokens[start:end] = [mask_string]
|
| 52 |
+
n_masks += 1
|
| 53 |
+
|
| 54 |
+
# replace each occurrence of mask_string with <extra_id_NUM>, where NUM increments
|
| 55 |
+
num_filled = 0
|
| 56 |
+
for idx, token in enumerate(tokens):
|
| 57 |
+
if token == mask_string:
|
| 58 |
+
tokens[idx] = f'<extra_id_{num_filled}>'
|
| 59 |
+
num_filled += 1
|
| 60 |
+
assert num_filled == n_masks, f"num_filled {num_filled} != n_masks {n_masks}"
|
| 61 |
+
text = ' '.join(tokens)
|
| 62 |
+
return text
|
| 63 |
+
|
| 64 |
+
def count_masks(texts):
|
| 65 |
+
return [len([x for x in text.split() if x.startswith("<extra_id_")]) for text in texts]
|
| 66 |
+
|
| 67 |
+
# replace each masked span with a sample from T5 mask_model
|
| 68 |
+
def replace_masks(args, mask_model, mask_tokenizer, texts):
|
| 69 |
+
n_expected = count_masks(texts)
|
| 70 |
+
stop_id = mask_tokenizer.encode(f"<extra_id_{max(n_expected)}>")[0]
|
| 71 |
+
tokens = mask_tokenizer(texts, return_tensors="pt", padding=True).to(args.device)
|
| 72 |
+
outputs = mask_model.generate(**tokens, max_length=150, do_sample=True, top_p=args.mask_top_p,
|
| 73 |
+
num_return_sequences=1, eos_token_id=stop_id)
|
| 74 |
+
return mask_tokenizer.batch_decode(outputs, skip_special_tokens=False)
|
| 75 |
+
|
| 76 |
+
def extract_fills(texts):
|
| 77 |
+
# remove <pad> from beginning of each text
|
| 78 |
+
texts = [x.replace("<pad>", "").replace("</s>", "").strip() for x in texts]
|
| 79 |
+
|
| 80 |
+
# return the text in between each matched mask token
|
| 81 |
+
extracted_fills = [pattern.split(x)[1:-1] for x in texts]
|
| 82 |
+
|
| 83 |
+
# remove whitespace around each fill
|
| 84 |
+
extracted_fills = [[y.strip() for y in x] for x in extracted_fills]
|
| 85 |
+
|
| 86 |
+
return extracted_fills
|
| 87 |
+
|
| 88 |
+
def apply_extracted_fills(masked_texts, extracted_fills):
|
| 89 |
+
# split masked text into tokens, only splitting on spaces (not newlines)
|
| 90 |
+
tokens = [x.split(' ') for x in masked_texts]
|
| 91 |
+
|
| 92 |
+
n_expected = count_masks(masked_texts)
|
| 93 |
+
|
| 94 |
+
# replace each mask token with the corresponding fill
|
| 95 |
+
for idx, (text, fills, n) in enumerate(zip(tokens, extracted_fills, n_expected)):
|
| 96 |
+
if len(fills) < n:
|
| 97 |
+
tokens[idx] = []
|
| 98 |
+
else:
|
| 99 |
+
for fill_idx in range(n):
|
| 100 |
+
text[text.index(f"<extra_id_{fill_idx}>")] = fills[fill_idx]
|
| 101 |
+
|
| 102 |
+
# join tokens back into text
|
| 103 |
+
texts = [" ".join(x) for x in tokens]
|
| 104 |
+
return texts
|
| 105 |
+
|
| 106 |
+
def perturb_texts_(args, mask_model, mask_tokenizer, texts, ceil_pct=False):
|
| 107 |
+
span_length = args.span_length
|
| 108 |
+
pct = args.pct_words_masked
|
| 109 |
+
masked_texts = [tokenize_and_mask(x, span_length, pct, ceil_pct) for x in texts]
|
| 110 |
+
raw_fills = replace_masks(args, mask_model, mask_tokenizer, masked_texts)
|
| 111 |
+
extracted_fills = extract_fills(raw_fills)
|
| 112 |
+
perturbed_texts = apply_extracted_fills(masked_texts, extracted_fills)
|
| 113 |
+
|
| 114 |
+
# Handle the fact that sometimes the model doesn't generate the right number of fills and we have to try again
|
| 115 |
+
attempts = 1
|
| 116 |
+
while '' in perturbed_texts:
|
| 117 |
+
idxs = [idx for idx, x in enumerate(perturbed_texts) if x == '']
|
| 118 |
+
print(f'WARNING: {len(idxs)} texts have no fills. Trying again [attempt {attempts}].')
|
| 119 |
+
masked_texts = [tokenize_and_mask(x, span_length, pct, ceil_pct) for idx, x in enumerate(texts) if idx in idxs]
|
| 120 |
+
raw_fills = replace_masks(args, mask_model, mask_tokenizer, masked_texts)
|
| 121 |
+
extracted_fills = extract_fills(raw_fills)
|
| 122 |
+
new_perturbed_texts = apply_extracted_fills(masked_texts, extracted_fills)
|
| 123 |
+
for idx, x in zip(idxs, new_perturbed_texts):
|
| 124 |
+
perturbed_texts[idx] = x
|
| 125 |
+
attempts += 1
|
| 126 |
+
return perturbed_texts
|
| 127 |
+
|
| 128 |
+
def perturb_texts(args, mask_model, mask_tokenizer, texts, ceil_pct=False):
|
| 129 |
+
chunk_size = 10
|
| 130 |
+
outputs = []
|
| 131 |
+
for i in range(0, len(texts), chunk_size):
|
| 132 |
+
outputs.extend(perturb_texts_(args, mask_model, mask_tokenizer, texts[i:i + chunk_size], ceil_pct=ceil_pct))
|
| 133 |
+
return outputs
|
| 134 |
+
|
| 135 |
+
# Get the log likelihood of each text under the base_model
|
| 136 |
+
def get_ll(args, scoring_model, scoring_tokenizer, text):
|
| 137 |
+
with torch.no_grad():
|
| 138 |
+
tokenized = scoring_tokenizer(text, return_tensors="pt", return_token_type_ids=False).to(args.device)
|
| 139 |
+
labels = tokenized.input_ids
|
| 140 |
+
return -scoring_model(**tokenized, labels=labels).loss.item()
|
| 141 |
+
|
| 142 |
+
def get_lls(args, scoring_model, scoring_tokenizer, texts):
|
| 143 |
+
return [get_ll(args, scoring_model, scoring_tokenizer, text) for text in texts]
|
| 144 |
+
|
| 145 |
+
|
| 146 |
+
def generate_perturbs(args):
|
| 147 |
+
n_perturbations = args.n_perturbations
|
| 148 |
+
name = f'perturbation_{n_perturbations}'
|
| 149 |
+
# load model
|
| 150 |
+
mask_model = load_mask_model(args.mask_filling_model_name, args.device, args.cache_dir)
|
| 151 |
+
mask_model.eval()
|
| 152 |
+
try:
|
| 153 |
+
n_positions = mask_model.config.n_positions
|
| 154 |
+
except AttributeError:
|
| 155 |
+
n_positions = 512
|
| 156 |
+
mask_tokenizer = load_mask_tokenizer(args.mask_filling_model_name, n_positions, args.cache_dir)
|
| 157 |
+
|
| 158 |
+
# load data
|
| 159 |
+
data = load_data(args.dataset_file)
|
| 160 |
+
n_samples = len(data["sampled"])
|
| 161 |
+
|
| 162 |
+
torch.manual_seed(args.seed)
|
| 163 |
+
np.random.seed(args.seed)
|
| 164 |
+
|
| 165 |
+
# generate perturb samples
|
| 166 |
+
perturbs = []
|
| 167 |
+
for idx in tqdm.tqdm(range(n_samples), desc=f"Perturb text"):
|
| 168 |
+
original_text = data["original"][idx]
|
| 169 |
+
sampled_text = data["sampled"][idx]
|
| 170 |
+
# perturb
|
| 171 |
+
p_sampled_text = perturb_texts(args, mask_model, mask_tokenizer, [sampled_text for _ in range(n_perturbations)])
|
| 172 |
+
p_original_text = perturb_texts(args, mask_model, mask_tokenizer, [original_text for _ in range(n_perturbations)])
|
| 173 |
+
assert len(p_sampled_text) == n_perturbations, f"Expected {n_perturbations} perturbed samples, got {len(p_sampled_text)}"
|
| 174 |
+
            assert len(p_original_text) == n_perturbations, f"Expected {n_perturbations} perturbed samples, got {len(p_original_text)}"
        # result
        perturbs.append({
            "original": original_text,
            "sampled": sampled_text,
            "perturbed_sampled": p_sampled_text,
            "perturbed_original": p_original_text
        })

    save_data(f'{args.dataset_file}.{args.mask_filling_model_name}.{name}', args, perturbs)


def experiment(args):
    n_perturbations = args.n_perturbations
    name = f'perturbation_{n_perturbations}'
    perturb_file = f'{args.dataset_file}.{args.mask_filling_model_name}.{name}.raw_data.json'
    if os.path.exists(perturb_file):
        print(f'Use existing perturbation file: {perturb_file}')
    else:
        generate_perturbs(args)
    # load model
    scoring_tokenizer = load_tokenizer(args.scoring_model_name, args.dataset, args.cache_dir)
    scoring_model = load_model(args.scoring_model_name, 'cpu', args.cache_dir)
    scoring_model.eval()
    scoring_model.to(args.device)
    # load data
    data = load_data(f'{args.dataset_file}.{args.mask_filling_model_name}.{name}')
    n_samples = len(data)

    torch.manual_seed(args.seed)
    np.random.seed(args.seed)

    # Evaluate
    results = data
    for idx in tqdm.tqdm(range(n_samples), desc=f"Computing {name} criterion"):
        original_text = results[idx]["original"]
        sampled_text = results[idx]["sampled"]
        perturbed_original = results[idx]["perturbed_original"]
        perturbed_sampled = results[idx]["perturbed_sampled"]
        # original text
        original_ll = get_ll(args, scoring_model, scoring_tokenizer, original_text)
        p_original_ll = get_lls(args, scoring_model, scoring_tokenizer, perturbed_original)
        # sampled text
        sampled_ll = get_ll(args, scoring_model, scoring_tokenizer, sampled_text)
        p_sampled_ll = get_lls(args, scoring_model, scoring_tokenizer, perturbed_sampled)
        # result
        results[idx]["original_ll"] = original_ll
        results[idx]["sampled_ll"] = sampled_ll
        results[idx]["all_perturbed_sampled_ll"] = p_sampled_ll
        results[idx]["all_perturbed_original_ll"] = p_original_ll
        results[idx]["perturbed_sampled_ll"] = np.mean(p_sampled_ll)
        results[idx]["perturbed_original_ll"] = np.mean(p_original_ll)
        results[idx]["perturbed_sampled_ll_std"] = np.std(p_sampled_ll) if len(p_sampled_ll) > 1 else 1
        results[idx]["perturbed_original_ll_std"] = np.std(p_original_ll) if len(p_original_ll) > 1 else 1

    # compute diffs with perturbed
    predictions = {'real': [], 'samples': []}
    for res in results:
        if res['perturbed_original_ll_std'] == 0:
            res['perturbed_original_ll_std'] = 1
            print("WARNING: std of perturbed original is 0, setting to 1")
            print(f"Number of unique perturbed original texts: {len(set(res['perturbed_original']))}")
            print(f"Original text: {res['original']}")
        if res['perturbed_sampled_ll_std'] == 0:
            res['perturbed_sampled_ll_std'] = 1
            print("WARNING: std of perturbed sampled is 0, setting to 1")
            print(f"Number of unique perturbed sampled texts: {len(set(res['perturbed_sampled']))}")
            print(f"Sampled text: {res['sampled']}")
        predictions['real'].append((res['original_ll'] - res['perturbed_original_ll']) / res['perturbed_original_ll_std'])
        predictions['samples'].append((res['sampled_ll'] - res['perturbed_sampled_ll']) / res['perturbed_sampled_ll_std'])

    print(f"Real mean/std: {np.mean(predictions['real']):.2f}/{np.std(predictions['real']):.2f}, Samples mean/std: {np.mean(predictions['samples']):.2f}/{np.std(predictions['samples']):.2f}")
    fpr, tpr, roc_auc = get_roc_metrics(predictions['real'], predictions['samples'])
    p, r, pr_auc = get_precision_recall_metrics(predictions['real'], predictions['samples'])
    print(f"Criterion {name}_threshold ROC AUC: {roc_auc:.4f}, PR AUC: {pr_auc:.4f}")

    # results
    results_file = f'{args.output_file}.{name}.json'
    results = {
        'name': name,
        'info': {
            'pct_words_masked': args.pct_words_masked,
            'span_length': args.span_length,
            'n_perturbations': args.n_perturbations,
            'n_samples': n_samples,
        },
        'predictions': predictions,
        'raw_results': results,
        'metrics': {
            'roc_auc': roc_auc,
            'fpr': fpr,
            'tpr': tpr,
        },
        'pr_metrics': {
            'pr_auc': pr_auc,
            'precision': p,
            'recall': r,
        },
        'loss': 1 - pr_auc,
    }
    with open(results_file, 'w') as fout:
        json.dump(results, fout)
    print(f'Results written into {results_file}')


if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument('--output_file', type=str, default="./exp_test/results/xsum_gpt2")
    parser.add_argument('--dataset', type=str, default="xsum")
    parser.add_argument('--dataset_file', type=str, default="./exp_test/data/xsum_gpt2")
    parser.add_argument('--pct_words_masked', type=float, default=0.3)  # pct masked is actually pct_words_masked * (span_length / (span_length + 2 * buffer_size))
    parser.add_argument('--mask_top_p', type=float, default=1.0)
    parser.add_argument('--span_length', type=int, default=2)
    parser.add_argument('--n_perturbations', type=int, default=10)
    parser.add_argument('--scoring_model_name', type=str, default="gpt2")
    parser.add_argument('--mask_filling_model_name', type=str, default="t5-small")
    parser.add_argument('--seed', type=int, default=0)
    parser.add_argument('--device', type=str, default="cuda")
    parser.add_argument('--cache_dir', type=str, default="../cache")
    args = parser.parse_args()

    experiment(args)
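For reference, the perturbation discrepancy that `experiment` writes to the results JSON can be reproduced offline from the logged fields. The sketch below is illustrative only: the field names match the `results` dict above, but the log-likelihood numbers are invented.

```python
# Minimal sketch: recompute the DetectGPT z-score from the logged fields.
# The log-likelihood values below are made up for illustration.
res = {
    "original_ll": -3.2,
    "perturbed_original_ll": -3.3,
    "perturbed_original_ll_std": 0.15,
    "sampled_ll": -2.1,
    "perturbed_sampled_ll": -2.8,
    "perturbed_sampled_ll_std": 0.20,
}

# z-score: how far the passage's likelihood sits above the mean likelihood
# of its perturbed variants, in units of their standard deviation
z_real = (res["original_ll"] - res["perturbed_original_ll"]) / res["perturbed_original_ll_std"]
z_fake = (res["sampled_ll"] - res["perturbed_sampled_ll"]) / res["perturbed_sampled_ll_std"]
print(f"human-written z = {z_real:.2f}, machine-generated z = {z_fake:.2f}")
```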
detect_llm.py
ADDED
@@ -0,0 +1,128 @@
# Copyright (c) Guangsheng Bao.
#
# This source code is licensed under the MIT license found in the
# LICENSE file in the root directory of this source tree.

import numpy as np
import torch
import torch.nn.functional as F
import tqdm
import argparse
import json
from model import load_tokenizer, load_model
from metrics import get_roc_metrics, get_precision_recall_metrics
from data_builder import load_data

def get_likelihood(logits, labels):
    assert logits.shape[0] == 1
    assert labels.shape[0] == 1

    logits = logits.view(-1, logits.shape[-1])
    labels = labels.view(-1)
    log_probs = torch.nn.functional.log_softmax(logits, dim=-1)
    log_likelihood = log_probs.gather(dim=-1, index=labels.unsqueeze(-1)).squeeze(-1)
    return log_likelihood.mean().item()

def get_logrank(logits, labels):
    assert logits.shape[0] == 1
    assert labels.shape[0] == 1

    # get rank of each label token in the model's likelihood ordering
    matches = (logits.argsort(-1, descending=True) == labels.unsqueeze(-1)).nonzero()
    assert matches.shape[1] == 3, f"Expected 3 dimensions in matches tensor, got {matches.shape}"

    ranks, timesteps = matches[:, -1], matches[:, -2]

    # make sure we got exactly one match for each timestep in the sequence
    assert (timesteps == torch.arange(len(timesteps)).to(timesteps.device)).all(), "Expected one match per timestep"

    ranks = ranks.float() + 1  # convert to 1-indexed rank
    ranks = torch.log(ranks)
    return ranks.mean().item()

# Log-Likelihood Log-Rank Ratio
def get_lrr(args, scoring_model, scoring_tokenizer, text, perturbs):
    with torch.no_grad():
        tokenized = scoring_tokenizer(text, return_tensors="pt", return_token_type_ids=False).to(args.device)
        labels = tokenized.input_ids[:, 1:]
        logits = scoring_model(**tokenized).logits[:, :-1]
        likelihood = get_likelihood(logits, labels)
        logrank = get_logrank(logits, labels)
        return - likelihood / logrank

# Normalized Log-Rank Perturbation
def get_npr(args, scoring_model, scoring_tokenizer, text, perturbs):
    with torch.no_grad():
        tokenized = scoring_tokenizer(text, return_tensors="pt", return_token_type_ids=False).to(args.device)
        labels = tokenized.input_ids[:, 1:]
        logits = scoring_model(**tokenized).logits[:, :-1]
        logrank = get_logrank(logits, labels)
        # perturbations
        logranks = []
        for perturb in perturbs:
            tokenized = scoring_tokenizer(perturb, return_tensors="pt", return_token_type_ids=False).to(args.device)
            labels = tokenized.input_ids[:, 1:]
            logits = scoring_model(**tokenized).logits[:, :-1]
            logranks.append(get_logrank(logits, labels))
        # npr
        return np.mean(logranks) / logrank

def experiment(args):
    # load model
    scoring_tokenizer = load_tokenizer(args.scoring_model_name, args.dataset, args.cache_dir)
    scoring_model = load_model(args.scoring_model_name, args.device, args.cache_dir)
    scoring_model.eval()
    # load data
    data = load_data(args.dataset_file)
    n_samples = len(data)
    # eval criterions
    criterion_fns = {'lrr': get_lrr, 'npr': get_npr}
    for name in criterion_fns:
        criterion_fn = criterion_fns[name]
        torch.manual_seed(args.seed)
        np.random.seed(args.seed)
        eval_results = []
        for idx in tqdm.tqdm(range(n_samples), desc=f"Computing {name} criterion"):
            original_text = data[idx]["original"]
            sampled_text = data[idx]["sampled"]
            perturbed_original = data[idx]["perturbed_original"]
            perturbed_sampled = data[idx]["perturbed_sampled"]
            original_crit = criterion_fn(args, scoring_model, scoring_tokenizer, original_text, perturbed_original)
            sampled_crit = criterion_fn(args, scoring_model, scoring_tokenizer, sampled_text, perturbed_sampled)
            # result
            eval_results.append({"original": original_text,
                                 "original_crit": original_crit,
                                 "sampled": sampled_text,
                                 "sampled_crit": sampled_crit})

        # compute prediction scores for real/sampled passages
        predictions = {'real': [x["original_crit"] for x in eval_results],
                       'samples': [x["sampled_crit"] for x in eval_results]}
        fpr, tpr, roc_auc = get_roc_metrics(predictions['real'], predictions['samples'])
        p, r, pr_auc = get_precision_recall_metrics(predictions['real'], predictions['samples'])
        print(f"Criterion {name}_threshold ROC AUC: {roc_auc:.4f}, PR AUC: {pr_auc:.4f}")
        # log results
        results_file = f'{args.output_file}.{name}.json'
        results = { 'name': f'{name}_threshold',
                    'info': {'n_samples': n_samples},
                    'predictions': predictions,
                    'raw_results': eval_results,
                    'metrics': {'roc_auc': roc_auc, 'fpr': fpr, 'tpr': tpr},
                    'pr_metrics': {'pr_auc': pr_auc, 'precision': p, 'recall': r},
                    'loss': 1 - pr_auc}
        with open(results_file, 'w') as fout:
            json.dump(results, fout)
        print(f'Results written into {results_file}')

if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument('--output_file', type=str, default="./exp_test/results/xsum_gpt2")
    parser.add_argument('--dataset', type=str, default="xsum")
    parser.add_argument('--dataset_file', type=str, default="./exp_test/results/xsum_gpt2.perturbation_10")
    parser.add_argument('--scoring_model_name', type=str, default="gpt2")
    parser.add_argument('--seed', type=int, default=0)
    parser.add_argument('--device', type=str, default="cuda")
    parser.add_argument('--cache_dir', type=str, default="../cache")
    args = parser.parse_args()

    experiment(args)
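As a quick sanity check of the two DetectLLM criteria above, the following sketch plugs invented log-likelihood and log-rank values into the same formulas used by `get_lrr` and `get_npr`; in the script these quantities come from a scoring model.

```python
# Illustrative numbers only; the formulas mirror get_lrr and get_npr above.
import numpy as np

likelihood = -2.5                       # mean token log-likelihood of the passage
logrank = 1.2                           # mean log-rank of the passage tokens
perturbed_logranks = [1.8, 1.7, 1.9]    # log-ranks of perturbed variants

lrr = -likelihood / logrank                     # Log-Likelihood Log-Rank Ratio
npr = np.mean(perturbed_logranks) / logrank     # Normalized perturbed log-Rank
print(f"LRR = {lrr:.2f}, NPR = {npr:.2f}")
```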
detector.py
ADDED
@@ -0,0 +1,11 @@
class Detector:
    def __init__(self):
        # Configure the model or load any required files here
        print("Fast-DetectGPT initialized!")

    def detect(self, text):
        """
        Analyzes the given text and returns the result.
        """
        # A sample result is returned instead of the real analysis
        return [(text, 0.85)]  # 0.85 is the probability of being AI-generated
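The `Detector` class above is a stub that returns a constant score. A minimal sketch of how it could wrap the actual criterion from `fast_detect_gpt.py` is given below; the class name, the default model `gpt-neo-2.7B`, the single-model (self-referencing) setup, and the flat import paths are assumptions, not part of this commit.

```python
# Sketch only: wire a detector to the Fast-DetectGPT analytic criterion.
# Assumes model.py and fast_detect_gpt.py are importable from the working directory.
import torch
from model import load_tokenizer, load_model
from fast_detect_gpt import get_sampling_discrepancy_analytic

class FastDetectGPTDetector:
    def __init__(self, model_name="gpt-neo-2.7B", device="cuda", cache_dir="../cache"):
        self.device = device
        self.tokenizer = load_tokenizer(model_name, "xsum", cache_dir)
        self.model = load_model(model_name, device, cache_dir)
        self.model.eval()

    def detect(self, text):
        tokenized = self.tokenizer(text, truncation=True, return_tensors="pt",
                                   return_token_type_ids=False).to(self.device)
        labels = tokenized.input_ids[:, 1:]
        with torch.no_grad():
            logits = self.model(**tokenized).logits[:, :-1]
        # the same model serves as reference and scorer (white-box shortcut)
        crit = get_sampling_discrepancy_analytic(logits, logits, labels)
        return [(text, crit)]
```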
dna_gpt.py
ADDED
@@ -0,0 +1,211 @@
# Copyright (c) Guangsheng Bao.
#
# This source code is licensed under the MIT license found in the
# LICENSE file in the root directory of this source tree.
import os.path

import numpy as np
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer
import re
import torch
import tqdm
import argparse
import json
from data_builder import load_data, save_data
from metrics import get_roc_metrics, get_precision_recall_metrics
from model import load_tokenizer, load_model, get_model_fullname, from_pretrained
from data_builder import load_data
from model import load_tokenizer, load_model
from metrics import get_roc_metrics, get_precision_recall_metrics
import custom_datasets

class PrefixSampler:
    def __init__(self, args):
        self.args = args
        self.base_tokenizer = load_tokenizer(args.base_model_name, args.dataset, args.cache_dir)
        self.base_model = load_model(args.base_model_name, args.device, args.cache_dir)

    def _sample_from_model(self, texts, min_words=55, truncate_ratio=0.5):
        # encode each text as a list of token ids
        if self.args.dataset == 'pubmed':
            pubmed_sep = ' Answer:'
            texts = [t[:t.index(pubmed_sep) + len(pubmed_sep)] for t in texts]
            all_encoded = self.base_tokenizer(texts, return_tensors="pt", padding=True).to(self.args.device)
        else:
            texts = [t.split(' ') for t in texts]
            texts = [' '.join(t[: int(len(t) * truncate_ratio)]) for t in texts]
            all_encoded = self.base_tokenizer(texts, return_tensors="pt", padding=True).to(self.args.device)

        self.base_model.eval()
        decoded = ['' for _ in range(len(texts))]

        # sample from the model until we get a sample with at least min_words words for each example
        # this is an inefficient way to do this (since we regenerate for all inputs if just one is too short), but it works
        tries = 0
        m = 0
        while m < min_words:
            if tries != 0:
                print()
                print(f"min words: {m}, needed {min_words}, regenerating (try {tries})")

            sampling_kwargs = {'temperature': self.args.temperature}
            if self.args.do_top_p:
                sampling_kwargs['top_p'] = self.args.top_p
            elif self.args.do_top_k:
                sampling_kwargs['top_k'] = self.args.top_k
            min_length = 50 if self.args.dataset in ['pubmed'] else 150
            outputs = self.base_model.generate(**all_encoded, min_length=min_length, max_length=200, do_sample=True,
                                               **sampling_kwargs, pad_token_id=self.base_tokenizer.eos_token_id,
                                               eos_token_id=self.base_tokenizer.eos_token_id)
            decoded = self.base_tokenizer.batch_decode(outputs, skip_special_tokens=True)
            m = min(len(x.split()) for x in decoded)
            tries += 1

        return decoded

    def generate_samples(self, raw_data, batch_size):
        # trim to shorter length
        def _trim_to_shorter_length(texta, textb):
            # truncate to shorter of o and s
            shorter_length = min(len(texta.split(' ')), len(textb.split(' ')))
            texta = ' '.join(texta.split(' ')[:shorter_length])
            textb = ' '.join(textb.split(' ')[:shorter_length])
            return texta, textb

        def _truncate_to_substring(text, substring, idx_occurrence):
            # truncate everything after the idx_occurrence occurrence of substring
            assert idx_occurrence > 0, 'idx_occurrence must be > 0'
            idx = -1
            for _ in range(idx_occurrence):
                idx = text.find(substring, idx + 1)
                if idx == -1:
                    return text
            return text[:idx]

        data = {
            "original": [],
            "sampled": [],
        }

        assert len(raw_data) % batch_size == 0
        for batch in range(len(raw_data) // batch_size):
            print('Generating samples for batch', batch, 'of', len(raw_data) // batch_size)
            original_text = raw_data[batch * batch_size:(batch + 1) * batch_size]
            sampled_text = self._sample_from_model(original_text, min_words=30 if self.args.dataset in ['pubmed'] else 55, truncate_ratio=self.args.truncate_ratio)

            for o, s in zip(original_text, sampled_text):
                if self.args.dataset == 'pubmed':
                    s = _truncate_to_substring(s, 'Question:', 2)
                    o = o.replace(custom_datasets.SEPARATOR, ' ')

                o, s = _trim_to_shorter_length(o, s)

                # add to the data
                data["original"].append(o)
                data["sampled"].append(s)

        return data

def get_likelihood(logits, labels, pad_index):
    labels = labels.unsqueeze(-1) if labels.ndim == logits.ndim - 1 else labels
    lprobs = torch.log_softmax(logits, dim=-1)
    log_likelihood = lprobs.gather(dim=-1, index=labels)
    mask = labels != pad_index
    log_likelihood = (log_likelihood * mask).sum(dim=1) / mask.sum(dim=1)
    return log_likelihood.squeeze(-1)

def get_log_prob(sampler, text):
    tokenized = sampler.base_tokenizer(text, return_tensors="pt", padding=True).to(sampler.args.device)
    labels = tokenized.input_ids[:, 1:]
    with torch.no_grad():
        logits_score = sampler.base_model(**tokenized).logits[:, :-1]
        return get_likelihood(logits_score, labels, sampler.base_tokenizer.pad_token_id)

def get_log_probs(sampler, texts):
    batch_size = sampler.args.batch_size
    batch_lprobs = []
    for batch in range(len(texts) // batch_size):
        tokenized = sampler.base_tokenizer(texts[batch * batch_size:(batch + 1) * batch_size], return_tensors="pt", padding=True).to(sampler.args.device)
        labels = tokenized.input_ids[:, 1:]
        with torch.no_grad():
            logits_score = sampler.base_model(**tokenized).logits[:, :-1]
            lprobs = get_likelihood(logits_score, labels, sampler.base_tokenizer.pad_token_id)
            batch_lprobs.append(lprobs)
    return torch.cat(batch_lprobs, dim=0)

def get_regen_samples(sampler, text):
    data = [text] * sampler.args.regen_number
    data = sampler.generate_samples(data, batch_size=sampler.args.batch_size)
    return data['sampled']

def get_dna_gpt(sampler, text):
    lprob = get_log_prob(sampler, text)
    regens = get_regen_samples(sampler, text)
    lprob_regens = get_log_probs(sampler, regens)
    wscore = lprob[0] - lprob_regens.mean()
    return wscore.item()

def experiment(args):
    sampler = PrefixSampler(args)
    # load data
    data = load_data(args.dataset_file)
    n_samples = len(data["sampled"])
    # evaluate criterion
    name = "dna_gpt"
    criterion_fn = get_dna_gpt

    torch.manual_seed(args.seed)
    np.random.seed(args.seed)
    results = []
    for idx in tqdm.tqdm(range(n_samples), desc=f"Computing {name} criterion"):
        original_text = data["original"][idx]
        sampled_text = data["sampled"][idx]
        # original text
        original_crit = criterion_fn(sampler, original_text)
        # sampled text
        sampled_crit = criterion_fn(sampler, sampled_text)
        # result
        results.append({"original": original_text,
                        "original_crit": original_crit,
                        "sampled": sampled_text,
                        "sampled_crit": sampled_crit})

    # compute prediction scores for real/sampled passages
    predictions = {'real': [x["original_crit"] for x in results],
                   'samples': [x["sampled_crit"] for x in results]}
    fpr, tpr, roc_auc = get_roc_metrics(predictions['real'], predictions['samples'])
    p, r, pr_auc = get_precision_recall_metrics(predictions['real'], predictions['samples'])
    print(f"Criterion {name}_threshold ROC AUC: {roc_auc:.4f}, PR AUC: {pr_auc:.4f}")
    # results
    results_file = f'{args.output_file}.{name}.json'
    results = { 'name': f'{name}_threshold',
                'info': {'n_samples': n_samples},
                'predictions': predictions,
                'raw_results': results,
                'metrics': {'roc_auc': roc_auc, 'fpr': fpr, 'tpr': tpr},
                'pr_metrics': {'pr_auc': pr_auc, 'precision': p, 'recall': r},
                'loss': 1 - pr_auc}
    with open(results_file, 'w') as fout:
        json.dump(results, fout)
    print(f'Results written into {results_file}')

if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument('--output_file', type=str, default="./exp_test/results/pubmed_davinci")
    parser.add_argument('--dataset', type=str, default="pubmed")
    parser.add_argument('--dataset_file', type=str, default="./exp_test/data/pubmed_davinci")
    parser.add_argument('--truncate_ratio', type=float, default=0.5)
    parser.add_argument('--regen_number', type=int, default=10)
    parser.add_argument('--base_model_name', type=str, default="gpt2")
    parser.add_argument('--batch_size', type=int, default=10)
    parser.add_argument('--do_top_k', action='store_true')
    parser.add_argument('--top_k', type=int, default=40)
    parser.add_argument('--do_top_p', action='store_true')
    parser.add_argument('--top_p', type=float, default=0.96)
    parser.add_argument('--temperature', type=float, default=1.0)
    parser.add_argument('--seed', type=int, default=0)
    parser.add_argument('--device', type=str, default="cuda")
    parser.add_argument('--cache_dir', type=str, default="../cache")
    args = parser.parse_args()

    experiment(args)
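The WScore computed by `get_dna_gpt` is simply the gap between the log-probability of the candidate passage and the average log-probability of its regenerated continuations. A toy illustration with invented numbers:

```python
# Illustrative only: the DNA-GPT WScore on made-up log-probabilities.
import numpy as np

lprob_text = -2.0                                    # log-prob of the candidate passage
lprob_regens = np.array([-2.6, -2.4, -2.7, -2.5])    # log-probs of regenerations

wscore = lprob_text - lprob_regens.mean()
print(f"WScore = {wscore:.2f}  (larger values suggest machine-generated text)")
```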
fast_detect_gpt.py
ADDED
@@ -0,0 +1,162 @@
# Copyright (c) Guangsheng Bao.
#
# This source code is licensed under the MIT license found in the
# LICENSE file in the root directory of this source tree.
import random

import numpy as np
import torch
import torch.nn.functional as F
import tqdm
import argparse
import json
from data_builder import load_data
from model import load_tokenizer, load_model
from metrics import get_roc_metrics, get_precision_recall_metrics

def get_samples(logits, labels):
    assert logits.shape[0] == 1
    assert labels.shape[0] == 1
    nsamples = 10000
    lprobs = torch.log_softmax(logits, dim=-1)
    distrib = torch.distributions.categorical.Categorical(logits=lprobs)
    samples = distrib.sample([nsamples]).permute([1, 2, 0])
    return samples

def get_likelihood(logits, labels):
    assert logits.shape[0] == 1
    assert labels.shape[0] == 1
    labels = labels.unsqueeze(-1) if labels.ndim == logits.ndim - 1 else labels
    lprobs = torch.log_softmax(logits, dim=-1)
    log_likelihood = lprobs.gather(dim=-1, index=labels)
    return log_likelihood.mean(dim=1)

def get_sampling_discrepancy(logits_ref, logits_score, labels):
    assert logits_ref.shape[0] == 1
    assert logits_score.shape[0] == 1
    assert labels.shape[0] == 1
    if logits_ref.size(-1) != logits_score.size(-1):
        # print(f"WARNING: vocabulary size mismatch {logits_ref.size(-1)} vs {logits_score.size(-1)}.")
        vocab_size = min(logits_ref.size(-1), logits_score.size(-1))
        logits_ref = logits_ref[:, :, :vocab_size]
        logits_score = logits_score[:, :, :vocab_size]

    samples = get_samples(logits_ref, labels)
    log_likelihood_x = get_likelihood(logits_score, labels)
    log_likelihood_x_tilde = get_likelihood(logits_score, samples)
    miu_tilde = log_likelihood_x_tilde.mean(dim=-1)
    sigma_tilde = log_likelihood_x_tilde.std(dim=-1)
    discrepancy = (log_likelihood_x.squeeze(-1) - miu_tilde) / sigma_tilde
    return discrepancy.item()

def get_sampling_discrepancy_analytic(logits_ref, logits_score, labels):
    assert logits_ref.shape[0] == 1
    assert logits_score.shape[0] == 1
    assert labels.shape[0] == 1
    if logits_ref.size(-1) != logits_score.size(-1):
        # print(f"WARNING: vocabulary size mismatch {logits_ref.size(-1)} vs {logits_score.size(-1)}.")
        vocab_size = min(logits_ref.size(-1), logits_score.size(-1))
        logits_ref = logits_ref[:, :, :vocab_size]
        logits_score = logits_score[:, :, :vocab_size]

    labels = labels.unsqueeze(-1) if labels.ndim == logits_score.ndim - 1 else labels
    lprobs_score = torch.log_softmax(logits_score, dim=-1)
    probs_ref = torch.softmax(logits_ref, dim=-1)
    log_likelihood = lprobs_score.gather(dim=-1, index=labels).squeeze(-1)
    mean_ref = (probs_ref * lprobs_score).sum(dim=-1)
    var_ref = (probs_ref * torch.square(lprobs_score)).sum(dim=-1) - torch.square(mean_ref)
    discrepancy = (log_likelihood.sum(dim=-1) - mean_ref.sum(dim=-1)) / var_ref.sum(dim=-1).sqrt()
    discrepancy = discrepancy.mean()
    return discrepancy.item()

def experiment(args):
    # load model
    scoring_tokenizer = load_tokenizer(args.scoring_model_name, args.dataset, args.cache_dir)
    scoring_model = load_model(args.scoring_model_name, args.device, args.cache_dir)
    scoring_model.eval()
    if args.reference_model_name != args.scoring_model_name:
        reference_tokenizer = load_tokenizer(args.reference_model_name, args.dataset, args.cache_dir)
        reference_model = load_model(args.reference_model_name, args.device, args.cache_dir)
        reference_model.eval()
    # load data
    data = load_data(args.dataset_file)
    n_samples = len(data["sampled"])
    # evaluate criterion
    if args.discrepancy_analytic:
        name = "sampling_discrepancy_analytic"
        criterion_fn = get_sampling_discrepancy_analytic
    else:
        name = "sampling_discrepancy"
        criterion_fn = get_sampling_discrepancy

    random.seed(args.seed)
    torch.manual_seed(args.seed)
    np.random.seed(args.seed)
    results = []
    for idx in tqdm.tqdm(range(n_samples), desc=f"Computing {name} criterion"):
        original_text = data["original"][idx]
        sampled_text = data["sampled"][idx]
        # original text
        tokenized = scoring_tokenizer(original_text, return_tensors="pt", padding=True, return_token_type_ids=False).to(args.device)
        labels = tokenized.input_ids[:, 1:]
        with torch.no_grad():
            logits_score = scoring_model(**tokenized).logits[:, :-1]
            if args.reference_model_name == args.scoring_model_name:
                logits_ref = logits_score
            else:
                tokenized = reference_tokenizer(original_text, return_tensors="pt", padding=True, return_token_type_ids=False).to(args.device)
                assert torch.all(tokenized.input_ids[:, 1:] == labels), "Tokenizer mismatch."
                logits_ref = reference_model(**tokenized).logits[:, :-1]
            original_crit = criterion_fn(logits_ref, logits_score, labels)
        # sampled text
        tokenized = scoring_tokenizer(sampled_text, return_tensors="pt", padding=True, return_token_type_ids=False).to(args.device)
        labels = tokenized.input_ids[:, 1:]
        with torch.no_grad():
            logits_score = scoring_model(**tokenized).logits[:, :-1]
            if args.reference_model_name == args.scoring_model_name:
                logits_ref = logits_score
            else:
                tokenized = reference_tokenizer(sampled_text, return_tensors="pt", padding=True, return_token_type_ids=False).to(args.device)
                assert torch.all(tokenized.input_ids[:, 1:] == labels), "Tokenizer mismatch."
                logits_ref = reference_model(**tokenized).logits[:, :-1]
            sampled_crit = criterion_fn(logits_ref, logits_score, labels)
        # result
        results.append({"original": original_text,
                        "original_crit": original_crit,
                        "sampled": sampled_text,
                        "sampled_crit": sampled_crit})

    # compute prediction scores for real/sampled passages
    predictions = {'real': [x["original_crit"] for x in results],
                   'samples': [x["sampled_crit"] for x in results]}
    print(f"Real mean/std: {np.mean(predictions['real']):.2f}/{np.std(predictions['real']):.2f}, Samples mean/std: {np.mean(predictions['samples']):.2f}/{np.std(predictions['samples']):.2f}")
    fpr, tpr, roc_auc = get_roc_metrics(predictions['real'], predictions['samples'])
    p, r, pr_auc = get_precision_recall_metrics(predictions['real'], predictions['samples'])
    print(f"Criterion {name}_threshold ROC AUC: {roc_auc:.4f}, PR AUC: {pr_auc:.4f}")
    # results
    results_file = f'{args.output_file}.{name}.json'
    results = { 'name': f'{name}_threshold',
                'info': {'n_samples': n_samples},
                'predictions': predictions,
                'raw_results': results,
                'metrics': {'roc_auc': roc_auc, 'fpr': fpr, 'tpr': tpr},
                'pr_metrics': {'pr_auc': pr_auc, 'precision': p, 'recall': r},
                'loss': 1 - pr_auc}
    with open(results_file, 'w') as fout:
        json.dump(results, fout)
    print(f'Results written into {results_file}')

if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument('--output_file', type=str, default="./exp_test/results/xsum_gpt2")
    parser.add_argument('--dataset', type=str, default="xsum")
    parser.add_argument('--dataset_file', type=str, default="./exp_test/data/xsum_gpt2")
    parser.add_argument('--reference_model_name', type=str, default="gpt2")
    parser.add_argument('--scoring_model_name', type=str, default="gpt2")
    parser.add_argument('--discrepancy_analytic', action='store_true')
    parser.add_argument('--seed', type=int, default=0)
    parser.add_argument('--device', type=str, default="cuda")
    parser.add_argument('--cache_dir', type=str, default="../cache")
    args = parser.parse_args()

    experiment(args)
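A self-contained smoke test for the analytic criterion above: the random logits stand in for real model outputs, so the value itself is meaningless, but it shows the tensor shapes the function expects (batch 1, sequence length T, vocabulary V). It assumes `fast_detect_gpt.py` is importable from the working directory.

```python
# Smoke test with random tensors; shapes follow the usage in experiment().
import torch
from fast_detect_gpt import get_sampling_discrepancy_analytic

T, V = 20, 50257                       # sequence length and a GPT-2-like vocab size
torch.manual_seed(0)
logits_ref = torch.randn(1, T, V)      # reference model logits
logits_score = torch.randn(1, T, V)    # scoring model logits
labels = torch.randint(0, V, (1, T))   # observed token ids

crit = get_sampling_discrepancy_analytic(logits_ref, logits_score, labels)
print(f"conditional probability curvature: {crit:.4f}")
```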
gpt3to4.sh
ADDED
@@ -0,0 +1,116 @@
#!/usr/bin/env bash
# Copyright (c) Guangsheng Bao.
#
# This source code is licensed under the MIT license found in the
# LICENSE file in the root directory of this source tree.

# setup the environment
echo `date`, Setup the environment ...
set -e  # exit if error

# prepare folders
exp_path=exp_gpt3to4
data_path=$exp_path/data
res_path=$exp_path/results
mkdir -p $exp_path $data_path $res_path

datasets="xsum writing pubmed"
source_models="davinci gpt-3.5-turbo gpt-4"

# preparing dataset
openai_base="https://api.openai.com/v1"
openai_key="xxxxxxxx"  # replace with your own key for generating your own test set

# We follow DetectGPT settings for generating text from GPT-3
M=davinci
for D in $datasets; do
  echo `date`, Preparing dataset ${D} by sampling from openai/${M} ...
  python scripts/data_builder.py --openai_model $M --openai_key $openai_key --openai_base $openai_base \
                      --dataset $D --n_samples 150 --do_top_p --top_p 0.9 --batch_size 1 \
                      --output_file $data_path/${D}_${M}
done

# We use a temperature of 0.8 for creative writing
for M in gpt-3.5-turbo gpt-4; do
  for D in $datasets; do
    echo `date`, Preparing dataset ${D} by sampling from openai/${M} ...
    python scripts/data_builder.py --openai_model $M --openai_key $openai_key --openai_base $openai_base \
                        --dataset $D --n_samples 150 --do_temperature --temperature 0.8 --batch_size 1 \
                        --output_file $data_path/${D}_${M}
  done
done

# evaluate Fast-DetectGPT in the black-box setting
settings="gpt-j-6B:gpt2-xl gpt-j-6B:gpt-neo-2.7B gpt-j-6B:gpt-j-6B"
for M in $source_models; do
  for D in $datasets; do
    for S in $settings; do
      IFS=':' read -r -a S <<< $S && M1=${S[0]} && M2=${S[1]}
      echo `date`, Evaluating Fast-DetectGPT on ${D}_${M}.${M1}_${M2} ...
      python scripts/fast_detect_gpt.py --reference_model_name $M1 --scoring_model_name $M2 --discrepancy_analytic \
                          --dataset $D --dataset_file $data_path/${D}_${M} --output_file $res_path/${D}_${M}.${M1}_${M2}
    done
  done
done

# evaluate supervised detectors
supervised_models="roberta-base-openai-detector roberta-large-openai-detector"
for M in $source_models; do
  for D in $datasets; do
    for SM in $supervised_models; do
      echo `date`, Evaluating ${SM} on ${D}_${M} ...
      python scripts/supervised.py --model_name $SM --dataset $D \
                          --dataset_file $data_path/${D}_${M} --output_file $res_path/${D}_${M}
    done
  done
done

# evaluate baselines
scoring_models="gpt-neo-2.7B"
for M in $source_models; do
  for D in $datasets; do
    for M2 in $scoring_models; do
      echo `date`, Evaluating baseline methods on ${D}_${M}.${M2} ...
      python scripts/baselines.py --scoring_model_name ${M2} --dataset $D \
                          --dataset_file $data_path/${D}_${M} --output_file $res_path/${D}_${M}.${M2}
    done
  done
done

# evaluate DNA-GPT
scoring_models="gpt-neo-2.7B"
for M in $source_models; do
  for D in $datasets; do
    for M2 in $scoring_models; do
      echo `date`, Evaluating DNA-GPT on ${D}_${M}.${M2} ...
      python scripts/dna_gpt.py --base_model_name ${M2} --dataset $D \
                          --dataset_file $data_path/${D}_${M} --output_file $res_path/${D}_${M}.${M2}
    done
  done
done

# evaluate DetectGPT and DetectLLM
scoring_models="gpt2-xl gpt-neo-2.7B gpt-j-6B"
for M in $source_models; do
  for D in $datasets; do
    M1=t5-11b  # perturbation model
    for M2 in $scoring_models; do
      echo `date`, Evaluating DetectGPT on ${D}_${M}.${M1}_${M2} ...
      python scripts/detect_gpt.py --mask_filling_model_name ${M1} --scoring_model_name ${M2} --n_perturbations 100 --dataset $D \
                          --dataset_file $data_path/${D}_${M} --output_file $res_path/${D}_${M}.${M1}_${M2}
      # we leverage DetectGPT to generate the perturbations
      echo `date`, Evaluating DetectLLM methods on ${D}_${M}.${M1}_${M2} ...
      python scripts/detect_llm.py --scoring_model_name ${M2} --dataset $D \
                          --dataset_file $data_path/${D}_${M}.${M1}.perturbation_100 --output_file $res_path/${D}_${M}.${M1}_${M2}
    done
  done
done

# evaluate GPTZero
for M in $source_models; do
  for D in $datasets; do
    echo `date`, Evaluating GPTZero on ${D}_${M} ...
    python scripts/gptzero.py --dataset $D \
                        --dataset_file $data_path/${D}_${M} --output_file $res_path/${D}_${M}
  done
done
gptzero.py
ADDED
@@ -0,0 +1,84 @@
# Copyright (c) Guangsheng Bao.
#
# This source code is licensed under the MIT license found in the
# LICENSE file in the root directory of this source tree.
import time

import numpy as np
import tqdm
import argparse
import json
from metrics import get_roc_metrics, get_precision_recall_metrics
from data_builder import load_data

def detect_gptzero(args, text):
    import requests
    url = "https://api.gptzero.me/v2/predict/text"
    payload = {
        "document": text,
        "version": "2023-09-14"
    }
    headers = {
        "Accept": "application/json",
        "content-type": "application/json",
        "x-api-key": ""
    }

    while True:
        try:
            time.sleep(600)  # 1 request per 10 minutes for free access
            response = requests.post(url, json=payload, headers=headers)
            return response.json()['documents'][0]['completely_generated_prob']
        except Exception as ex:
            print(ex)

def experiment(args):
    # load data
    data = load_data(args.dataset_file)
    n_samples = len(data["sampled"])
    # evaluate criterion
    name = "gptzero"
    criterion_fn = detect_gptzero

    results = []
    for idx in tqdm.tqdm(range(n_samples), desc=f"Computing {name} criterion"):
        original_text = data["original"][idx]
        sampled_text = data["sampled"][idx]
        original_crit = criterion_fn(args, original_text)
        sampled_crit = criterion_fn(args, sampled_text)
        # result
        results.append({"original": original_text,
                        "original_crit": original_crit,
                        "sampled": sampled_text,
                        "sampled_crit": sampled_crit})

    # compute prediction scores for real/sampled passages
    predictions = {'real': [x["original_crit"] for x in results],
                   'samples': [x["sampled_crit"] for x in results]}
    print(f"Real mean/std: {np.mean(predictions['real']):.2f}/{np.std(predictions['real']):.2f}, Samples mean/std: {np.mean(predictions['samples']):.2f}/{np.std(predictions['samples']):.2f}")
    fpr, tpr, roc_auc = get_roc_metrics(predictions['real'], predictions['samples'])
    p, r, pr_auc = get_precision_recall_metrics(predictions['real'], predictions['samples'])
    print(f"Criterion {name}_threshold ROC AUC: {roc_auc:.4f}, PR AUC: {pr_auc:.4f}")

    # results
    results_file = f'{args.output_file}.{name}.json'
    results = { 'name': f'{name}_threshold',
                'info': {'n_samples': n_samples},
                'predictions': predictions,
                'raw_results': results,
                'metrics': {'roc_auc': roc_auc, 'fpr': fpr, 'tpr': tpr},
                'pr_metrics': {'pr_auc': pr_auc, 'precision': p, 'recall': r},
                'loss': 1 - pr_auc}
    with open(results_file, 'w') as fout:
        json.dump(results, fout)
    print(f'Results written into {results_file}')

if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument('--output_file', type=str, default="./exp_gpt3to4/results/xsum_gpt-4")
    parser.add_argument('--dataset', type=str, default="xsum")
    parser.add_argument('--dataset_file', type=str, default="./exp_gpt3to4/data/xsum_gpt-4")
    args = parser.parse_args()

    experiment(args)
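The fixed ten-minute sleep before every request is the simplest way to stay inside the free-tier rate limit. A hedged alternative, not part of this commit, is to retry with exponential backoff only when the API actually rejects the call; the function name `detect_gptzero_backoff`, the retry count, and the backoff schedule below are assumptions, while the endpoint, payload, and headers follow `detect_gptzero` above.

```python
# Sketch of an alternative polling strategy for the GPTZero endpoint.
import time
import requests

def detect_gptzero_backoff(text, api_key="", max_tries=6):
    url = "https://api.gptzero.me/v2/predict/text"
    payload = {"document": text, "version": "2023-09-14"}
    headers = {"Accept": "application/json", "content-type": "application/json",
               "x-api-key": api_key}
    delay = 10.0
    for _ in range(max_tries):
        try:
            response = requests.post(url, json=payload, headers=headers)
            response.raise_for_status()
            return response.json()['documents'][0]['completely_generated_prob']
        except Exception as ex:
            print(ex)
            time.sleep(delay)
            delay *= 2  # exponential backoff between retries
    raise RuntimeError("GPTZero request failed after retries")
```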
index.html
ADDED
@@ -0,0 +1,106 @@
<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>Fast-DetectGPT</title>
    <style>
        body {
            font-family: Arial, sans-serif;
            margin: 20px;
            background-color: #f9f9f9;
        }
        .container {
            max-width: 700px;
            margin: auto;
            background: #ffffff;
            border-radius: 8px;
            padding: 20px;
            box-shadow: 0 4px 8px rgba(0, 0, 0, 0.2);
        }
        h1 {
            text-align: center;
            color: #333;
        }
        textarea {
            width: 100%;
            height: 150px;
            margin: 15px 0;
            padding: 10px;
            border: 1px solid #ccc;
            border-radius: 5px;
            font-size: 16px;
        }
        button {
            display: block;
            width: 100%;
            padding: 10px;
            background-color: #007bff;
            color: white;
            border: none;
            border-radius: 5px;
            font-size: 16px;
            cursor: pointer;
        }
        button:hover {
            background-color: #0056b3;
        }
        #result {
            margin-top: 20px;
            padding: 15px;
            background-color: #f1f1f1;
            border: 1px solid #ddd;
            border-radius: 5px;
        }
        .error {
            color: red;
        }
    </style>
</head>
<body>
    <div class="container">
        <h1>Fast-DetectGPT</h1>
        <form id="analyzeForm">
            <textarea name="text" placeholder="Enter your text here..." required></textarea>
            <button type="submit">Analyze</button>
        </form>
        <div id="result"></div>
    </div>

    <script>
        document.getElementById('analyzeForm').addEventListener('submit', function (e) {
            e.preventDefault(); // Prevent the form's default submit behaviour.
            const formData = new FormData(this);
            const resultDiv = document.getElementById('result');

            // Clear the previous result first
            resultDiv.textContent = '';

            // Send the POST request
            fetch('/analyze', {
                method: 'POST',
                headers: {
                    'Content-Type': 'application/json',
                },
                body: JSON.stringify({
                    text: formData.get('text'),
                }),
            })
            .then(response => response.json())
            .then(data => {
                if (data.error) {
                    resultDiv.innerHTML = `<p class="error">Error: ${data.error}</p>`;
                } else {
                    resultDiv.innerHTML = `
                        <p><strong>Criterion:</strong> ${data.criterion}</p>
                        <p><strong>Probability of being machine-generated:</strong> ${data.probability_machine_generated}</p>
                    `;
                }
            })
            .catch(err => {
                resultDiv.innerHTML = `<p class="error">An error occurred: ${err.message}</p>`;
            });
        });
    </script>
</body>
</html>
local_infer.py
ADDED
@@ -0,0 +1,94 @@
# Copyright (c) Guangsheng Bao.
#
# This source code is licensed under the MIT license found in the
# LICENSE file in the root directory of this source tree.
import random

import numpy as np
import torch
import os
import glob
import argparse
import json
from scripts.model import load_tokenizer, load_model
from scripts.fast_detect_gpt import get_sampling_discrepancy_analytic


# estimate the probability according to the distribution of our test results on ChatGPT and GPT-4
class ProbEstimator:
    def __init__(self, args):
        self.real_crits = []
        self.fake_crits = []
        for result_file in glob.glob(os.path.join(args.ref_path, '*.json')):
            with open(result_file, 'r') as fin:
                res = json.load(fin)
                self.real_crits.extend(res['predictions']['real'])
                self.fake_crits.extend(res['predictions']['samples'])
        print(f'ProbEstimator: total {len(self.real_crits) * 2} samples.')


    def crit_to_prob(self, crit):
        offset = np.sort(np.abs(np.array(self.real_crits + self.fake_crits) - crit))[100]
        cnt_real = np.sum((np.array(self.real_crits) > crit - offset) & (np.array(self.real_crits) < crit + offset))
        cnt_fake = np.sum((np.array(self.fake_crits) > crit - offset) & (np.array(self.fake_crits) < crit + offset))
        return cnt_fake / (cnt_real + cnt_fake)

# run interactive local inference
def run(args):
    # load model
    scoring_tokenizer = load_tokenizer(args.scoring_model_name, args.dataset, args.cache_dir)
    scoring_model = load_model(args.scoring_model_name, args.device, args.cache_dir)
    scoring_model.eval()
    if args.reference_model_name != args.scoring_model_name:
        reference_tokenizer = load_tokenizer(args.reference_model_name, args.dataset, args.cache_dir)
        reference_model = load_model(args.reference_model_name, args.device, args.cache_dir)
        reference_model.eval()
    # evaluate criterion
    name = "sampling_discrepancy_analytic"
    criterion_fn = get_sampling_discrepancy_analytic
    prob_estimator = ProbEstimator(args)
    # input text
    print('Local demo for Fast-DetectGPT: longer texts give more reliable results.')
    print('')
    while True:
        print("Please enter your text: (Press Enter twice to start processing)")
        lines = []
        while True:
            line = input()
            if len(line) == 0:
                break
            lines.append(line)
        text = "\n".join(lines)
        if len(text) == 0:
            break
        # evaluate text
        tokenized = scoring_tokenizer(text, truncation=True, return_tensors="pt", padding=True, return_token_type_ids=False).to(args.device)
        labels = tokenized.input_ids[:, 1:]
        with torch.no_grad():
            logits_score = scoring_model(**tokenized).logits[:, :-1]
            if args.reference_model_name == args.scoring_model_name:
                logits_ref = logits_score
            else:
                tokenized = reference_tokenizer(text, truncation=True, return_tensors="pt", padding=True, return_token_type_ids=False).to(args.device)
                assert torch.all(tokenized.input_ids[:, 1:] == labels), "Tokenizer mismatch."
                logits_ref = reference_model(**tokenized).logits[:, :-1]
            crit = criterion_fn(logits_ref, logits_score, labels)
        # estimate the probability of machine generated text
        prob = prob_estimator.crit_to_prob(crit)
        print(f'Fast-DetectGPT criterion is {crit:.4f}, suggesting that the text has a probability of {prob * 100:.0f}% to be machine-generated.')
        print()

if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument('--reference_model_name', type=str, default="gpt-neo-2.7B")  # use gpt-j-6B for more accurate detection
    parser.add_argument('--scoring_model_name', type=str, default="gpt-neo-2.7B")
    parser.add_argument('--dataset', type=str, default="xsum")
    parser.add_argument('--ref_path', type=str, default="./local_infer_ref")
    parser.add_argument('--device', type=str, default="cuda")
    parser.add_argument('--cache_dir', type=str, default="../cache")
    args = parser.parse_args()

    run(args)
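The probability reported by the demo is a nearest-neighbour estimate: `crit_to_prob` takes the reference criteria closest to the new value and counts how many of them came from machine-generated text. A toy version of the same idea, with invented reference distributions and a window of 5 neighbours instead of the 100 used above:

```python
# Toy version of ProbEstimator.crit_to_prob with invented reference criteria.
import numpy as np

rng = np.random.default_rng(0)
real_crits = rng.normal(0.0, 1.0, size=500)   # human-written reference scores
fake_crits = rng.normal(3.0, 1.0, size=500)   # machine-generated reference scores
crit = 2.4                                    # score of the new passage

all_crits = np.concatenate([real_crits, fake_crits])
offset = np.sort(np.abs(all_crits - crit))[5]          # distance to the 5th nearest neighbour
cnt_real = np.sum(np.abs(real_crits - crit) < offset)  # human neighbours in the window
cnt_fake = np.sum(np.abs(fake_crits - crit) < offset)  # machine neighbours in the window
print(f"P(machine-generated) ~ {cnt_fake / (cnt_real + cnt_fake):.2f}")
```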
main.sh
ADDED
@@ -0,0 +1,97 @@
#!/usr/bin/env bash
# Copyright (c) Guangsheng Bao.
#
# This source code is licensed under the MIT license found in the
# LICENSE file in the root directory of this source tree.

# setup the environment
echo `date`, Setup the environment ...
set -e  # exit if error

# prepare folders
exp_path=exp_main
data_path=$exp_path/data
res_path=$exp_path/results
mkdir -p $exp_path $data_path $res_path

datasets="xsum squad writing"
source_models="gpt2-xl opt-2.7b gpt-neo-2.7B gpt-j-6B gpt-neox-20b"

# preparing dataset
for D in $datasets; do
  for M in $source_models; do
    echo `date`, Preparing dataset ${D}_${M} ...
    python scripts/data_builder.py --dataset $D --n_samples 500 --base_model_name $M --output_file $data_path/${D}_${M}
  done
done

# White-box Setting
echo `date`, Evaluate models in the white-box setting:

# evaluate Fast-DetectGPT and fast baselines
for D in $datasets; do
  for M in $source_models; do
    echo `date`, Evaluating Fast-DetectGPT on ${D}_${M} ...
    python scripts/fast_detect_gpt.py --reference_model_name $M --scoring_model_name $M --dataset $D \
                        --dataset_file $data_path/${D}_${M} --output_file $res_path/${D}_${M}

    echo `date`, Evaluating baseline methods on ${D}_${M} ...
    python scripts/baselines.py --scoring_model_name $M --dataset $D \
                        --dataset_file $data_path/${D}_${M} --output_file $res_path/${D}_${M}
  done
done

# evaluate DNA-GPT
for D in $datasets; do
  for M in $source_models; do
    echo `date`, Evaluating DNA-GPT on ${D}_${M} ...
    python scripts/dna_gpt.py --base_model_name $M --dataset $D \
                        --dataset_file $data_path/${D}_${M} --output_file $res_path/${D}_${M}
  done
done

# evaluate DetectGPT and its improvement DetectLLM
for D in $datasets; do
  for M in $source_models; do
    echo `date`, Evaluating DetectGPT on ${D}_${M} ...
    python scripts/detect_gpt.py --scoring_model_name $M --mask_filling_model_name t5-3b --n_perturbations 100 --dataset $D \
                        --dataset_file $data_path/${D}_${M} --output_file $res_path/${D}_${M}
    # we leverage DetectGPT to generate the perturbations
    echo `date`, Evaluating DetectLLM methods on ${D}_${M} ...
    python scripts/detect_llm.py --scoring_model_name $M --dataset $D \
                        --dataset_file $data_path/${D}_${M}.t5-3b.perturbation_100 --output_file $res_path/${D}_${M}
  done
done


# Black-box Setting
echo `date`, Evaluate models in the black-box setting:
scoring_models="gpt-neo-2.7B"

# evaluate Fast-DetectGPT
for D in $datasets; do
  for M in $source_models; do
    M1=gpt-j-6B  # sampling model
    for M2 in $scoring_models; do
      echo `date`, Evaluating Fast-DetectGPT on ${D}_${M}.${M1}_${M2} ...
      python scripts/fast_detect_gpt.py --reference_model_name ${M1} --scoring_model_name ${M2} --dataset $D \
                          --dataset_file $data_path/${D}_${M} --output_file $res_path/${D}_${M}.${M1}_${M2}
    done
  done
done

# evaluate DetectGPT and its improvement DetectLLM
for D in $datasets; do
  for M in $source_models; do
    M1=t5-3b  # perturbation model
    for M2 in $scoring_models; do
      echo `date`, Evaluating DetectGPT on ${D}_${M}.${M1}_${M2} ...
      python scripts/detect_gpt.py --mask_filling_model_name ${M1} --scoring_model_name ${M2} --n_perturbations 100 --dataset $D \
                          --dataset_file $data_path/${D}_${M} --output_file $res_path/${D}_${M}.${M1}_${M2}
      # we leverage DetectGPT to generate the perturbations
      echo `date`, Evaluating DetectLLM methods on ${D}_${M}.${M1}_${M2} ...
      python scripts/detect_llm.py --scoring_model_name ${M2} --dataset $D \
                          --dataset_file $data_path/${D}_${M}.${M1}.perturbation_100 --output_file $res_path/${D}_${M}.${M1}_${M2}
    done
  done
done
main_ext.sh
ADDED
@@ -0,0 +1,89 @@
#!/usr/bin/env bash
# Copyright (c) Guangsheng Bao.
#
# This source code is licensed under the MIT license found in the
# LICENSE file in the root directory of this source tree.

# setup the environment
echo `date`, Setup the environment ...
set -e  # exit if error

# prepare folders
exp_path=exp_main_ext
data_path=$exp_path/data
res_path=$exp_path/results
mkdir -p $exp_path $data_path $res_path

datasets="xsum squad writing"
source_models="bloom-7b1 opt-13b llama-13b llama2-13b"

# preparing dataset
for D in $datasets; do
    for M in $source_models; do
        echo `date`, Preparing dataset ${D}_${M} ...
        python scripts/data_builder.py --dataset $D --n_samples 500 --base_model_name $M --output_file $data_path/${D}_${M}
    done
done
exit  # the script stops here after dataset preparation; the evaluation sections below do not run unless this line is removed

# White-box Setting
echo `date`, Evaluate models in the white-box setting:

# evaluate Fast-DetectGPT and fast baselines
for D in $datasets; do
    for M in $source_models; do
        echo `date`, Evaluating Fast-DetectGPT on ${D}_${M} ...
        python scripts/fast_detect_gpt.py --reference_model_name $M --scoring_model_name $M --dataset $D \
            --dataset_file $data_path/${D}_${M} --output_file $res_path/${D}_${M}

        echo `date`, Evaluating baseline methods on ${D}_${M} ...
        python scripts/baselines.py --scoring_model_name $M --dataset $D \
            --dataset_file $data_path/${D}_${M} --output_file $res_path/${D}_${M}
    done
done

# evaluate DetectGPT and its improvement DetectLLM
for D in $datasets; do
    for M in $source_models; do
        echo `date`, Evaluating DetectGPT on ${D}_${M} ...
        python scripts/detect_gpt.py --scoring_model_name $M --mask_filling_model_name t5-3b --n_perturbations 100 --dataset $D \
            --dataset_file $data_path/${D}_${M} --output_file $res_path/${D}_${M}
        # we leverage DetectGPT to generate the perturbations
        echo `date`, Evaluating DetectLLM methods on ${D}_${M} ...
        python scripts/detect_llm.py --scoring_model_name $M --dataset $D \
            --dataset_file $data_path/${D}_${M}.t5-3b.perturbation_100 --output_file $res_path/${D}_${M}
    done
done


# Black-box Setting
echo `date`, Evaluate models in the black-box setting:
scoring_models="gpt-neo-2.7B"

# evaluate Fast-DetectGPT
for D in $datasets; do
    for M in $source_models; do
        M1=gpt-j-6B  # sampling model
        for M2 in $scoring_models; do
            echo `date`, Evaluating Fast-DetectGPT on ${D}_${M}.${M1}_${M2} ...
            python scripts/fast_detect_gpt.py --reference_model_name ${M1} --scoring_model_name ${M2} --dataset $D \
                --dataset_file $data_path/${D}_${M} --output_file $res_path/${D}_${M}.${M1}_${M2}
        done
    done
done

# evaluate DetectGPT and its improvement DetectLLM
for D in $datasets; do
    for M in $source_models; do
        M1=t5-3b  # perturbation model
        for M2 in $scoring_models; do
            echo `date`, Evaluating DetectGPT on ${D}_${M}.${M1}_${M2} ...
            python scripts/detect_gpt.py --mask_filling_model_name ${M1} --scoring_model_name ${M2} --n_perturbations 100 --dataset $D \
                --dataset_file $data_path/${D}_${M} --output_file $res_path/${D}_${M}.${M1}_${M2}
            # we leverage DetectGPT to generate the perturbations
            echo `date`, Evaluating DetectLLM methods on ${D}_${M}.${M1}_${M2} ...
            python scripts/detect_llm.py --scoring_model_name ${M2} --dataset $D \
                --dataset_file $data_path/${D}_${M}.${M1}.perturbation_100 --output_file $res_path/${D}_${M}.${M1}_${M2}
        done
    done
done
metrics.py
ADDED
@@ -0,0 +1,26 @@
# Copyright (c) Guangsheng Bao.
#
# This source code is licensed under the MIT license found in the
# LICENSE file in the root directory of this source tree.

import matplotlib.pyplot as plt
from sklearn.metrics import roc_curve, precision_recall_curve, auc

# 15 colorblind-friendly colors
COLORS = ["#0072B2", "#009E73", "#D55E00", "#CC79A7", "#F0E442",
          "#56B4E9", "#E69F00", "#000000", "#0072B2", "#009E73",
          "#D55E00", "#CC79A7", "#F0E442", "#56B4E9", "#E69F00"]


def get_roc_metrics(real_preds, sample_preds):
    fpr, tpr, _ = roc_curve([0] * len(real_preds) + [1] * len(sample_preds), real_preds + sample_preds)
    roc_auc = auc(fpr, tpr)
    return fpr.tolist(), tpr.tolist(), float(roc_auc)


def get_precision_recall_metrics(real_preds, sample_preds):
    precision, recall, _ = precision_recall_curve([0] * len(real_preds) + [1] * len(sample_preds),
                                                  real_preds + sample_preds)
    pr_auc = auc(recall, precision)
    return precision.tolist(), recall.tolist(), float(pr_auc)
model.py
ADDED
@@ -0,0 +1,79 @@
# Copyright (c) Guangsheng Bao.
#
# This source code is licensed under the MIT license found in the
# LICENSE file in the root directory of this source tree.

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch
import time
import os

def from_pretrained(cls, model_name, kwargs, cache_dir):
    # use local model if it exists
    local_path = os.path.join(cache_dir, 'local.' + model_name.replace("/", "_"))
    if os.path.exists(local_path):
        return cls.from_pretrained(local_path, **kwargs)
    return cls.from_pretrained(model_name, **kwargs, cache_dir=cache_dir)

# predefined models
model_fullnames = {'gpt2': 'gpt2',
                   'gpt2-xl': 'gpt2-xl',
                   'opt-2.7b': 'facebook/opt-2.7b',
                   'gpt-neo-2.7B': 'EleutherAI/gpt-neo-2.7B',
                   'gpt-j-6B': 'EleutherAI/gpt-j-6B',
                   'gpt-neox-20b': 'EleutherAI/gpt-neox-20b',
                   'mgpt': 'sberbank-ai/mGPT',
                   'pubmedgpt': 'stanford-crfm/pubmedgpt',
                   'mt5-xl': 'google/mt5-xl',
                   'llama-13b': 'huggyllama/llama-13b',
                   'llama2-13b': 'TheBloke/Llama-2-13B-fp16',
                   'bloom-7b1': 'bigscience/bloom-7b1',
                   'opt-13b': 'facebook/opt-13b',
                   }
float16_models = ['gpt-j-6B', 'gpt-neox-20b', 'llama-13b', 'llama2-13b', 'bloom-7b1', 'opt-13b']

def get_model_fullname(model_name):
    return model_fullnames[model_name] if model_name in model_fullnames else model_name

def load_model(model_name, device, cache_dir):
    model_fullname = get_model_fullname(model_name)
    print(f'Loading model {model_fullname}...')
    model_kwargs = {}
    if model_name in float16_models:
        model_kwargs.update(dict(torch_dtype=torch.float16))
    if 'gpt-j' in model_name:
        model_kwargs.update(dict(revision='float16'))
    model = from_pretrained(AutoModelForCausalLM, model_fullname, model_kwargs, cache_dir)
    print('Moving model to GPU...', end='', flush=True)
    start = time.time()
    model.to(device)
    print(f'DONE ({time.time() - start:.2f}s)')
    return model

def load_tokenizer(model_name, for_dataset, cache_dir):
    model_fullname = get_model_fullname(model_name)
    optional_tok_kwargs = {}
    if "facebook/opt-" in model_fullname:
        print("Using non-fast tokenizer for OPT")
        optional_tok_kwargs['fast'] = False
    if for_dataset in ['pubmed']:
        optional_tok_kwargs['padding_side'] = 'left'
    else:
        optional_tok_kwargs['padding_side'] = 'right'
    base_tokenizer = from_pretrained(AutoTokenizer, model_fullname, optional_tok_kwargs, cache_dir=cache_dir)
    if base_tokenizer.pad_token_id is None:
        base_tokenizer.pad_token_id = base_tokenizer.eos_token_id
        if '13b' in model_fullname:
            base_tokenizer.pad_token_id = 0
    return base_tokenizer


if __name__ == '__main__':
    import argparse
    parser = argparse.ArgumentParser()
    parser.add_argument('--model_name', type=str, default="bloom-7b1")
    parser.add_argument('--cache_dir', type=str, default="../cache")
    args = parser.parse_args()

    load_tokenizer(args.model_name, 'xsum', args.cache_dir)
    load_model(args.model_name, 'cpu', args.cache_dir)
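Since model.py exposes its own entry point, it can be run on its own to pre-download a model and tokenizer into the cache before launching the longer experiment scripts. A usage sketch (the scripts/ prefix follows the convention of the .sh scripts; in this Space the file is uploaded at the repository root, so adjust the path if needed):

python scripts/model.py --model_name bloom-7b1 --cache_dir ../cache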
paraphrasing.py
ADDED
@@ -0,0 +1,106 @@
import random

import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
import numpy as np
import nltk
from data_builder import load_data, save_data
from model import from_pretrained

class T5Paraphraser:
    def __init__(self, args):
        self.device = args.device
        self.tokenizer = from_pretrained(AutoTokenizer, args.t5_model_name, {}, args.cache_dir)
        self.model = from_pretrained(AutoModelForSeq2SeqLM, args.t5_model_name, {}, args.cache_dir)
        self.model = self.model.to(args.device)
        self.model.eval()

    def paraphrase(self, sents):
        parabatch = ["paraphrase: " + sent + " </s>" for sent in sents]
        encoding = self.tokenizer(parabatch, padding=True, return_tensors="pt")
        input_ids, attention_masks = encoding["input_ids"].to(self.device), encoding["attention_mask"].to(self.device)
        outputs = self.model.generate(
            input_ids=input_ids, attention_mask=attention_masks,
            max_length=256,
            do_sample=True,
            top_k=200,
            top_p=0.95,
            early_stopping=True,
            num_return_sequences=1
        )
        assert len(sents) == len(outputs)
        results = []
        for output, sent in zip(outputs, sents):
            line = self.tokenizer.decode(output, skip_special_tokens=True, clean_up_tokenization_spaces=True)
            line = line.strip()
            line = line if len(line) > 0 else sent
            results.append(line)
        return results

class RandomParaphraser:
    def __init__(self, args):
        self.device = args.device

    def paraphrase(self, sents):
        results = []
        for sent in sents:
            words = sent.split()
            if len(words) > 20:
                idx = random.randint(0, len(words) - 2)
                words[idx], words[idx+1] = words[idx+1], words[idx]
            results.append(' '.join(words))
        return results

def generate_data(args):
    data = load_data(args.dataset_file)
    originals = data['original']
    samples = data['sampled']
    print(f"Total number of samples: {len(samples)}")
    print(f"Average number of words: {np.mean([len(x.split()) for x in samples])}")

    if args.paraphraser == 'random':  # select the paraphraser via the --paraphraser argument defined below
        print('Using random paraphraser.')
        paraphraser = RandomParaphraser(args)
    else:
        print(f'Loading model {args.t5_model_name}...')
        paraphraser = T5Paraphraser(args)

    new_samples = []
    for sample in tqdm(samples):
        lines = sample.split('\n')
        new_lines = []
        for line in lines:
            line = line.strip()
            if len(line) == 0:
                new_lines.append(line)
            else:
                sents = nltk.sent_tokenize(line)
                new_sents = paraphraser.paraphrase(sents)
                new_lines.append(' '.join(new_sents))
        new_samples.append('\n'.join(new_lines))

    new_data = {'original': originals, 'sampled': new_samples}
    save_data(args.output_file, args, new_data)


if __name__ == '__main__':
    import argparse
    from tqdm import tqdm
    parser = argparse.ArgumentParser()
    parser.add_argument('--output_file', type=str, default="./exp_test/results/xsum_gpt2")
    parser.add_argument('--dataset', type=str, default="xsum")
    parser.add_argument('--dataset_file', type=str, default="./exp_test/data/xsum_gpt2")
    parser.add_argument('--t5_model_name', type=str, default="Vamsi/T5_Paraphrase_Paws")
    parser.add_argument('--paraphraser', type=str, default="t5", choices=["t5", "random"])
    parser.add_argument('--seed', type=int, default=0)
    parser.add_argument('--device', type=str, default="cuda")
    parser.add_argument('--cache_dir', type=str, default="../cache")
    args = parser.parse_args()

    torch.manual_seed(args.seed)
    np.random.seed(args.seed)

    nltk.download('punkt')

    generate_data(args)
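Given the argument parser above, a paraphrased (attacked) version of a dataset can be produced with a command along these lines (a sketch; the paths assume a dataset already built by data_builder.py, as in the example defaults):

python scripts/paraphrasing.py --dataset xsum --dataset_file ./exp_test/data/xsum_gpt2 \
    --output_file ./exp_test/results/xsum_gpt2 --paraphraser t5 --device cuda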
report_results.py
ADDED
@@ -0,0 +1,490 @@
# Copyright (c) Guangsheng Bao.
#
# This source code is licensed under the MIT license found in the
# LICENSE file in the root directory of this source tree.
import os.path
import argparse
import json
import numpy as np


def save_lines(lines, file):
    with open(file, 'w') as fout:
        fout.write('\n'.join(lines))

def get_auroc(result_file):
    with open(result_file, 'r') as fin:
        res = json.load(fin)
        return res['metrics']['roc_auc']

def get_fpr_tpr(result_file):
    with open(result_file, 'r') as fin:
        res = json.load(fin)
        return res['metrics']['fpr'], res['metrics']['tpr']

def report_main_results(args):
    datasets = {'xsum': 'XSum', 'squad': 'SQuAD', 'writing': 'WritingPrompts'}
    source_models = {'gpt2-xl': 'GPT-2', 'opt-2.7b': 'OPT-2.7', 'gpt-neo-2.7B': 'Neo-2.7',
                     'gpt-j-6B': 'GPT-J', 'gpt-neox-20b': 'NeoX'}
    methods1 = {'likelihood': 'Likelihood', 'entropy': 'Entropy', 'logrank': 'LogRank',
                'lrr': 'LRR', 'npr': 'NPR'}
    methods2 = {'perturbation_100': 'DetectGPT', 'sampling_discrepancy': 'Fast-DetectGPT'}

    def _get_method_aurocs(dataset, method, filter=''):
        cols = []
        for model in source_models:
            result_file = f'{args.result_path}/{dataset}_{model}{filter}.{method}.json'
            auroc = get_auroc(result_file) if os.path.exists(result_file) else 0.0
            cols.append(auroc)
        cols.append(np.mean(cols))
        return cols

    headers = ['Method'] + [source_models[model] for model in source_models] + ['Avg.']
    for dataset in datasets:
        print('----')
        print(datasets[dataset])
        print('----')
        print(' '.join(headers))
        # basic methods
        for method in methods1:
            cols = _get_method_aurocs(dataset, method)
            print(methods1[method], ' '.join(f'{col:.4f}' for col in cols))
        # white-box comparison
        results = {}
        for method in methods2:
            cols = _get_method_aurocs(dataset, method)
            results[methods2[method]] = cols
            print(methods2[method], ' '.join(f'{col:.4f}' for col in cols))
        diff = np.array(results['Fast-DetectGPT']) - np.array(results['DetectGPT'])
        print('(Diff)', ' '.join(f'{col:.4f}' for col in diff))
        # black-box comparison
        filters = {'perturbation_100': '.t5-3b_gpt-neo-2.7B',
                   'sampling_discrepancy': '.gpt-j-6B_gpt-neo-2.7B'}
        results = {}
        for method in methods2:
            cols = _get_method_aurocs(dataset, method, filters[method])
            results[methods2[method]] = cols
            print(methods2[method], ' '.join(f'{col:.4f}' for col in cols))
        diff = np.array(results['Fast-DetectGPT']) - np.array(results['DetectGPT'])
        print('(Diff)', ' '.join(f'{col:.4f}' for col in diff))

def report_main_ext_results(args):
    datasets = {'xsum': 'XSum', 'squad': 'SQuAD', 'writing': 'WritingPrompts'}
    source_models = {'bloom-7b1': 'BLOOM-7.1', 'opt-13b': 'OPT-13',
                     'llama-13b': 'Llama-13', 'llama2-13b': 'Llama2-13'}
    methods1 = {'likelihood': 'Likelihood', 'entropy': 'Entropy', 'logrank': 'LogRank',
                'lrr': 'LRR', 'npr': 'NPR'}
    methods2 = {'perturbation_100': 'DetectGPT', 'sampling_discrepancy': 'Fast-DetectGPT'}

    def _get_method_aurocs(dataset, method, filter=''):
        cols = []
        for model in source_models:
            result_file = f'{args.result_path}/{dataset}_{model}{filter}.{method}.json'
            auroc = get_auroc(result_file) if os.path.exists(result_file) else 0.0
            cols.append(auroc)
        cols.append(np.mean(cols))
        return cols

    headers = ['Method'] + [source_models[model] for model in source_models] + ['Avg.']
    for dataset in datasets:
        print('----')
        print(datasets[dataset])
        print('----')
        print(' '.join(headers))
        # basic methods
        for method in methods1:
            cols = _get_method_aurocs(dataset, method)
            print(methods1[method], ' '.join(f'{col:.4f}' for col in cols))
        # white-box comparison
        results = {}
        for method in methods2:
            cols = _get_method_aurocs(dataset, method)
            results[methods2[method]] = cols
            print(methods2[method], ' '.join(f'{col:.4f}' for col in cols))
        diff = np.array(results['Fast-DetectGPT']) - np.array(results['DetectGPT'])
        print('(Diff)', ' '.join(f'{col:.4f}' for col in diff))
        # black-box comparison
        filters = {'perturbation_100': '.t5-3b_gpt-neo-2.7B',
                   'sampling_discrepancy': '.gpt-j-6B_gpt-neo-2.7B'}
        results = {}
        for method in methods2:
            cols = _get_method_aurocs(dataset, method, filters[method])
            results[methods2[method]] = cols
            print(methods2[method], ' '.join(f'{col:.4f}' for col in cols))
        diff = np.array(results['Fast-DetectGPT']) - np.array(results['DetectGPT'])
        print('(Diff)', ' '.join(f'{col:.4f}' for col in diff))

def report_refmodel_results(args):
    datasets = {'xsum': 'XSum', 'squad': 'SQuAD', 'writing': 'WritingPrompts'}
    source_models = {'gpt2-xl': 'GPT-2', 'gpt-neo-2.7B': 'Neo-2.7', 'gpt-j-6B': 'GPT-J'}

    def _get_method_aurocs(method, ref_model=None):
        cols = []
        for dataset in datasets:
            for model in source_models:
                filter = '' if ref_model is None or ref_model == model else f'.{ref_model}_{model}'
                result_file = f'{args.result_path}/{dataset}_{model}{filter}.{method}.json'
                auroc = get_auroc(result_file) if os.path.exists(result_file) else 0.0
                cols.append(auroc)
        cols.append(np.mean(cols))
        return cols

    headers1 = ['----'] + [datasets[d] for d in datasets]
    headers2 = ['Method'] + [source_models[model] for model in source_models] * 3 + ['Avg.']
    print(' '.join(headers1))
    print(' '.join(headers2))

    ref_models = [None, 'gpt2-xl', 'gpt-neo-2.7B', 'gpt-j-6B']
    for ref_model in ref_models:
        method = 'sampling_discrepancy'
        method_name = 'Fast-DetectGPT (*/*)' if ref_model is None else f'Fast-DetectGPT ({source_models[ref_model]}/*)'
        cols = _get_method_aurocs(method, ref_model)
        print(method_name, ' '.join(f'{col:.4f}' for col in cols))


def report_chatgpt_gpt4_results(args):
    datasets = {'xsum': 'XSum', 'writing': 'Writing', 'pubmed': 'PubMed'}
    source_models = {'gpt-3.5-turbo': 'ChatGPT', 'gpt-4': 'GPT-4'}
    score_models = {'t5-11b': 'T5-11B', 'gpt2-xl': 'GPT-2', 'opt-2.7b': 'OPT-2.7',
                    'gpt-neo-2.7B': 'Neo-2.7', 'gpt-j-6B': 'GPT-J', 'gpt-neox-20b': 'NeoX'}
    methods1 = {'roberta-base-openai-detector': 'RoBERTa-base',
                'roberta-large-openai-detector': 'RoBERTa-large'}
    methods2 = {'likelihood': 'Likelihood', 'entropy': 'Entropy', 'logrank': 'LogRank'}
    methods3 = {'lrr': 'LRR', 'npr': 'NPR', 'perturbation_100': 'DetectGPT',
                'sampling_discrepancy_analytic': 'Fast'}

    def _get_method_aurocs(method, filter=''):
        results = []
        for model in source_models:
            cols = []
            for dataset in datasets:
                result_file = f'{args.result_path}/{dataset}_{model}{filter}.{method}.json'
                auroc = get_auroc(result_file) if os.path.exists(result_file) else 0.0
                cols.append(auroc)
            cols.append(np.mean(cols))
            results.extend(cols)
        return results

    headers1 = ['--'] + [source_models[model] for model in source_models]
    headers2 = ['Method'] + ([datasets[dataset] for dataset in datasets] + ['Avg.']) * 2
    print(' '.join(headers1))
    print(' '.join(headers2))
    # supervised methods
    for method in methods1:
        cols = _get_method_aurocs(method)
        print(methods1[method], ' '.join(f'{col:.4f}' for col in cols))
    # zero-shot methods
    filters2 = {'likelihood': ['.gpt2-xl', '.gpt-neo-2.7B', '.gpt-j-6B', '.gpt-neox-20b'],
                'entropy': ['.gpt2-xl', '.gpt-neo-2.7B', '.gpt-j-6B', '.gpt-neox-20b'],
                'logrank': ['.gpt2-xl', '.gpt-neo-2.7B', '.gpt-j-6B', '.gpt-neox-20b']}
    filters3 = {'lrr': ['.t5-11b_gpt2-xl', '.t5-11b_gpt-neo-2.7B', '.t5-11b_gpt-j-6B', '.t5-11b_gpt-neox-20b'],
                'npr': ['.t5-11b_gpt2-xl', '.t5-11b_gpt-neo-2.7B', '.t5-11b_gpt-j-6B', '.t5-11b_gpt-neox-20b'],
                'perturbation_100': ['.t5-11b_gpt2-xl', '.t5-11b_gpt-neo-2.7B', '.t5-11b_gpt-j-6B', '.t5-11b_gpt-neox-20b'],
                'sampling_discrepancy_analytic': ['.gpt-j-6B_gpt2-xl', '.gpt-j-6B_gpt-neo-2.7B', '.gpt-j-6B_gpt-j-6B', '.gpt-neox-20b_gpt-neox-20b']}
    for method in methods2:
        for filter in filters2[method]:
            setting = score_models[filter[1:]]
            method_name = f'{methods2[method]}({setting})'
            cols = _get_method_aurocs(method, filter)
            print(method_name, ' '.join(f'{col:.4f}' for col in cols))
    for method in methods3:
        for filter in filters3[method]:
            setting = [score_models[model] for model in filter[1:].split('_')]
            method_name = f'{methods3[method]}({setting[0]}/{setting[1]})'
            cols = _get_method_aurocs(method, filter)
            print(method_name, ' '.join(f'{col:.4f}' for col in cols))

def report_gpt3_results(args):
    datasets = {'xsum': 'XSum', 'writing': 'Writing', 'pubmed': 'PubMed'}
    source_models = {'davinci': 'GPT-3'}
    score_models = {'t5-11b': 'T5-11B', 'gpt2-xl': 'GPT-2', 'opt-2.7b': 'OPT-2.7',
                    'gpt-neo-2.7B': 'Neo-2.7', 'gpt-j-6B': 'GPT-J', 'gpt-neox-20b': 'NeoX'}
    methods1 = {'roberta-base-openai-detector': 'RoBERTa-base',
                'roberta-large-openai-detector': 'RoBERTa-large'}
    methods2 = {'likelihood': 'Likelihood', 'entropy': 'Entropy', 'logrank': 'LogRank'}
    methods3 = {'lrr': 'LRR', 'npr': 'NPR', 'perturbation_100': 'DetectGPT',
                'sampling_discrepancy_analytic': 'Fast'}

    def _get_method_aurocs(method, filter=''):
        results = []
        for model in source_models:
            cols = []
            for dataset in datasets:
                result_file = f'{args.result_path}/{dataset}_{model}{filter}.{method}.json'
                auroc = get_auroc(result_file) if os.path.exists(result_file) else 0.0
                cols.append(auroc)
            cols.append(np.mean(cols))
            results.extend(cols)
        return results

    headers1 = ['--'] + [source_models[model] for model in source_models]
    headers2 = ['Method'] + ([datasets[dataset] for dataset in datasets] + ['Avg.']) * 2
    print(' '.join(headers1))
    print(' '.join(headers2))
    # supervised methods
    for method in methods1:
        cols = _get_method_aurocs(method)
        print(methods1[method], ' '.join(f'{col:.4f}' for col in cols))
    # zero-shot methods
    filters2 = {'likelihood': ['.gpt2-xl', '.gpt-neo-2.7B', '.gpt-j-6B', '.gpt-neox-20b'],
                'entropy': ['.gpt2-xl', '.gpt-neo-2.7B', '.gpt-j-6B', '.gpt-neox-20b'],
                'logrank': ['.gpt2-xl', '.gpt-neo-2.7B', '.gpt-j-6B', '.gpt-neox-20b']}
    filters3 = {'lrr': ['.t5-11b_gpt2-xl', '.t5-11b_gpt-neo-2.7B', '.t5-11b_gpt-j-6B', '.t5-11b_gpt-neox-20b'],
                'npr': ['.t5-11b_gpt2-xl', '.t5-11b_gpt-neo-2.7B', '.t5-11b_gpt-j-6B', '.t5-11b_gpt-neox-20b'],
                'perturbation_100': ['.t5-11b_gpt2-xl', '.t5-11b_gpt-neo-2.7B', '.t5-11b_gpt-j-6B', '.t5-11b_gpt-neox-20b'],
                'sampling_discrepancy_analytic': ['.gpt-j-6B_gpt2-xl', '.gpt-j-6B_gpt-neo-2.7B', '.gpt-j-6B_gpt-j-6B', '.gpt-neox-20b_gpt-neox-20b']}
    for method in methods2:
        for filter in filters2[method]:
            setting = score_models[filter[1:]]
            method_name = f'{methods2[method]}({setting})'
            cols = _get_method_aurocs(method, filter)
            print(method_name, ' '.join(f'{col:.4f}' for col in cols))
    for method in methods3:
        for filter in filters3[method]:
            setting = [score_models[model] for model in filter[1:].split('_')]
            method_name = f'{methods3[method]}({setting[0]}/{setting[1]})'
            cols = _get_method_aurocs(method, filter)
            print(method_name, ' '.join(f'{col:.4f}' for col in cols))

def report_maxlen_trends(args):
    datasets = {'xsum': 'XSum', 'writing': 'WritingPrompts'}
    source_models = {'gpt-3.5-turbo': 'ChatGPT', 'gpt-4': 'GPT-4'}
    score_models = {'t5-11b': 'T5-11B', 'gpt2-xl': 'GPT-2', 'opt-2.7b': 'OPT-2.7',
                    'gpt-neo-2.7B': 'Neo-2.7', 'gpt-j-6B': 'GPT-J', 'gpt-neox-20b': 'NeoX'}
    methods1 = {'roberta-base-openai-detector': 'RoBERTa-base',
                'roberta-large-openai-detector': 'RoBERTa-large'}
    methods2 = {'likelihood': 'Likelihood'}
    methods3 = {'perturbation_100': 'DetectGPT', 'sampling_discrepancy_analytic': 'Fast-Detect'}
    maxlens = [30, 60, 90, 120, 150, 180]

    def _get_method_aurocs(root_path, dataset, source_model, method, filter=''):
        cols = []
        for maxlen in maxlens:
            result_file = f'{root_path}/exp_maxlen{maxlen}/results/{dataset}_{source_model}{filter}.{method}.json'
            auroc = get_auroc(result_file) if os.path.exists(result_file) else 0.0
            cols.append(auroc)
        return cols

    filters2 = {'likelihood': '.gpt-neo-2.7B'}
    filters3 = {'perturbation_100': '.t5-11b_gpt-neo-2.7B',
                'sampling_discrepancy_analytic': '.gpt-j-6B_gpt-neo-2.7B'}

    headers = ['Method'] + [str(maxlen) for maxlen in maxlens]
    print(' '.join(headers))
    # print table per model and dataset
    results = {}
    for model in source_models:
        model_name = source_models[model]
        for data in datasets:
            data_name = datasets[data]
            print('----')
            print(f'{model_name} / {data_name}')
            print('----')
            for method in methods1:
                method_name = methods1[method]
                cols = _get_method_aurocs('.', data, model, method)
                results[f'{model_name}_{data_name}_{method_name}'] = cols
                print(method_name, ' '.join(f'{col:.4f}' for col in cols))
            for method in methods2:
                filter = filters2[method]
                setting = score_models[filter[1:]]
                method_name = f'{methods2[method]}({setting})'
                cols = _get_method_aurocs('.', data, model, method, filter)
                results[f'{model_name}_{data_name}_{method_name}'] = cols
                print(method_name, ' '.join(f'{col:.4f}' for col in cols))
            for method in methods3:
                filter = filters3[method]
                setting = [score_models[model] for model in filter[1:].split('_')]
                method_name = f'{methods3[method]}({setting[0]}/{setting[1]})'
                cols = _get_method_aurocs('.', data, model, method, filter)
                results[f'{model_name}_{data_name}_{method_name}'] = cols
                print(method_name, ' '.join(f'{col:.4f}' for col in cols))
    json_file = './exp_analysis/maxlen_trends.json'
    with open(json_file, 'w') as fout:
        json.dump(results, fout)
    print(f'Write to file {json_file}')

def report_auroc_curve(args):
    datasets = {'xsum': 'XSum', 'writing': 'WritingPrompts'}
    source_models = {'gpt-3.5-turbo': 'ChatGPT', 'gpt-4': 'GPT-4'}
    score_models = {'t5-11b': 'T5-11B', 'gpt2-xl': 'GPT-2', 'opt-2.7b': 'OPT-2.7',
                    'gpt-neo-2.7B': 'Neo-2.7', 'gpt-j-6B': 'GPT-J', 'gpt-neox-20b': 'NeoX'}
    methods1 = {'roberta-base-openai-detector': 'RoBERTa-base',
                'roberta-large-openai-detector': 'RoBERTa-large'}
    methods2 = {'likelihood': 'Likelihood'}
    methods3 = {'perturbation_100': 'DetectGPT', 'sampling_discrepancy_analytic': 'Fast-Detect'}

    def _get_method_fpr_tpr(root_path, dataset, source_model, method, filter=''):
        maxlen = 180
        result_file = f'{root_path}/exp_maxlen{maxlen}/results/{dataset}_{source_model}{filter}.{method}.json'
        if os.path.exists(result_file):
            fpr, tpr = get_fpr_tpr(result_file)
        else:
            fpr, tpr = [], []
        assert len(fpr) == len(tpr)
        return list(zip(fpr, tpr))

    filters2 = {'likelihood': '.gpt-neo-2.7B'}
    filters3 = {'perturbation_100': '.t5-11b_gpt-neo-2.7B',
                'sampling_discrepancy_analytic': '.gpt-j-6B_gpt-neo-2.7B'}

    # print table per model and dataset
    results = {}
    for model in source_models:
        model_name = source_models[model]
        for data in datasets:
            data_name = datasets[data]
            print('----')
            print(f'{model_name} / {data_name}')
            print('----')
            for method in methods1:
                method_name = methods1[method]
                cols = _get_method_fpr_tpr('.', data, model, method)
                results[f'{model_name}_{data_name}_{method_name}'] = cols
                print(method_name, ' '.join(f'({c[0]:.3f},{c[1]:.3f})' for c in cols))
            for method in methods2:
                filter = filters2[method]
                setting = score_models[filter[1:]]
                method_name = f'{methods2[method]}({setting})'
                cols = _get_method_fpr_tpr('.', data, model, method, filter)
                results[f'{model_name}_{data_name}_{method_name}'] = cols
                print(method_name, ' '.join(f'({c[0]:.3f},{c[1]:.3f})' for c in cols))
            for method in methods3:
                filter = filters3[method]
                setting = [score_models[model] for model in filter[1:].split('_')]
                method_name = f'{methods3[method]}({setting[0]}/{setting[1]})'
                cols = _get_method_fpr_tpr('.', data, model, method, filter)
                results[f'{model_name}_{data_name}_{method_name}'] = cols
                print(method_name, ' '.join(f'({c[0]:.3f},{c[1]:.3f})' for c in cols))
    json_file = './exp_analysis/auroc_curve.json'
    with open(json_file, 'w') as fout:
        json.dump(results, fout)
    print(f'Write to file {json_file}')

if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument('--result_path', type=str, default="./exp_main/results/")
    parser.add_argument('--report_name', type=str, default="main_results")
    args = parser.parse_args()

    if args.report_name == 'main_results':
        report_main_results(args)
    elif args.report_name == 'main_ext_results':
        report_main_ext_results(args)
    elif args.report_name == 'chatgpt_gpt4_results':
        report_chatgpt_gpt4_results(args)
    elif args.report_name == 'gpt3_results':
        report_gpt3_results(args)
    elif args.report_name == 'maxlen_trends':
        report_maxlen_trends(args)
    elif args.report_name == 'auroc_curve':
        report_auroc_curve(args)
    elif args.report_name == 'refmodel_results':
        report_refmodel_results(args)
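Each report_* function above only reads the JSON result files written by the detection scripts, so reports can be regenerated at any time after an experiment finishes. A usage sketch, assuming the result folders produced by main.sh and main_ext.sh:

python scripts/report_results.py --report_name main_results --result_path ./exp_main/results/
python scripts/report_results.py --report_name main_ext_results --result_path ./exp_main_ext/results/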
requirements.txt
CHANGED
@@ -1,3 +1,8 @@
torch
numpy
transformers==4.28.1
datasets==2.12.0
matplotlib
tqdm
openai
nltk
setup.sh
ADDED
@@ -0,0 +1 @@
pip install -r requirements.txt
show_result.py
ADDED
@@ -0,0 +1,51 @@
# Copyright (c) Guangsheng Bao.
#
# This source code is licensed under the MIT license found in the
# LICENSE file in the root directory of this source tree.

import matplotlib
import matplotlib.pyplot as plt
import argparse
import glob
import json
from os import path

import numpy as np

matplotlib.use('Agg')

# plot histogram of sampled on left, and original on right
def save_histogram(predictions, figure_file):
    plt.figure(figsize=(4, 2.5))
    plt.subplot(1, 1, 1)
    plt.hist(predictions["samples"], alpha=0.5, bins='auto', label='Model')
    plt.hist(predictions["real"], alpha=0.5, bins='auto', label='Human')
    plt.xlabel("Sampling Discrepancy")
    plt.ylabel('Frequency')
    plt.legend(loc='upper right')
    plt.tight_layout()
    plt.savefig(figure_file)

if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument('--result_files', type=str, default="./exp_test/results/*.json")
    parser.add_argument('--draw', action='store_true')
    args = parser.parse_args()

    for res_file in glob.glob(args.result_files, recursive=True):
        with open(res_file, 'r') as fin:
            res = json.load(fin)
            if 'metrics' in res:
                n_samples = res['info']['n_samples']
                roc_auc = res['metrics']['roc_auc']
                real = res['predictions']['real']
                samples = res['predictions']['samples']
                print(f"{res_file}: roc_auc={roc_auc:.4f} n_samples={n_samples} r:{np.mean(real):.2f}/{np.std(real):.2f} s:{np.mean(samples):.2f}/{np.std(samples):.2f}")
            else:
                print(f"{res_file}: metrics not found.")
            # draw histogram
            if args.draw:
                fig_file = f"{res_file}.pdf"
                save_histogram(res['predictions'], fig_file)
                print(f"{fig_file}: histogram figure saved.")
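A quick way to inspect a finished experiment is to point show_result.py at the result JSONs; with --draw it also writes one histogram PDF per result file. A usage sketch (the glob pattern is an example, matching the folder layout used by the experiment scripts):

python scripts/show_result.py --result_files "./exp_main/results/*.json" --draw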
supervised.py
ADDED
@@ -0,0 +1,78 @@
# Copyright (c) Guangsheng Bao.
#
# This source code is licensed under the MIT license found in the
# LICENSE file in the root directory of this source tree.

import numpy as np
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer
import tqdm
import argparse
import json
from data_builder import load_data
from metrics import get_roc_metrics, get_precision_recall_metrics
from model import from_pretrained

def experiment(args):
    # load model
    print(f'Beginning supervised evaluation with {args.model_name}...')
    detector = from_pretrained(AutoModelForSequenceClassification, args.model_name, {}, args.cache_dir).to(args.device)
    tokenizer = from_pretrained(AutoTokenizer, args.model_name, {}, args.cache_dir)
    detector.eval()
    # load data
    data = load_data(args.dataset_file)
    n_samples = len(data["sampled"])
    # eval detector
    name = args.model_name
    torch.manual_seed(args.seed)
    np.random.seed(args.seed)
    eval_results = []
    for idx in tqdm.tqdm(range(n_samples), desc=f"Computing {name} criterion"):
        original_text = data["original"][idx]
        sampled_text = data["sampled"][idx]
        # original text
        tokenized = tokenizer(original_text, padding=True, truncation=True, max_length=512, return_tensors="pt").to(args.device)
        with torch.no_grad():
            original_crit = detector(**tokenized).logits.softmax(-1)[0, 0].item()
        # sampled text
        tokenized = tokenizer(sampled_text, padding=True, truncation=True, max_length=512, return_tensors="pt").to(args.device)
        with torch.no_grad():
            sampled_crit = detector(**tokenized).logits.softmax(-1)[0, 0].item()
        # result
        eval_results.append({"original": original_text,
                             "original_crit": original_crit,
                             "sampled": sampled_text,
                             "sampled_crit": sampled_crit})

    # compute prediction scores for real/sampled passages
    predictions = {'real': [x["original_crit"] for x in eval_results],
                   'samples': [x["sampled_crit"] for x in eval_results]}
    fpr, tpr, roc_auc = get_roc_metrics(predictions['real'], predictions['samples'])
    p, r, pr_auc = get_precision_recall_metrics(predictions['real'], predictions['samples'])
    print(f"Criterion {name}_threshold ROC AUC: {roc_auc:.4f}, PR AUC: {pr_auc:.4f}")
    # log results
    results_file = f'{args.output_file}.{name}.json'
    results = {'name': f'{name}_threshold',
               'info': {'n_samples': n_samples},
               'predictions': predictions,
               'raw_results': eval_results,
               'metrics': {'roc_auc': roc_auc, 'fpr': fpr, 'tpr': tpr},
               'pr_metrics': {'pr_auc': pr_auc, 'precision': p, 'recall': r},
               'loss': 1 - pr_auc}
    with open(results_file, 'w') as fout:
        json.dump(results, fout)
        print(f'Results written into {results_file}')


if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument('--output_file', type=str, default="./exp_test/results/xsum_gpt2")
    parser.add_argument('--dataset', type=str, default="xsum")
    parser.add_argument('--dataset_file', type=str, default="./exp_test/data/xsum_gpt2")
    parser.add_argument('--model_name', type=str, default="roberta-base-openai-detector")
    parser.add_argument('--seed', type=int, default=0)
    parser.add_argument('--device', type=str, default="cuda")
    parser.add_argument('--cache_dir', type=str, default="../cache")
    args = parser.parse_args()

    experiment(args)
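supervised.py scores each passage with the class probability assigned by an off-the-shelf detector, so a single dataset file can also be evaluated directly, outside the supervised.sh loop. A sketch with paths following the defaults above:

python scripts/supervised.py --model_name roberta-large-openai-detector --dataset xsum \
    --dataset_file ./exp_test/data/xsum_gpt2 --output_file ./exp_test/results/xsum_gpt2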
supervised.sh
ADDED
@@ -0,0 +1,56 @@
#!/usr/bin/env bash
# Copyright (c) Guangsheng Bao.
#
# This source code is licensed under the MIT license found in the
# LICENSE file in the root directory of this source tree.

# setup the environment
echo `date`, Setup the environment ...
set -e  # exit if error

# prepare folders
exp_path=exp_supervised
data_path=$exp_path/data
res_path=$exp_path/results
mkdir -p $exp_path $data_path $res_path

# preparing dataset
for P in "english:mgpt" "german:mgpt" "pubmed:pubmedgpt" "xsum:gpt2-xl"; do
    IFS=':' read -r -a P <<< $P && D=${P[0]} && M=${P[1]}
    echo `date`, Preparing dataset ${D}-${M} ...
    python scripts/data_builder.py --dataset $D --n_samples 200 --base_model_name $M --output_file $data_path/${D}_${M}
done

# evaluate baselines
for P in "english:mgpt" "german:mgpt" "pubmed:pubmedgpt" "xsum:gpt2-xl"; do
    IFS=':' read -r -a P <<< $P && D=${P[0]} && M=${P[1]}
    echo `date`, Evaluating baseline methods on ${D}_${M} ...
    python scripts/baselines.py --scoring_model_name $M --dataset $D \
        --dataset_file $data_path/${D}_${M} --output_file $res_path/${D}_${M}
done

# evaluate supervised detectors
for P in "english:mgpt" "german:mgpt" "pubmed:pubmedgpt" "xsum:gpt2-xl"; do
    IFS=':' read -r -a P <<< $P && D=${P[0]} && M=${P[1]}
    for SM in roberta-base-openai-detector roberta-large-openai-detector; do
        echo `date`, Evaluating ${SM} on ${D}_${M} ...
        python scripts/supervised.py --model_name $SM --dataset $D \
            --dataset_file $data_path/${D}_${M} --output_file $res_path/${D}_${M}
    done
done

# evaluate DetectGPT
for P in "english:mgpt:mt5-xl" "german:mgpt:mt5-xl" "pubmed:pubmedgpt:t5-11b" "xsum:gpt2-xl:t5-11b"; do
    IFS=':' read -r -a P <<< $P && D=${P[0]} && M1=${P[1]} && M2=${P[2]}
    echo `date`, Evaluating DetectGPT on ${D}_${M1}_${M2} ...
    python scripts/detect_gpt.py --scoring_model_name $M1 --mask_filling_model_name $M2 --n_perturbations 100 --dataset $D \
        --dataset_file $data_path/${D}_${M1} --output_file $res_path/${D}_${M1}_${M2}
done

# evaluate Fast-DetectGPT
for P in "english:mgpt" "german:mgpt" "pubmed:pubmedgpt" "xsum:gpt2-xl"; do
    IFS=':' read -r -a P <<< $P && D=${P[0]} && M=${P[1]}
    echo `date`, Evaluating Fast-DetectGPT on ${D}-${M} ...
    python scripts/fast_detect_gpt.py --reference_model_name $M --scoring_model_name $M \
        --dataset $D --dataset_file $data_path/${D}_${M} --output_file $res_path/${D}_${M}
done
temperature.sh
ADDED
@@ -0,0 +1,88 @@
#!/usr/bin/env bash
# Copyright (c) Guangsheng Bao.
#
# This source code is licensed under the MIT license found in the
# LICENSE file in the root directory of this source tree.

# setup the environment
echo `date`, Setup the environment ...
set -e  # exit if error

# prepare folders
exp_path=exp_temperature
data_path=$exp_path/data
res_path=$exp_path/results
mkdir -p $exp_path $data_path $res_path

datasets="xsum squad writing"
source_models="gpt2-xl opt-2.7b gpt-neo-2.7B gpt-j-6B gpt-neox-20b"

# preparing dataset
for D in $datasets; do
    for M in $source_models; do
        echo `date`, Preparing dataset ${D}-${M} ...
        python scripts/data_builder.py --dataset $D --n_samples 500 --do_temperature --base_model_name $M --output_file $data_path/${D}_${M}
    done
done

# White-box Setting
echo `date`, Evaluate models in the white-box setting:

# evaluate Fast-DetectGPT and fast baselines
for D in $datasets; do
    for M in $source_models; do
        echo `date`, Evaluating Fast-DetectGPT on ${D}_${M} ...
        python scripts/fast_detect_gpt.py --reference_model_name $M --scoring_model_name $M --dataset $D \
            --dataset_file $data_path/${D}_${M} --output_file $res_path/${D}_${M}

        echo `date`, Evaluating baseline methods on ${D}_${M} ...
        python scripts/baselines.py --scoring_model_name $M --dataset $D \
            --dataset_file $data_path/${D}_${M} --output_file $res_path/${D}_${M}
    done
done

# evaluate DetectGPT and its improvement DetectLLM
for D in $datasets; do
    for M in $source_models; do
        echo `date`, Evaluating DetectGPT on ${D}_${M} ...
        python scripts/detect_gpt.py --scoring_model_name $M --mask_filling_model_name t5-3b --n_perturbations 100 --dataset $D \
            --dataset_file $data_path/${D}_${M} --output_file $res_path/${D}_${M}
        # we leverage DetectGPT to generate the perturbations
        echo `date`, Evaluating DetectLLM methods on ${D}_${M} ...
        python scripts/detect_llm.py --scoring_model_name $M --dataset $D \
            --dataset_file $data_path/${D}_${M}.t5-3b.perturbation_100 --output_file $res_path/${D}_${M}
    done
done


# Black-box Setting
echo `date`, Evaluate models in the black-box setting:
scoring_models="gpt-neo-2.7B"

# evaluate Fast-DetectGPT
for D in $datasets; do
    for M in $source_models; do
        M1=gpt-j-6B  # sampling model
        for M2 in $scoring_models; do
            echo `date`, Evaluating Fast-DetectGPT on ${D}_${M}.${M1}_${M2} ...
            python scripts/fast_detect_gpt.py --reference_model_name ${M1} --scoring_model_name ${M2} --dataset $D \
                --dataset_file $data_path/${D}_${M} --output_file $res_path/${D}_${M}.${M1}_${M2}
        done
    done
done

# evaluate DetectGPT and its improvement DetectLLM
for D in $datasets; do
    for M in $source_models; do
        M1=t5-3b  # perturbation model
        for M2 in $scoring_models; do
            echo `date`, Evaluating DetectGPT on ${D}_${M}.${M1}_${M2} ...
            python scripts/detect_gpt.py --mask_filling_model_name ${M1} --scoring_model_name ${M2} --n_perturbations 100 --dataset $D \
                --dataset_file $data_path/${D}_${M} --output_file $res_path/${D}_${M}.${M1}_${M2}
            # we leverage DetectGPT to generate the perturbations
            echo `date`, Evaluating DetectLLM methods on ${D}_${M}.${M1}_${M2} ...
            python scripts/detect_llm.py --scoring_model_name ${M2} --dataset $D \
                --dataset_file $data_path/${D}_${M}.${M1}.perturbation_100 --output_file $res_path/${D}_${M}.${M1}_${M2}
        done
    done
done
topk.sh
ADDED
@@ -0,0 +1,88 @@
#!/usr/bin/env bash
# Copyright (c) Guangsheng Bao.
#
# This source code is licensed under the MIT license found in the
# LICENSE file in the root directory of this source tree.

# setup the environment
echo `date`, Setup the environment ...
set -e  # exit if error

# prepare folders
exp_path=exp_topk
data_path=$exp_path/data
res_path=$exp_path/results
mkdir -p $exp_path $data_path $res_path

datasets="xsum squad writing"
source_models="gpt2-xl opt-2.7b gpt-neo-2.7B gpt-j-6B gpt-neox-20b"

# preparing dataset
for D in $datasets; do
    for M in $source_models; do
        echo `date`, Preparing dataset ${D}-${M} ...
        python scripts/data_builder.py --dataset $D --n_samples 500 --do_top_k --base_model_name $M --output_file $data_path/${D}_${M}
    done
done

# White-box Setting
echo `date`, Evaluate models in the white-box setting:

# evaluate Fast-DetectGPT and fast baselines
for D in $datasets; do
    for M in $source_models; do
        echo `date`, Evaluating Fast-DetectGPT on ${D}_${M} ...
        python scripts/fast_detect_gpt.py --reference_model_name $M --scoring_model_name $M --dataset $D \
            --dataset_file $data_path/${D}_${M} --output_file $res_path/${D}_${M}

        echo `date`, Evaluating baseline methods on ${D}_${M} ...
        python scripts/baselines.py --scoring_model_name $M --dataset $D \
            --dataset_file $data_path/${D}_${M} --output_file $res_path/${D}_${M}
    done
done

# evaluate DetectGPT and its improvement DetectLLM
for D in $datasets; do
    for M in $source_models; do
        echo `date`, Evaluating DetectGPT on ${D}_${M} ...
        python scripts/detect_gpt.py --scoring_model_name $M --mask_filling_model_name t5-3b --n_perturbations 100 --dataset $D \
            --dataset_file $data_path/${D}_${M} --output_file $res_path/${D}_${M}
        # we leverage DetectGPT to generate the perturbations
        echo `date`, Evaluating DetectLLM methods on ${D}_${M} ...
        python scripts/detect_llm.py --scoring_model_name $M --dataset $D \
            --dataset_file $data_path/${D}_${M}.t5-3b.perturbation_100 --output_file $res_path/${D}_${M}
    done
done


# Black-box Setting
echo `date`, Evaluate models in the black-box setting:
scoring_models="gpt-neo-2.7B"

# evaluate Fast-DetectGPT
for D in $datasets; do
    for M in $source_models; do
        M1=gpt-j-6B  # sampling model
        for M2 in $scoring_models; do
            echo `date`, Evaluating Fast-DetectGPT on ${D}_${M}.${M1}_${M2} ...
            python scripts/fast_detect_gpt.py --reference_model_name ${M1} --scoring_model_name ${M2} --dataset $D \
                --dataset_file $data_path/${D}_${M} --output_file $res_path/${D}_${M}.${M1}_${M2}
        done
    done
done

# evaluate DetectGPT and its improvement DetectLLM
for D in $datasets; do
    for M in $source_models; do
        M1=t5-3b  # perturbation model
        for M2 in $scoring_models; do
            echo `date`, Evaluating DetectGPT on ${D}_${M}.${M1}_${M2} ...
            python scripts/detect_gpt.py --mask_filling_model_name ${M1} --scoring_model_name ${M2} --n_perturbations 100 --dataset $D \
                --dataset_file $data_path/${D}_${M} --output_file $res_path/${D}_${M}.${M1}_${M2}
            # we leverage DetectGPT to generate the perturbations
            echo `date`, Evaluating DetectLLM methods on ${D}_${M}.${M1}_${M2} ...
            python scripts/detect_llm.py --scoring_model_name ${M2} --dataset $D \
                --dataset_file $data_path/${D}_${M}.${M1}.perturbation_100 --output_file $res_path/${D}_${M}.${M1}_${M2}
        done
    done
done
topp.sh
ADDED
@@ -0,0 +1,88 @@
#!/usr/bin/env bash
# Copyright (c) Guangsheng Bao.
#
# This source code is licensed under the MIT license found in the
# LICENSE file in the root directory of this source tree.

# setup the environment
echo `date`, Setup the environment ...
set -e  # exit if error

# prepare folders
exp_path=exp_topp
data_path=$exp_path/data
res_path=$exp_path/results
mkdir -p $exp_path $data_path $res_path

datasets="xsum squad writing"
source_models="gpt2-xl opt-2.7b gpt-neo-2.7B gpt-j-6B gpt-neox-20b"

# preparing dataset
for D in $datasets; do
    for M in $source_models; do
        echo `date`, Preparing dataset ${D}-${M} ...
        python scripts/data_builder.py --dataset $D --n_samples 500 --do_top_p --base_model_name $M --output_file $data_path/${D}_${M}
    done
done

# White-box Setting
echo `date`, Evaluate models in the white-box setting:

# evaluate Fast-DetectGPT and fast baselines
for D in $datasets; do
    for M in $source_models; do
        echo `date`, Evaluating Fast-DetectGPT on ${D}_${M} ...
        python scripts/fast_detect_gpt.py --reference_model_name $M --scoring_model_name $M --dataset $D \
            --dataset_file $data_path/${D}_${M} --output_file $res_path/${D}_${M}

        echo `date`, Evaluating baseline methods on ${D}_${M} ...
        python scripts/baselines.py --scoring_model_name $M --dataset $D \
            --dataset_file $data_path/${D}_${M} --output_file $res_path/${D}_${M}
    done
done

# evaluate DetectGPT and its improvement DetectLLM
for D in $datasets; do
    for M in $source_models; do
        echo `date`, Evaluating DetectGPT on ${D}_${M} ...
        python scripts/detect_gpt.py --scoring_model_name $M --mask_filling_model_name t5-3b --n_perturbations 100 --dataset $D \
            --dataset_file $data_path/${D}_${M} --output_file $res_path/${D}_${M}
        # we leverage DetectGPT to generate the perturbations
        echo `date`, Evaluating DetectLLM methods on ${D}_${M} ...
        python scripts/detect_llm.py --scoring_model_name $M --dataset $D \
            --dataset_file $data_path/${D}_${M}.t5-3b.perturbation_100 --output_file $res_path/${D}_${M}
    done
done


# Black-box Setting
echo `date`, Evaluate models in the black-box setting:
scoring_models="gpt-neo-2.7B"

# evaluate Fast-DetectGPT
for D in $datasets; do
    for M in $source_models; do
        M1=gpt-j-6B  # sampling model
        for M2 in $scoring_models; do
            echo `date`, Evaluating Fast-DetectGPT on ${D}_${M}.${M1}_${M2} ...
            python scripts/fast_detect_gpt.py --reference_model_name ${M1} --scoring_model_name ${M2} --dataset $D \
                --dataset_file $data_path/${D}_${M} --output_file $res_path/${D}_${M}.${M1}_${M2}
        done
    done
done

# evaluate DetectGPT and its improvement DetectLLM
for D in $datasets; do
    for M in $source_models; do
        M1=t5-3b  # perturbation model
        for M2 in $scoring_models; do
            echo `date`, Evaluating DetectGPT on ${D}_${M}.${M1}_${M2} ...
            python scripts/detect_gpt.py --mask_filling_model_name ${M1} --scoring_model_name ${M2} --n_perturbations 100 --dataset $D \
                --dataset_file $data_path/${D}_${M} --output_file $res_path/${D}_${M}.${M1}_${M2}
            # we leverage DetectGPT to generate the perturbations
            echo `date`, Evaluating DetectLLM methods on ${D}_${M}.${M1}_${M2} ...
            python scripts/detect_llm.py --scoring_model_name ${M2} --dataset $D \
                --dataset_file $data_path/${D}_${M}.${M1}.perturbation_100 --output_file $res_path/${D}_${M}.${M1}_${M2}
        done
    done
done