Reverse Prompter

A fine-tuned google/gemma-3-270m model that reconstructs the most likely prompt from an AI assistant's response.

Given AI-generated text, the model produces candidate prompts that could have produced it.

How It Works

The model was trained on prompt-response pairs formatted as:

{response}\n###\n{prompt}

At inference time, you provide the response followed by the \n###\n separator, and the model generates the reconstructed prompt.
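Concretely, the training string and the inference input can be assembled like this (a minimal sketch; `make_example` and `make_inference_input` are illustrative helper names, not part of the released code):

```python
SEPARATOR = "\n###\n"

def make_example(response: str, prompt: str) -> str:
    # Training format: the response comes first, then the separator,
    # then the prompt the model learns to reconstruct.
    return response.strip() + SEPARATOR + prompt.strip()

def make_inference_input(response: str) -> str:
    # At inference time only the response and separator are provided;
    # the model generates the prompt after the separator.
    return response.strip() + SEPARATOR
```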

Usage

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "dejanseo/reverse-prompter", torch_dtype=torch.bfloat16
).cuda().eval()
tokenizer = AutoTokenizer.from_pretrained("dejanseo/reverse-prompter")

response_text = "Your AI-generated text here"
input_text = response_text.strip() + "\n###\n"

inputs = tokenizer(input_text, return_tensors="pt", add_special_tokens=False).to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, penalty_alpha=0.3, top_k=4)

# Decode only the newly generated tokens (the reconstructed prompt).
generated = outputs[0][inputs["input_ids"].shape[-1]:]
reconstructed_prompt = tokenizer.decode(generated, skip_special_tokens=True).strip()
print(reconstructed_prompt)

For best results, run generation across multiple contrastive search configurations and rank outputs by perplexity. See the companion Streamlit app for a full implementation.

Training Data

The training dataset was generated synthetically using Gemini 2.5 Flash via Vertex AI in a three-stage pipeline:

1. Prompt Generation

100,000 diverse prompts were generated across five categories (20 per category in each batch of 100):

  • Mid-tail, search query style (single or multi-faceted)
  • Long-tail, search query style (multi-faceted)
  • Simple, prompt-like (single-faceted)
  • Typical, prompt-like (single or multi-faceted)
  • Detailed, prompt-like (multi-faceted)

Generation was parallelized with 100 concurrent API calls in batches of 100 prompts, with results stored in SQLite.
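The concurrency pattern can be sketched as follows (illustrative only: `call_model` stands in for the real Vertex AI / Gemini 2.5 Flash call, and the table schema is hypothetical):

```python
import asyncio
import sqlite3

CONCURRENCY = 100  # matches the pipeline's 100 concurrent API calls
BATCH_SIZE = 100   # prompts requested per batch

async def call_model(spec: str) -> str:
    # Placeholder for the Vertex AI / Gemini 2.5 Flash request
    # (the real pipeline uses the Vertex AI SDK here).
    await asyncio.sleep(0)
    return f"generated prompt for: {spec}"

async def generate_batch(specs: list[str], db_path: str) -> None:
    # Semaphore caps in-flight requests at CONCURRENCY.
    sem = asyncio.Semaphore(CONCURRENCY)

    async def bounded(spec: str) -> str:
        async with sem:
            return await call_model(spec)

    results = await asyncio.gather(*(bounded(s) for s in specs))

    # Persist results to SQLite, as in the pipeline.
    conn = sqlite3.connect(db_path)
    conn.execute("CREATE TABLE IF NOT EXISTS prompts (spec TEXT, prompt TEXT)")
    conn.executemany("INSERT INTO prompts VALUES (?, ?)", zip(specs, results))
    conn.commit()
    conn.close()
```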

2. Response Generation

Each prompt was sent back to Gemini 2.5 Flash (with thinking disabled) to produce a corresponding AI assistant response. This was also parallelized at 100 concurrent calls.

3. Tokenization

Prompt-response pairs were formatted as {response}\n###\n{prompt}<eos> and tokenized using the Gemma 3 tokenizer. Labels were masked (-100) over the response and separator tokens so the model only learns to predict the prompt portion. Tokenization was done in batches of 5,000 and concatenated into the final dataset.
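The label-masking step described above can be sketched like this (function name is mine, not the released training code; the key point is the `-100` labels over the response and separator):

```python
def build_training_example(tokenizer, response: str, prompt: str,
                           eos_token_id: int) -> dict:
    # Tokenize the masked prefix (response + separator) and the prompt
    # separately so we know exactly where the prompt tokens begin.
    prefix_ids = tokenizer(response.strip() + "\n###\n",
                           add_special_tokens=False)["input_ids"]
    prompt_ids = tokenizer(prompt.strip(),
                           add_special_tokens=False)["input_ids"]

    input_ids = prefix_ids + prompt_ids + [eos_token_id]
    # -100 over response and separator tokens: the loss is computed
    # only on the prompt (and EOS) the model must learn to predict.
    labels = [-100] * len(prefix_ids) + prompt_ids + [eos_token_id]
    return {"input_ids": input_ids, "labels": labels}
```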

Training Details

Parameter Value
Base model google/gemma-3-270m
Method Full fine-tune
Precision bfloat16
Epochs 1
Batch size 2
Gradient accumulation 8 (effective batch size 16)
Learning rate 5e-5
Warmup steps 100
Max sequence length 2048
Optimizer AdamW (torch fused)
Gradient checkpointing Enabled
Training time 4h 14m
GPU NVIDIA GeForce RTX 4090 (24 GB)
CPU AMD Ryzen 9 7950X3D 16-Core
RAM 128 GB
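The table above maps roughly onto a Hugging Face `TrainingArguments` configuration like the following (a hedged sketch: argument names follow the transformers API, but the exact training script is not published, and `output_dir` is illustrative):

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="reverse-prompter-ft",   # illustrative path
    num_train_epochs=1,
    per_device_train_batch_size=2,
    gradient_accumulation_steps=8,      # effective batch size 2 * 8 = 16
    learning_rate=5e-5,
    warmup_steps=100,
    bf16=True,
    gradient_checkpointing=True,
    optim="adamw_torch_fused",
)
# Note: the 2048 max sequence length is enforced at tokenization time,
# not via TrainingArguments.
```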

Training Loss

[Training loss curve: see the model card for the plot]

Inference Strategy

The companion app uses contrastive search with a sweep over configurations:

  • top_k: [2, 4, 6, 15]
  • penalty_alpha: [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]

This produces up to 24 candidate prompts per input. Candidates are deduplicated and ranked by perplexity (lower is better). Token-level probabilities provide a confidence signal for each word in the reconstruction.
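The sweep and ranking can be sketched as follows (assuming `model` and `tokenizer` are loaded as in the Usage section; function names are illustrative, not the companion app's actual code):

```python
import math
import torch

TOP_K = [2, 4, 6, 15]
PENALTY_ALPHA = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]

@torch.no_grad()
def perplexity(model, tokenizer, text: str) -> float:
    # Perplexity of a candidate prompt under the model itself (lower is better).
    ids = tokenizer(text, return_tensors="pt").input_ids.to(model.device)
    loss = model(ids, labels=ids).loss
    return math.exp(loss.item())

@torch.no_grad()
def reconstruct(model, tokenizer, response: str) -> list[tuple[float, str]]:
    inputs = tokenizer(response.strip() + "\n###\n", return_tensors="pt",
                       add_special_tokens=False).to(model.device)
    candidates = set()
    for k in TOP_K:
        for alpha in PENALTY_ALPHA:
            # Contrastive search: top_k with degeneration penalty alpha.
            out = model.generate(**inputs, max_new_tokens=256,
                                 top_k=k, penalty_alpha=alpha)
            text = tokenizer.decode(out[0][inputs.input_ids.shape[-1]:],
                                    skip_special_tokens=True).strip()
            candidates.add(text)  # set() deduplicates identical candidates
    # Ranked by perplexity, best first.
    return sorted((perplexity(model, tokenizer, c), c) for c in candidates)
```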

Limitations

  • Prompt reconstruction is inherently probabilistic. The model returns plausible prompts, not necessarily the exact original.
  • Performance is best on responses typical of AI assistants. Non-standard or very short inputs may produce lower-quality reconstructions.
  • The model inherits the capabilities and limitations of the gemma-3-270m base model.

Author

Dejan AI
