Reverse Prompter

A fine-tuned google/gemma-3-270m model that reconstructs the most likely prompt from an AI assistant's response.

Given AI-generated text, the model produces candidate prompts that could have produced it.

How It Works

The model was trained on prompt-response pairs formatted as:

{response}\n###\n{prompt}

At inference time, you provide the response followed by the \n###\n separator, and the model generates the reconstructed prompt.
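Concretely, the training string and the inference input can be assembled like this (a minimal sketch; `make_example` and `make_inference_input` are illustrative helper names, not part of the released code):

```python
SEPARATOR = "\n###\n"

def make_example(response: str, prompt: str) -> str:
    # Training format: the response comes first, then the separator,
    # then the prompt the model learns to reconstruct.
    return response.strip() + SEPARATOR + prompt.strip()

def make_inference_input(response: str) -> str:
    # At inference time only the response and separator are provided;
    # the model generates the prompt after the separator.
    return response.strip() + SEPARATOR
```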

Usage

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "dejanseo/reverse-prompter", torch_dtype=torch.bfloat16
).cuda().eval()
tokenizer = AutoTokenizer.from_pretrained("dejanseo/reverse-prompter")

response_text = "Your AI-generated text here"
input_text = response_text.strip() + "\n###\n"

inputs = tokenizer(input_text, return_tensors="pt", add_special_tokens=False).to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, penalty_alpha=0.3, top_k=4)

# Decode only the newly generated tokens (the reconstructed prompt).
generated = outputs[0][inputs["input_ids"].shape[-1]:]
reconstructed_prompt = tokenizer.decode(generated, skip_special_tokens=True).strip()
print(reconstructed_prompt)

For best results, run generation across multiple contrastive search configurations and rank outputs by perplexity. See the companion Streamlit app for a full implementation.

Training Data

The training dataset was generated synthetically using Gemini 2.5 Flash via Vertex AI in a three-stage pipeline:

1. Prompt Generation

100,000 diverse prompts were generated across five categories (20 per category in each batch of 100):

  • Mid-tail, search query style (single or multi-faceted)
  • Long-tail, search query style (multi-faceted)
  • Simple, prompt-like (single-faceted)
  • Typical, prompt-like (single or multi-faceted)
  • Detailed, prompt-like (multi-faceted)

Generation was parallelized with 100 concurrent API calls in batches of 100 prompts, with results stored in SQLite.
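The concurrency pattern can be sketched as follows (illustrative only: `call_model` stands in for the real Vertex AI / Gemini 2.5 Flash call, and the table schema is hypothetical):

```python
import asyncio
import sqlite3

CONCURRENCY = 100  # matches the pipeline's 100 concurrent API calls
BATCH_SIZE = 100   # prompts requested per batch

async def call_model(spec: str) -> str:
    # Placeholder for the Vertex AI / Gemini 2.5 Flash request
    # (the real pipeline uses the Vertex AI SDK here).
    await asyncio.sleep(0)
    return f"generated prompt for: {spec}"

async def generate_batch(specs: list[str], db_path: str) -> None:
    # Semaphore caps in-flight requests at CONCURRENCY.
    sem = asyncio.Semaphore(CONCURRENCY)

    async def bounded(spec: str) -> str:
        async with sem:
            return await call_model(spec)

    results = await asyncio.gather(*(bounded(s) for s in specs))

    # Persist results to SQLite, as in the pipeline.
    conn = sqlite3.connect(db_path)
    conn.execute("CREATE TABLE IF NOT EXISTS prompts (spec TEXT, prompt TEXT)")
    conn.executemany("INSERT INTO prompts VALUES (?, ?)", zip(specs, results))
    conn.commit()
    conn.close()
```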

2. Response Generation

Each prompt was sent back to Gemini 2.5 Flash (with thinking disabled) to produce a corresponding AI assistant response. This was also parallelized at 100 concurrent calls.

3. Tokenization

Prompt-response pairs were formatted as {response}\n###\n{prompt}<eos> and tokenized using the Gemma 3 tokenizer. Labels were masked (-100) over the response and separator tokens so the model only learns to predict the prompt portion. Tokenization was done in batches of 5,000 and concatenated into the final dataset.
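The label-masking step described above can be sketched like this (function name is mine, not the released training code; the key point is the `-100` labels over the response and separator):

```python
def build_training_example(tokenizer, response: str, prompt: str,
                           eos_token_id: int) -> dict:
    # Tokenize the masked prefix (response + separator) and the prompt
    # separately so we know exactly where the prompt tokens begin.
    prefix_ids = tokenizer(response.strip() + "\n###\n",
                           add_special_tokens=False)["input_ids"]
    prompt_ids = tokenizer(prompt.strip(),
                           add_special_tokens=False)["input_ids"]

    input_ids = prefix_ids + prompt_ids + [eos_token_id]
    # -100 over response and separator tokens: the loss is computed
    # only on the prompt (and EOS) the model must learn to predict.
    labels = [-100] * len(prefix_ids) + prompt_ids + [eos_token_id]
    return {"input_ids": input_ids, "labels": labels}
```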

Training Details

Parameter Value
Base model google/gemma-3-270m
Method Full fine-tune
Precision bfloat16
Epochs 1
Batch size 2
Gradient accumulation 8 (effective batch size 16)
Learning rate 5e-5
Warmup steps 100
Max sequence length 2048
Optimizer AdamW (torch fused)
Gradient checkpointing Enabled
Training time 4h 14m
GPU NVIDIA GeForce RTX 4090 (24 GB)
CPU AMD Ryzen 9 7950X3D 16-Core
RAM 128 GB
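The table above maps roughly onto a Hugging Face `TrainingArguments` configuration like the following (a hedged sketch: argument names follow the transformers API, but the exact training script is not published, and `output_dir` is illustrative):

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="reverse-prompter-ft",   # illustrative path
    num_train_epochs=1,
    per_device_train_batch_size=2,
    gradient_accumulation_steps=8,      # effective batch size 2 * 8 = 16
    learning_rate=5e-5,
    warmup_steps=100,
    bf16=True,
    gradient_checkpointing=True,
    optim="adamw_torch_fused",
)
# Note: the 2048 max sequence length is enforced at tokenization time,
# not via TrainingArguments.
```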

Training Loss

[Training loss curve: see the model card for the plot]

Inference Strategy

The companion app uses contrastive search with a sweep over configurations:

  • top_k: [2, 4, 6, 15]
  • penalty_alpha: [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]

This produces up to 24 candidate prompts per input. Candidates are deduplicated and ranked by perplexity (lower is better). Token-level probabilities provide a confidence signal for each word in the reconstruction.
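The sweep and ranking can be sketched as follows (assuming `model` and `tokenizer` are loaded as in the Usage section; function names are illustrative, not the companion app's actual code):

```python
import math
import torch

TOP_K = [2, 4, 6, 15]
PENALTY_ALPHA = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]

@torch.no_grad()
def perplexity(model, tokenizer, text: str) -> float:
    # Perplexity of a candidate prompt under the model itself (lower is better).
    ids = tokenizer(text, return_tensors="pt").input_ids.to(model.device)
    loss = model(ids, labels=ids).loss
    return math.exp(loss.item())

@torch.no_grad()
def reconstruct(model, tokenizer, response: str) -> list[tuple[float, str]]:
    inputs = tokenizer(response.strip() + "\n###\n", return_tensors="pt",
                       add_special_tokens=False).to(model.device)
    candidates = set()
    for k in TOP_K:
        for alpha in PENALTY_ALPHA:
            # Contrastive search: top_k with degeneration penalty alpha.
            out = model.generate(**inputs, max_new_tokens=256,
                                 top_k=k, penalty_alpha=alpha)
            text = tokenizer.decode(out[0][inputs.input_ids.shape[-1]:],
                                    skip_special_tokens=True).strip()
            candidates.add(text)  # set() deduplicates identical candidates
    # Ranked by perplexity, best first.
    return sorted((perplexity(model, tokenizer, c), c) for c in candidates)
```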

Limitations

  • Prompt reconstruction is inherently probabilistic. The model returns plausible prompts, not necessarily the exact original.
  • Performance is best on responses typical of AI assistants. Non-standard or very short inputs may produce lower-quality reconstructions.
  • The model inherits the capabilities and limitations of the gemma-3-270m base model.

Author

Dejan AI
