Reverse Prompter
A fine-tuned google/gemma-3-270m model that reconstructs the most likely prompt from an AI assistant's response.
Given an AI-generated text, the model generates candidate prompts that could have produced it.
How It Works
The model was trained on prompt-response pairs formatted as:
{response}\n###\n{prompt}
At inference time, you provide the response followed by the \n###\n separator, and the model generates the reconstructed prompt.
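The two layouts can be sketched as a pair of string helpers. This is a minimal illustration rather than the author's preprocessing code: the function names are hypothetical, the `<eos>` marker follows the format given in the Training Data section, and stripping whitespace from the response mirrors the usage snippet rather than any confirmed training step.

```python
SEPARATOR = "\n###\n"


def build_training_text(response: str, prompt: str, eos_token: str = "<eos>") -> str:
    """Training layout: response, separator, prompt, then the end-of-sequence marker."""
    return f"{response}{SEPARATOR}{prompt}{eos_token}"


def build_inference_text(response: str) -> str:
    """Inference layout: response plus separator; the model continues with the prompt."""
    return response.strip() + SEPARATOR


# The inference text is a strict prefix of the matching training text.
inference_text = build_inference_text("  Paris is the capital of France.  ")
training_text = build_training_text("Paris is the capital of France.", "What is the capital of France?")
print(repr(inference_text))
print(training_text.startswith(inference_text))
```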
Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the fine-tuned model in bfloat16 on a CUDA GPU and switch to eval mode.
model = AutoModelForCausalLM.from_pretrained("dejanseo/reverse-prompter", torch_dtype="bfloat16").cuda().eval()
tokenizer = AutoTokenizer.from_pretrained("dejanseo/reverse-prompter")

# Append the separator used in training; the model's continuation is the prompt.
response_text = "Your AI-generated text here"
prompt = response_text.strip() + "\n###\n"
inputs = tokenizer(prompt, return_tensors="pt", add_special_tokens=False).to(model.device)

# Setting penalty_alpha together with top_k selects contrastive search.
outputs = model.generate(**inputs, max_new_tokens=256, penalty_alpha=0.3, top_k=4)

# Drop the input tokens so that only the newly generated prompt is decoded.
generated = outputs[0][inputs["input_ids"].shape[-1]:]
reconstructed_prompt = tokenizer.decode(generated, skip_special_tokens=True).strip()
print(reconstructed_prompt)
```
For best results, run generation across multiple contrastive search configurations and rank outputs by perplexity. See the companion Streamlit app for a full implementation.
Training Data
The training dataset was generated synthetically using Gemini 2.5 Flash via Vertex AI in a three-stage pipeline:
1. Prompt Generation
100,000 diverse prompts were generated across five categories, with 20 prompts per category in each batch of 100:
- Mid-tail, search query style (single or multi-faceted)
- Long-tail, search query style (multi-faceted)
- Simple, prompt-like (single-faceted)
- Typical, prompt-like (single or multi-faceted)
- Detailed, prompt-like (multi-faceted)
Generation was parallelized with 100 concurrent API calls in batches of 100 prompts, with results stored in SQLite.
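The fan-out-and-store pattern described above might look roughly like the sketch below. It is not the author's pipeline: `request_batch` is a hypothetical stand-in for the Gemini call and returns placeholder strings, the table schema is assumed, and an in-memory database replaces the real SQLite file. Worker threads only fetch batches; all writes happen on the calling thread, which keeps the SQLite connection single-threaded.

```python
import sqlite3
from concurrent.futures import ThreadPoolExecutor

CATEGORIES = [
    "mid-tail search query",
    "long-tail search query",
    "simple prompt",
    "typical prompt",
    "detailed prompt",
]
PER_CATEGORY = 20  # 5 categories x 20 prompts = 100 prompts per batch


def request_batch(batch_id: int) -> list[tuple[str, str]]:
    """Hypothetical stand-in for one API call returning 100 (category, prompt) rows."""
    return [
        (category, f"placeholder prompt {batch_id}-{i}")
        for category in CATEGORIES
        for i in range(PER_CATEGORY)
    ]


def generate_prompts(num_batches: int, max_workers: int = 100, db_path: str = ":memory:") -> sqlite3.Connection:
    conn = sqlite3.connect(db_path)
    conn.execute("CREATE TABLE IF NOT EXISTS prompts (id INTEGER PRIMARY KEY, category TEXT, prompt TEXT)")
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Batches are fetched concurrently; rows are inserted here, on one thread.
        for rows in pool.map(request_batch, range(num_batches)):
            conn.executemany("INSERT INTO prompts (category, prompt) VALUES (?, ?)", rows)
    conn.commit()
    return conn


# 1,000 batches would give the full 100,000 prompts; 3 batches keep the demo small.
conn = generate_prompts(num_batches=3)
print(conn.execute("SELECT COUNT(*) FROM prompts").fetchone()[0])
```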
2. Response Generation
Each prompt was sent back to Gemini 2.5 Flash (with thinking disabled) to produce a corresponding AI assistant response. This was also parallelized at 100 concurrent calls.
3. Tokenization
Prompt-response pairs were formatted as {response}\n###\n{prompt}<eos> and tokenized using the Gemma 3 tokenizer. Labels were masked (-100) over the response and separator tokens so the model only learns to predict the prompt portion. Tokenization was done in batches of 5,000 and concatenated into the final dataset.
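A minimal sketch of the label masking, using toy token IDs instead of the real Gemma 3 tokenizer. The helper name is hypothetical, and keeping the end-of-sequence token in the labels is an assumption based on the description that only the response and separator are masked. The value -100 is the index that PyTorch's cross-entropy loss ignores by default.

```python
IGNORE_INDEX = -100  # positions with this label contribute nothing to the loss


def build_example(response_ids, separator_ids, prompt_ids, eos_id):
    """Concatenate the token IDs and mask everything before the prompt."""
    context = list(response_ids) + list(separator_ids)
    target = list(prompt_ids) + [eos_id]  # assumption: EOS is a learned target
    return {
        "input_ids": context + target,
        "labels": [IGNORE_INDEX] * len(context) + target,
    }


# Toy IDs for illustration only; real IDs come from the tokenizer.
example = build_example(response_ids=[11, 12, 13], separator_ids=[7], prompt_ids=[21, 22], eos_id=1)
print(example["input_ids"])  # [11, 12, 13, 7, 21, 22, 1]
print(example["labels"])     # [-100, -100, -100, -100, 21, 22, 1]
```

With this layout the model still attends to the response and separator as context, but gradients come only from the prompt tokens.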
Training Details
| Parameter | Value |
|---|---|
| Base model | google/gemma-3-270m |
| Method | Full fine-tune |
| Precision | bfloat16 |
| Epochs | 1 |
| Batch size | 2 |
| Gradient accumulation | 8 (effective batch size 16) |
| Learning rate | 5e-5 |
| Warmup steps | 100 |
| Max sequence length | 2048 |
| Optimizer | AdamW (torch fused) |
| Gradient checkpointing | Enabled |
| Training time | 4h 14m |
| GPU | NVIDIA GeForce RTX 4090 (24 GB) |
| CPU | AMD Ryzen 9 7950X3D 16-Core |
| RAM | 128 GB |
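As a rough sanity check on the table, the effective batch size and the approximate number of optimizer steps can be worked out as below. The step count assumes that all 100,000 generated pairs were used for the single epoch on one GPU, which the card does not state explicitly.

```python
import math

per_device_batch = 2
grad_accum_steps = 8
warmup_steps = 100
num_examples = 100_000  # assumption: every generated pair was kept for training

effective_batch = per_device_batch * grad_accum_steps
optimizer_steps = math.ceil(num_examples / effective_batch)
warmup_fraction = warmup_steps / optimizer_steps

print(effective_batch, optimizer_steps, f"{warmup_fraction:.1%}")  # 16 6250 1.6%
```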
Training Loss
Inference Strategy
The companion app uses contrastive search with a sweep over configurations:
top_k: [2, 4, 6, 15]
penalty_alpha: [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]
This produces up to 24 candidate prompts per input. Candidates are deduplicated and ranked by perplexity (lower is better). Token-level probabilities provide a confidence signal for each word in the reconstruction.
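The sweep and ranking can be sketched in plain Python as follows. This is an illustration under assumptions, not the companion app's code: the function names are hypothetical, each candidate is assumed to arrive with the natural-log probabilities of its generated tokens, and perplexity is taken as the exponential of the negative mean log-probability.

```python
import math
from itertools import product

TOP_K = [2, 4, 6, 15]
PENALTY_ALPHA = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]


def sweep_configs():
    """All 4 x 6 = 24 contrastive search settings."""
    return [{"top_k": k, "penalty_alpha": a} for k, a in product(TOP_K, PENALTY_ALPHA)]


def perplexity(token_logprobs):
    """exp of the negative mean log-probability of the generated tokens."""
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


def rank_candidates(candidates):
    """Deduplicate (text, token_logprobs) pairs by text and sort by perplexity, lowest first."""
    best = {}
    for text, logprobs in candidates:
        key = text.strip()
        if not key or not logprobs:
            continue  # skip empty generations
        ppl = perplexity(logprobs)
        if key not in best or ppl < best[key]:
            best[key] = ppl
    return sorted(best.items(), key=lambda item: item[1])


# Toy candidates; in practice each config yields one generation plus its token log-probs.
candidates = [
    ("what is the capital of france", [math.log(0.9), math.log(0.8)]),
    ("capital of france?", [math.log(0.5), math.log(0.4)]),
    ("what is the capital of france", [math.log(0.9), math.log(0.8)]),  # duplicate
]
ranked = rank_candidates(candidates)
print(len(sweep_configs()))
for text, ppl in ranked:
    print(f"{ppl:.3f}  {text}")
```

Because different settings often converge on the same text, the number of distinct candidates after deduplication is usually lower than 24.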
Limitations
- Prompt reconstruction is inherently probabilistic. The model returns plausible prompts, not necessarily the exact original.
- Performance is best on responses typical of AI assistants. Non-standard or very short inputs may produce lower-quality reconstructions.
- The model inherits the capabilities and limitations of the gemma-3-270m base model.
Author