prowriting commited on
Commit
79f6639
Β·
verified Β·
1 Parent(s): 9c00f33

Fix tokenizer errors and restructure project for Space deployment

Browse files

- Removed old conflicting main.py file that was causing runtime errors
- Added proper app.py entrypoint compatible with Hugging Face Spaces
- Updated requirements.txt to include sentencepiece and tiktoken
- Ensured T5 tokenizer loads correctly by supporting SentencePiece
- Packaged all files into a clean zip for upload

Files changed (3) hide show
  1. README.md +31 -6
  2. app.py +27 -4
  3. requirements.txt +1 -1
README.md CHANGED
@@ -1,14 +1,39 @@
1
  ---
2
- title: My Hugging Face Space
3
- emoji: πŸš€
4
- colorFrom: blue
5
- colorTo: green
6
  sdk: gradio
7
  sdk_version: "4.29.0"
8
  app_file: app.py
9
  pinned: false
10
  ---
11
 
12
- # My Hugging Face Space
13
 
14
- This is a demo space fixed with proper configuration.
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
  ---
2
+ title: Paraphrasing App
3
+ emoji: πŸ”„
4
+ colorFrom: indigo
5
+ colorTo: blue
6
  sdk: gradio
7
  sdk_version: "4.29.0"
8
  app_file: app.py
9
  pinned: false
10
  ---
11
 
12
+ # πŸ”„ Paraphrasing App
13
 
14
+ This Space uses a **T5 transformer model** to paraphrase input text into different variations.
15
+ It is built with **Gradio** and **Hugging Face Transformers**.
16
+
17
+ ## πŸš€ Features
18
+ - Enter any sentence or paragraph
19
+ - Get multiple paraphrased outputs
20
+ - Powered by pretrained **T5 model**
21
+
22
+ ## πŸ› οΈ Requirements
23
+ All dependencies are listed in `requirements.txt`:
24
+ - `transformers`
25
+ - `torch`
26
+ - `sentencepiece`
27
+ - `tiktoken`
28
+ - `gradio`
29
+
30
+ ## πŸ’‘ Example
31
+ Input:
32
+ > "The quick brown fox jumps over the lazy dog."
33
+
34
+ Output:
35
+ - "A fast brown fox leaps over a lazy dog."
36
+ - "The lazy dog was jumped over by a quick brown fox."
37
+
38
+ ---
39
+ Built with ❀️ using Hugging Face Spaces
app.py CHANGED
@@ -1,7 +1,30 @@
1
  import gradio as gr
 
2
 
3
- def greet(name):
4
- return f"Hello {name}!"
 
 
5
 
6
- iface = gr.Interface(fn=greet, inputs="text", outputs="text")
7
- iface.launch()
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
  import gradio as gr
2
+ from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
3
 
4
+ # Load model and tokenizer
5
+ model_name = "t5-small"
6
+ tokenizer = AutoTokenizer.from_pretrained(model_name)
7
+ model = AutoModelForSeq2SeqLM.from_pretrained(model_name)
8
 
9
+ def paraphrase(text, num_return_sequences=3, num_beams=5):
10
+ input_text = "paraphrase: " + text + " </s>"
11
+ inputs = tokenizer.encode(input_text, return_tensors="pt", max_length=512, truncation=True)
12
+ outputs = model.generate(
13
+ inputs,
14
+ max_length=512,
15
+ num_beams=num_beams,
16
+ num_return_sequences=num_return_sequences,
17
+ temperature=1.5
18
+ )
19
+ return [tokenizer.decode(output, skip_special_tokens=True, clean_up_tokenization_spaces=True) for output in outputs]
20
+
21
+ demo = gr.Interface(
22
+ fn=paraphrase,
23
+ inputs=[gr.Textbox(lines=3, label="Enter text"), gr.Slider(1, 5, value=3, step=1, label="Number of outputs")],
24
+ outputs=gr.List(label="Paraphrased Sentences"),
25
+ title="πŸ”„ Paraphrasing App",
26
+ description="Paraphrase any input text using a pretrained T5 transformer model."
27
+ )
28
+
29
+ if __name__ == "__main__":
30
+ demo.launch()
requirements.txt CHANGED
@@ -1,5 +1,5 @@
1
- gradio==4.29.0
2
  transformers
3
  torch
4
  sentencepiece
5
  tiktoken
 
 
 
1
  transformers
2
  torch
3
  sentencepiece
4
  tiktoken
5
+ gradio==4.29.0