vittoriopippi committed · commit 5084d4b · parent: a8a6d74 · "Edit README.md"

README.md (CHANGED):

metrics:
- CER
---

# VATr++ (Local Clone Version)

This is a local-clone-friendly version of the **VATr++** styled handwritten text generation model. If you prefer not to rely on `transformers`' `trust_remote_code=True`, you can simply clone this repository and load the model directly.
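
For context, the Hub-based route this clone avoids uses `AutoModel` with remote code, along these lines (illustrative sketch based on the previous README's `AutoModel.from_pretrained` call; the repo id is taken from the clone URL below):

```python
from transformers import AutoModel

# Pulls the custom model code from the Hub at load time,
# hence trust_remote_code=True is required on this path.
model = AutoModel.from_pretrained(
    "blowing-up-groundhogs/vatrpp",
    trust_remote_code=True,
)
```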

> **Note**: For
> - full training instructions,
> - advanced features (style cycle loss, punctuation modes, etc.), and
> - original code details,
>
> please see the [VATr-pp GitHub repository](https://github.com/EDM-Research/VATr-pp). This local version is intended primarily for inference and basic usage.

---

## Installation & Setup

1. **Clone this repository (via Git LFS)**:

   ```bash
   git lfs clone https://huggingface.co/blowing-up-groundhogs/vatrpp
   ```
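
   On recent Git LFS releases, `git lfs clone` is deprecated in favor of a plain `git clone` once LFS is initialized, so the following is equivalent:

   ```bash
   git lfs install
   git clone https://huggingface.co/blowing-up-groundhogs/vatrpp
   ```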

2. **Create (and activate) a conda environment (recommended)**:

   ```bash
   conda create --name vatr python=3.9
   conda activate vatr
   ```

3. **Install PyTorch (with CUDA if available)**:

   ```bash
   pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu126
   ```
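
   To sanity-check that the install picked up a working CUDA build (optional):

   ```bash
   python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
   ```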

4. **Install additional requirements**:

   ```bash
   pip install opencv-python matplotlib
   ```

---

## Loading the Model Locally

With the repository cloned, you can load either **VATr++** or the **original VATr** model locally.

### **VATr++**

```python
from vatrpp import VATrPP

model_vatr_pp = VATrPP.from_pretrained(
    "vatrpp",              # Local folder name or path
    local_files_only=True
)
```

### **VATr (original)**

```python
from vatrpp import VATr

model_vatr = VATr.from_pretrained(
    "vatrpp",
    local_files_only=True,
    subfolder="vatr"       # Points to the original VATr checkpoint
)
```
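
Assuming these classes follow the usual `transformers` `PreTrainedModel` convention (and therefore behave like standard `torch.nn.Module`s), a typical preparation step before inference would be:

```python
import torch

# Move to GPU when available and disable training-only behavior (dropout, etc.)
device = "cuda" if torch.cuda.is_available() else "cpu"
model_vatr_pp = model_vatr_pp.to(device).eval()
```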

---

## Usage (Inference Example)

Below is a **minimal** usage example demonstrating how to:

1. Load the **VATr++** model from your local clone.
2. Preprocess a style image (an image of handwriting).
3. Generate new handwritten text in the style of the provided image.

```python
import numpy as np
from PIL import Image
import torch
from torchvision import transforms as T

# 1. Load the model (VATr++)
from vatrpp import VATrPP
model = VATrPP.from_pretrained("vatrpp", local_files_only=True)

# 2. Helper functions to load and process style images
def load_image(img, chunk_width=192):
    ...
```
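
The body of `load_image` is only stubbed above. As an illustration of what a fixed-width chunking helper can look like (the name, sizes, and normalization below are assumptions, not the repository's actual code):

```python
import numpy as np
import torch
from PIL import Image

def load_image_sketch(img: Image.Image, chunk_width: int = 192, height: int = 32) -> torch.Tensor:
    """Split a handwriting line into fixed-width chunks (illustrative sketch)."""
    img = img.convert("L")                                       # grayscale
    w, h = img.size
    img = img.resize((max(1, round(w * height / h)), height))    # fixed height, keep aspect
    arr = np.asarray(img, dtype=np.float32) / 255.0              # scale to [0, 1]
    pad = (-arr.shape[1]) % chunk_width                          # pad width to a multiple
    arr = np.pad(arr, ((0, 0), (0, pad)), constant_values=1.0)   # white padding
    chunks = np.split(arr, arr.shape[1] // chunk_width, axis=1)
    return torch.from_numpy(np.stack(chunks))                    # (num_chunks, height, chunk_width)
```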

- **`style_imgs`**: A batch of fixed-width image chunks from your style reference. In practice, you can supply multiple small style samples or a single line image split into chunks.
- **`gen_text`**: The text to render in the given style.
- **`align_words`** and **`at_once`**: Optional arguments controlling how the text is laid out and generated.
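
To show how these parameters fit together, here is a hedged sketch of the generation step; the method name `generate` and the tensor layout are assumptions about the custom model code, not a confirmed signature (the original example is only known to end by saving `generated_pil_image`):

```python
# Hypothetical wiring of the documented parameters; the real entry point may differ.
style_imgs = load_image(Image.open("style_sample.png"))  # fixed-width style chunks
gen_text = "The quick brown fox"

with torch.no_grad():
    fake = model.generate(               # assumed method name
        gen_text,
        style_imgs.unsqueeze(1),         # assumed (num_chunks, 1, H, W) layout
        align_words=True,
        at_once=False,
    )

generated_pil_image = Image.fromarray((fake.squeeze().cpu().numpy() * 255).astype(np.uint8))
generated_pil_image.save("generated_output.png")
```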

---

## Original Repository

This model is built upon the code from [**EDM-Research/VATr-pp**](https://github.com/EDM-Research/VATr-pp), itself an improvement on the [VATr](https://github.com/aimagelab/VATr) project. Please visit those repositories if you need to:

- Train your own model from scratch
- Explore advanced features (like style cycle loss, punctuation modes, or advanced augmentation)
- Examine experimental details or replicate the original paper's setup

---

## License and Acknowledgments

- The original code and model are under the license found in [the GitHub repository](https://github.com/EDM-Research/VATr-pp).
- All credit goes to the original authors and maintainers for creating VATr++ and releasing it openly.
- This local version is intended to simplify offline usage and keep everything self-contained.