vittoriopippi committed
Commit 5084d4b · 1 Parent(s): a8a6d74

Edit README.md

Files changed (1):
  1. README.md +46 -45

README.md CHANGED
@@ -19,60 +19,68 @@ metrics:
  - CER
  ---

- # VATr++ (Hugging Face Version)

- This is a re-upload of the **VATr++** styled handwritten text generation model to the Hugging Face Model Hub. The original code and more detailed documentation can be found in the [VATr-pp GitHub repository](https://github.com/EDM-Research/VATr-pp).

- > **Note**: Please refer to the original repo for:
  > - Full training instructions
- > - In-depth code details
- > - Extended usage and references

- This Hugging Face version allows you to directly load the **VATr++** model with `AutoModel.from_pretrained(...)` and use it in your pipelines or scripts without manually handling checkpoints. The usage differs slightly from the original GitHub repository, primarily because we leverage Hugging Face’s `transformers` interface here.

  ---

- ## Installation

- 1. **Create a conda environment (recommended)**:
  ```bash
  conda create --name vatr python=3.9
  conda activate vatr
  ```

- 2. **Install PyTorch with CUDA support (if available)**:
  ```bash
- conda install pytorch==1.13.1 torchvision==0.14.1 torchaudio==0.13.1 pytorch-cuda=11.7 -c pytorch -c nvidia
  ```

- 3. **Install additional requirements** (including `transformers`, `opencv`, etc.):
  ```bash
- pip install transformers opencv-python
  ```
- *You may need to adjust or add libraries based on your specific environment needs.*

  ---

- ## Loading the Model

- #### **VATr++**
- To load the **VATr++** version:
  ```python
- from transformers import AutoModel

- model_vatr_pp = AutoModel.from_pretrained(
-     "blowing-up-groundhogs/vatrpp",
-     trust_remote_code=True
  )
  ```

- #### **VATr (original)**
- To load the **original VATr** model (instead of VATr++), specify the `subfolder` argument:
  ```python
- model_vatr = AutoModel.from_pretrained(
-     "blowing-up-groundhogs/vatrpp",
-     subfolder="vatr",
-     trust_remote_code=True
  )
  ```
 
@@ -80,23 +88,21 @@ model_vatr = AutoModel.from_pretrained(

  ## Usage (Inference Example)

- Below is a **minimal** usage example that demonstrates how to:

- 1. Load the VATr++ model from the Hugging Face Hub.
  2. Preprocess a style image (an image of handwriting).
- 3. Generate a new handwritten line of text in the style of the provided image.
-
- > **Important**: This model requires `trust_remote_code=True` to properly load its custom generation logic.

  ```python
  import numpy as np
  from PIL import Image
  import torch
  from torchvision import transforms as T
- from transformers import AutoModel

  # 1. Load the model (VATr++)
- model = AutoModel.from_pretrained("blowing-up-groundhogs/vatrpp", trust_remote_code=True)

  # 2. Helper functions to load and process style images
  def load_image(img, chunk_width=192):
@@ -168,27 +174,22 @@ generated_pil_image.save("generated_output.png")

  - **`style_imgs`**: A batch of fixed-width image chunks from your style reference. In practice, you can supply multiple small style samples or a single line image split into chunks.
  - **`gen_text`**: The text to render in the given style.
- - **`align_words`** and **`at_once`**: Optional arguments that control how the text is laid out and generated.

  ---

  ## Original Repository

- This model is built upon the code from [**EDM-Research/VATr-pp**](https://github.com/EDM-Research/VATr-pp), which is itself an improvement on the [VATr](https://github.com/aimagelab/VATr) project. If you need to:

- - Train your own model from scratch
- - Explore advanced features (like style cycle loss, punctuation modes, or advanced augmentation)
- - Examine experimental details or replicate the original paper's setup

- Please visit the original GitHub repos for comprehensive documentation and support files.

  ---

  ## License and Acknowledgments

  - The original code and model are under the license found in [the GitHub repository](https://github.com/EDM-Research/VATr-pp).
- - All credit goes to the original authors and maintainers for creating VATr++ and releasing it openly.
- - This Hugging Face re-upload is merely intended to **simplify inference** and **model sharing**; no changes have been made to the core training code or conceptual pipeline.
-
- ---
-
- **Enjoy generating styled handwritten text!** For any issues specific to this Hugging Face version, feel free to open an issue or pull request here. Otherwise, for deeper technical questions, please consult the original repository or its authors.
 
  - CER
  ---

+ # VATr++ (Local Clone Version)

+ This is a local-clone-friendly version of the **VATr++** styled handwritten text generation model. If you prefer not to rely on `trust_remote_code=True` in `transformers`, you can simply clone this repository and load the model directly.

+ > **Note**: For any of the following:
  > - Full training instructions
+ > - Advanced features (style cycle loss, punctuation modes, etc.)
+ > - Original code details

+ please see the [VATr-pp GitHub repository](https://github.com/EDM-Research/VATr-pp). This local version is intended primarily for inference and basic usage.

  ---

+ ## Installation & Setup

+ 1. **Clone this repository (via Git LFS)**:
+ ```bash
+ git lfs clone https://huggingface.co/blowing-up-groundhogs/vatrpp
+ ```
+
+ 2. **Create and activate a conda environment (recommended)**:
  ```bash
  conda create --name vatr python=3.9
  conda activate vatr
  ```

+ 3. **Install PyTorch (with CUDA support if available)**:
  ```bash
+ pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu126
  ```

+ 4. **Install additional requirements**:
  ```bash
+ pip install opencv-python matplotlib
  ```
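As a quick sanity check after installation, you can verify that the main runtime dependencies are importable. This is a sketch, not part of the original repository; `check_deps` is a hypothetical helper name:

```python
import importlib.util

# Hypothetical helper: report which runtime dependencies are importable.
def check_deps(names):
    return {name: importlib.util.find_spec(name) is not None for name in names}

# torch/torchvision come from step 3; cv2 and matplotlib from step 4.
for name, ok in check_deps(["torch", "torchvision", "cv2", "matplotlib"]).items():
    print(f"{name}: {'OK' if ok else 'MISSING'}")
```

If any package reports `MISSING`, re-run the corresponding install step before loading the model.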
 

  ---

+ ## Loading the Model Locally
+
+ With the repository cloned, you can load either **VATr++** or the **original VATr** model locally.
+
+ ### **VATr++**

  ```python
+ from vatrpp import VATrPP

+ model_vatr_pp = VATrPP.from_pretrained(
+     "vatrpp",  # Local folder name or path
+     local_files_only=True
  )
  ```

+ ### **VATr (original)**
+
  ```python
+ from vatrpp import VATr
+
+ model_vatr = VATr.from_pretrained(
+     "vatrpp",
+     local_files_only=True,
+     subfolder="vatr"  # Points to the original VATr checkpoint
  )
  ```


  ## Usage (Inference Example)

+ Below is a **minimal** usage example demonstrating how to:

+ 1. Load the **VATr++** model from your local clone.
  2. Preprocess a style image (an image of handwriting).
+ 3. Generate new handwritten text in the style of the provided image.

  ```python
  import numpy as np
  from PIL import Image
  import torch
  from torchvision import transforms as T

  # 1. Load the model (VATr++)
+ from vatrpp import VATrPP
+ model = VATrPP.from_pretrained("vatrpp", local_files_only=True)

  # 2. Helper functions to load and process style images
  def load_image(img, chunk_width=192):

  - **`style_imgs`**: A batch of fixed-width image chunks from your style reference. In practice, you can supply multiple small style samples or a single line image split into chunks.
  - **`gen_text`**: The text to render in the given style.
+ - **`align_words`** and **`at_once`**: Optional arguments controlling how the text is laid out and generated.
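The fixed-width chunking described for `style_imgs` can be sketched with plain NumPy. This is a hedged illustration, not the repository's actual `load_image` helper; the chunk width and white padding value are assumptions:

```python
import numpy as np

def chunk_style_image(img_gray, chunk_width=192, pad_value=255):
    """Split a grayscale line image of shape (H, W) into fixed-width chunks,
    padding the final chunk with white so every chunk has equal width."""
    h, w = img_gray.shape
    n_chunks = -(-w // chunk_width)  # ceiling division
    padded = np.full((h, n_chunks * chunk_width), pad_value, dtype=img_gray.dtype)
    padded[:, :w] = img_gray
    # Split the width axis into (n_chunks, chunk_width) blocks and move the
    # chunk axis to the front: result has shape (n_chunks, H, chunk_width).
    return padded.reshape(h, n_chunks, chunk_width).swapaxes(0, 1)

# Example: a 64x500 line image yields three 64x192 chunks (last one padded).
line = np.full((64, 500), 255, dtype=np.uint8)
chunks = chunk_style_image(line)
print(chunks.shape)  # (3, 64, 192)
```

The resulting chunk batch would then be normalized and stacked into a tensor before being passed to the model as `style_imgs`.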

  ---

  ## Original Repository

+ This model is built upon the code from [**EDM-Research/VATr-pp**](https://github.com/EDM-Research/VATr-pp), itself an improvement on the [VATr](https://github.com/aimagelab/VATr) project. Please visit those repositories if you need to:

+ - Train your own model from scratch
+ - Explore advanced features (like style cycle loss, punctuation modes, or advanced augmentation)
+ - Examine experimental details or replicate the original paper's setup

  ---

  ## License and Acknowledgments

  - The original code and model are under the license found in [the GitHub repository](https://github.com/EDM-Research/VATr-pp).
+ - All credit goes to the original authors and maintainers for creating VATr++ and releasing it openly.
+ - This local version is intended to simplify offline usage and keep everything self-contained.