Update README.md
README.md
CHANGED
@@ -34,44 +34,34 @@
 # Training Procedure
 **Framework:** Hugging Face Transformers
 **Hyperparameters:**
-Epochs:
-Effective Batch Size: 32 (8 per device, 4 gradient accumulation steps)
-Learning Rate: 2e-5
-Optimizer: AdamW
-Mixed Precision: FP16 (fp16=True)
-Training Time: ~
-Compute: Single 12 GB GPU (NVIDIA, CUDA-enabled).
-Evaluation
-Metrics: Loss (to be filled post-training)
-Validation Loss: [TBD after training]
-Test Loss: [TBD after evaluation]
-Method: Evaluated using Trainer.evaluate() on validation and test splits.
-Qualitative: Generated directions checked for coherence with input ingredients (e.g., chicken and rice input should yield relevant steps).
-Performance
-Results: [TBD; e.g., "Validation Loss: X.XX, Test Loss: Y.YY after 1 epoch on subset"]
-Strengths: Expected to generate plausible directions for common ingredient combinations.
-Limitations:
-Limited training on subset may reduce generalization.
-Sporadic data mismatches may affect output quality.
-FP16 quantization might slightly alter precision vs. FP32.
-Usage
-Installation
-
-
-Collapse
-
-Wrap
-
-Copy
+- Epochs: 2
+- Effective Batch Size: 32 (8 per device, 4 gradient accumulation steps)
+- Learning Rate: 2e-5
+- Optimizer: AdamW
+- Mixed Precision: FP16 (fp16=True)
+- Training Time: ~12 hours (estimated) for the subset (1 epoch); the full dataset (3 epochs) is estimated at ~68 hours per epoch without optimization.
+- Compute: Single 12 GB GPU (NVIDIA, CUDA-enabled).
+# Evaluation
+- Metrics: Loss (to be filled post-training)
+- Validation Loss: [TBD after training]
+- Test Loss: [TBD after evaluation]
+- Method: Evaluated using Trainer.evaluate() on validation and test splits.
+- Qualitative: Generated directions checked for coherence with input ingredients (e.g., chicken and rice input should yield relevant steps).
+# Performance
+- Results: [TBD; e.g., "Validation Loss: X.XX, Test Loss: Y.YY after 1 epoch on subset"]
+- Strengths: Expected to generate plausible directions for common ingredient combinations.
+# Limitations
+- Limited training on subset may reduce generalization.
+- Sporadic data mismatches may affect output quality.
+- FP16 quantization might slightly alter precision vs. FP32.
+# Usage
+# Installation
+```bash
 pip install transformers torch datasets
-
-
+```
+# Inference Example
 
-
-
-Wrap
-
-Copy
+```python
 from transformers import T5Tokenizer, T5ForConditionalGeneration
 import torch
 
@@ -88,22 +78,4 @@ with torch.no_grad():
 output_ids = model.generate(input_ids, max_length=256, num_beams=4, early_stopping=True, no_repeat_ngram_size=2)
 directions = tokenizer.decode(output_ids[0], skip_special_tokens=True)
 print(directions)
-
-Location: ./t5_recipe_finetuned_fp16
-Size: ~425 MB (FP16 weights)
-Limitations and Biases
-Data Quality: Some RecipeNLG entries have mismatched ingredients and directions, potentially leading to nonsensical outputs.
-Scope: Trained only on English recipes; may not handle non-English inputs or exotic cuisines well.
-Bias: Reflects biases in RecipeNLG (e.g., Western cuisine dominance).
-Quantization: FP16 may introduce minor numerical differences vs. FP32, though mitigated by FP16 training.
-Ethical Considerations
-Use: Should not be used to replace professional culinary expertise without validation.
-Safety: Generated directions aren’t guaranteed to be safe or accurate (e.g., cooking times, temperatures).
-Contact
-Author: [Your Name/Group Name]
-Support: [Your Email/GitHub, if applicable]
-Citation
-If you use this model, please cite:
-
-RecipeNLG dataset: [Add citation if available]
-T5: Raffel et al., "Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer" (2020)
+```
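For reference, the hyperparameters added above map onto a Hugging Face `TrainingArguments`/`Trainer` setup along the following lines. This is a minimal sketch, not the repository's training script: the `t5-base` starting checkpoint, the `encode` helper, and the tiny stand-in datasets are assumptions; only the values mirrored in the comments come from the README.

```python
from transformers import T5ForConditionalGeneration, T5Tokenizer, Trainer, TrainingArguments

base = "t5-base"  # assumed starting checkpoint; the diff does not name one
tokenizer = T5Tokenizer.from_pretrained(base)
model = T5ForConditionalGeneration.from_pretrained(base)

def encode(ingredients: str, directions: str) -> dict:
    """Tokenize one ingredients/directions pair into Trainer-ready features."""
    features = tokenizer(ingredients, max_length=128, truncation=True, padding="max_length")
    labels = tokenizer(directions, max_length=128, truncation=True, padding="max_length")
    features["labels"] = labels["input_ids"]  # real preprocessing would mask pad tokens with -100
    return dict(features)

# Tiny stand-in splits so the sketch is self-contained; actual training would use
# tokenized RecipeNLG ingredient/direction pairs.
train_dataset = [encode("chicken, rice, onion", "Cook the rice. Brown the chicken and onion. Combine and simmer.")]
eval_dataset = [encode("pasta, tomato, basil", "Boil the pasta. Simmer the tomatoes with basil. Toss together.")]

args = TrainingArguments(
    output_dir="./t5_recipe_finetuned_fp16",  # save location noted in the README
    num_train_epochs=2,                       # Epochs: 2
    per_device_train_batch_size=8,            # 8 per device ...
    gradient_accumulation_steps=4,            # ... x 4 accumulation steps = effective batch size 32
    learning_rate=2e-5,                       # Learning Rate: 2e-5
    fp16=True,                                # Mixed Precision: FP16 (requires a CUDA GPU)
)

# Trainer's default optimizer is AdamW, matching the hyperparameter list.
trainer = Trainer(model=model, args=args, train_dataset=train_dataset, eval_dataset=eval_dataset)
trainer.train()
```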
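The evaluation method described in the README (`Trainer.evaluate()` on the validation and test splits) then reduces to calls like the following, continuing from the training sketch above; `test_dataset` is a placeholder for the held-out split.

```python
# Validation loss: evaluate() uses the eval_dataset passed to Trainer and prefixes keys with "eval_".
val_metrics = trainer.evaluate()
print(f"Validation Loss: {val_metrics['eval_loss']:.4f}")

# Test loss: pass the held-out split explicitly and relabel the metric prefix.
test_dataset = eval_dataset  # placeholder; substitute the real RecipeNLG test split
test_metrics = trainer.evaluate(eval_dataset=test_dataset, metric_key_prefix="test")
print(f"Test Loss: {test_metrics['test_loss']:.4f}")
```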
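A self-contained version of the inference example, consistent with the `generate()` settings shown in the diff, might look like the sketch below. The `./t5_recipe_finetuned_fp16` path and FP16 weights are taken from the README text; the `"ingredients: ..."` prompt format is an assumption, since the exact input template is not visible in this diff.

```python
import torch
from transformers import T5ForConditionalGeneration, T5Tokenizer

model_dir = "./t5_recipe_finetuned_fp16"  # save location noted in the README (~425 MB FP16 weights)

# Load in half precision on GPU to match the saved FP16 weights; fall back to FP32 on CPU.
device = "cuda" if torch.cuda.is_available() else "cpu"
dtype = torch.float16 if device == "cuda" else torch.float32

tokenizer = T5Tokenizer.from_pretrained(model_dir)
model = T5ForConditionalGeneration.from_pretrained(model_dir, torch_dtype=dtype).to(device).eval()

# Assumed prompt format; the README's qualitative check uses ingredient lists such as chicken and rice.
prompt = "ingredients: chicken, rice, garlic, soy sauce"
input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to(device)

with torch.no_grad():
    output_ids = model.generate(
        input_ids,
        max_length=256,
        num_beams=4,
        early_stopping=True,
        no_repeat_ngram_size=2,
    )

directions = tokenizer.decode(output_ids[0], skip_special_tokens=True)
print(directions)
```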