YashikaNagpal committed
Commit 00e75df · verified · 1 Parent(s): 43f911f

Update README.md

Files changed (1): README.md (+27 -55)
README.md CHANGED
@@ -34,44 +34,34 @@
  # Training Procedure
  **Framework:** Hugging Face Transformers
  **Hyperparameters:**
- Epochs: 1 (subset training; full training planned for 3 epochs)
- Effective Batch Size: 32 (8 per device, 4 gradient accumulation steps)
- Learning Rate: 2e-5
- Optimizer: AdamW (default in Trainer)
- Mixed Precision: FP16 (fp16=True)
- Training Time: ~2.3 hours estimated for subset (1 epoch); full dataset (3 epochs) estimated at ~68 hours per epoch without optimization.
- Compute: Single 12 GB GPU (NVIDIA, CUDA-enabled).
- Evaluation
- Metrics: Loss (to be filled post-training)
- Validation Loss: [TBD after training]
- Test Loss: [TBD after evaluation]
- Method: Evaluated using Trainer.evaluate() on validation and test splits.
- Qualitative: Generated directions checked for coherence with input ingredients (e.g., chicken and rice input should yield relevant steps).
- Performance
- Results: [TBD; e.g., "Validation Loss: X.XX, Test Loss: Y.YY after 1 epoch on subset"]
- Strengths: Expected to generate plausible directions for common ingredient combinations.
- Limitations:
- Limited training on subset may reduce generalization.
- Sporadic data mismatches may affect output quality.
- FP16 quantization might slightly alter precision vs. FP32.
- Usage
- Installation
- bash
-
- Collapse
-
- Wrap
-
- Copy
  pip install transformers torch datasets
- Inference Example
- python
-
- Collapse
-
- Wrap
-
- Copy
  from transformers import T5Tokenizer, T5ForConditionalGeneration
  import torch
 
@@ -88,22 +78,4 @@ with torch.no_grad():
  output_ids = model.generate(input_ids, max_length=256, num_beams=4, early_stopping=True, no_repeat_ngram_size=2)
  directions = tokenizer.decode(output_ids[0], skip_special_tokens=True)
  print(directions)
- Saved Model
- Location: ./t5_recipe_finetuned_fp16
- Size: ~425 MB (FP16 weights)
- Limitations and Biases
- Data Quality: Some RecipeNLG entries have mismatched ingredients and directions, potentially leading to nonsensical outputs.
- Scope: Trained only on English recipes; may not handle non-English inputs or exotic cuisines well.
- Bias: Reflects biases in RecipeNLG (e.g., Western cuisine dominance).
- Quantization: FP16 may introduce minor numerical differences vs. FP32, though mitigated by FP16 training.
- Ethical Considerations
- Use: Should not be used to replace professional culinary expertise without validation.
- Safety: Generated directions aren’t guaranteed to be safe or accurate (e.g., cooking times, temperatures).
- Contact
- Author: [Your Name/Group Name]
- Support: [Your Email/GitHub, if applicable]
- Citation
- If you use this model, please cite:
-
- RecipeNLG dataset: [Add citation if available]
- T5: Raffel et al., "Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer" (2020)
 
  # Training Procedure
  **Framework:** Hugging Face Transformers
  **Hyperparameters:**
+ - Epochs: 2
+ - Effective Batch Size: 32 (8 per device, 4 gradient accumulation steps)
+ - Learning Rate: 2e-5
+ - Optimizer: AdamW
+ - Mixed Precision: FP16 (fp16=True)
+ - Training Time: ~12 hours estimated for subset (1 epoch); full dataset (3 epochs) estimated at ~68 hours per epoch without optimization.
+ - Compute: Single 12 GB GPU (NVIDIA, CUDA-enabled).
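
The hyperparameter list above maps onto a `TrainingArguments`/`Trainer` setup roughly like the sketch below. This is a minimal, illustrative sketch rather than the actual training script: the base checkpoint (`t5-base`), the toy dataset rows, and the ingredients-to-directions formatting are assumptions not stated in this excerpt, and `fp16=True` assumes the CUDA GPU described above.

```python
from datasets import Dataset
from transformers import (
    DataCollatorForSeq2Seq,
    T5ForConditionalGeneration,
    T5TokenizerFast,
    Trainer,
    TrainingArguments,
)

# Assumption: the base checkpoint is t5-base; the card does not name it in this excerpt.
model_name = "t5-base"
tokenizer = T5TokenizerFast.from_pretrained(model_name)
model = T5ForConditionalGeneration.from_pretrained(model_name)

# Tiny stand-in for the tokenized RecipeNLG splits (hypothetical rows and formatting).
raw = Dataset.from_dict({
    "ingredients": ["chicken, rice, onion", "flour, sugar, eggs"],
    "directions": [
        "Cook the rice. Saute the onion, add the chicken, and simmer.",
        "Mix flour, sugar, and eggs. Bake until golden.",
    ],
})

def tokenize(batch):
    enc = tokenizer(batch["ingredients"], truncation=True, max_length=128)
    enc["labels"] = tokenizer(text_target=batch["directions"], truncation=True, max_length=256)["input_ids"]
    return enc

train_dataset = raw.map(tokenize, batched=True, remove_columns=raw.column_names)

# Mirrors the card: 2 epochs, effective batch size 32 (8 per device x 4 accumulation steps),
# learning rate 2e-5, FP16 mixed precision, AdamW (the Trainer default optimizer).
args = TrainingArguments(
    output_dir="./t5_recipe_finetuned_fp16",
    num_train_epochs=2,
    per_device_train_batch_size=8,
    gradient_accumulation_steps=4,
    learning_rate=2e-5,
    fp16=True,  # requires a CUDA device, per the Compute entry above
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()
```

Gradient accumulation (8 × 4) recovers the effective batch size of 32 on a single 12 GB card, and the FP16 flag matches the mixed-precision entry above.
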
+ # Evaluation
+ - Metrics: Loss (to be filled post-training)
+ - Validation Loss: [TBD after training]
+ - Test Loss: [TBD after evaluation]
+ - Method: Evaluated using Trainer.evaluate() on validation and test splits.
+ - Qualitative: Generated directions checked for coherence with input ingredients (e.g., chicken and rice input should yield relevant steps).
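
The loss-based evaluation described above would be driven by `Trainer.evaluate()` roughly as follows. This is a sketch that assumes the `trainer` object from the training sketch plus pre-tokenized `val_dataset` and `test_dataset` splits; those variable names are placeholders, not names from the card.

```python
# Assumes `trainer` from the training sketch and tokenized `val_dataset` / `test_dataset` splits.
val_metrics = trainer.evaluate(eval_dataset=val_dataset)
test_metrics = trainer.evaluate(eval_dataset=test_dataset, metric_key_prefix="test")

# The returned dicts hold the losses that would fill the TBD fields above
# ("eval_loss" by default, "test_loss" with the custom metric_key_prefix).
print(f"Validation Loss: {val_metrics['eval_loss']:.4f}")
print(f"Test Loss: {test_metrics['test_loss']:.4f}")
```
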
+ # Performance
+ - Results: [TBD; e.g., "Validation Loss: X.XX, Test Loss: Y.YY after 1 epoch on subset"]
+ - Strengths: Expected to generate plausible directions for common ingredient combinations.
+ # Limitations
+ - Limited training on subset may reduce generalization.
+ - Sporadic data mismatches may affect output quality.
+ - FP16 quantization might slightly alter precision vs. FP32.
+ # Usage
+ # Installation
+ ```bash
  pip install transformers torch datasets
+ ```
+ # Inference Example
+ ```python
  from transformers import T5Tokenizer, T5ForConditionalGeneration
  import torch

  output_ids = model.generate(input_ids, max_length=256, num_beams=4, early_stopping=True, no_repeat_ngram_size=2)
  directions = tokenizer.decode(output_ids[0], skip_special_tokens=True)
  print(directions)
+ ```