Model Card: T5-Base Fine-Tuned for Recipe Direction Generation (FP16)

Model Overview

  • Model Name: t5-base-recipe-finetuned-fp16
  • Model Type: Sequence-to-Sequence Transformer
  • Base Model: google/t5-base (220M parameters)
  • Quantization: FP16 (half-precision floating-point)
  • Task: Generate cooking directions from a list of ingredients

Intended Use

This model generates step-by-step cooking directions from a list of ingredients. It is intended for:

  • Recipe creation assistance
  • Educational purposes in culinary AI research
  • Exploration of text-to-text generation in domain-specific tasks

Primary Users: Home cooks, recipe developers, and AI researchers.

Model Details

  • Architecture: T5 (Text-to-Text Transfer Transformer), an encoder-decoder Transformer with 12 layers each in the encoder and decoder, a hidden size of 768, and 12 attention heads.
  • Input: Text string in the format "generate recipe directions from ingredients: ...".
  • Output: Text string containing cooking directions.
  • Quantization: Converted to FP16, roughly halving memory usage (~425 MB vs. ~850 MB in FP32) and speeding up inference on GPU (see the size check and conversion sketch after this list).
  • Hardware: Fine-tuned and tested on a 12 GB NVIDIA GPU with CUDA.
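
As a rough size check, ~220M parameters at 2 bytes per FP16 weight come to roughly 440 MB, versus roughly 880 MB at 4 bytes in FP32, which is consistent with the figures above. The conversion step can look like the sketch below; this is a hedged illustration, and the full-precision checkpoint path "./t5_recipe_finetuned" is an assumed name that does not appear elsewhere in this card.

```python
# Sketch: cast a fine-tuned FP32 checkpoint to FP16 and save it.
# "./t5_recipe_finetuned" is a hypothetical path for the full-precision
# checkpoint; only "./t5_recipe_finetuned_fp16" is referenced in this card.
from transformers import T5ForConditionalGeneration, T5Tokenizer

fp32_path = "./t5_recipe_finetuned"
fp16_path = "./t5_recipe_finetuned_fp16"

model = T5ForConditionalGeneration.from_pretrained(fp32_path)
tokenizer = T5Tokenizer.from_pretrained(fp32_path)

model = model.half()  # cast every weight to FP16 (~2 bytes per parameter)
model.save_pretrained(fp16_path)
tokenizer.save_pretrained(fp16_path)
```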

Training Data

  • Dataset: RecipeNLG
  • Source: Publicly available recipe dataset (downloaded as CSV)
  • Size: 2,231,142 examples (original); subset of 178,491 used for training (10% of train split)

Splits:

  • Train: 178,491 examples (subset)
  • Validation: 223,114 examples
  • Test: 223,115 examples
  • Attributes: ingredients (stringified list of ingredients), directions (stringified list of steps)
  • Preprocessing: Converted the stringified lists to plain text; each input was prefixed with "generate recipe directions from ingredients: " (a sketch follows this list).
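
The split sizes and preprocessing above roughly correspond to an 80/10/10 split of the CSV plus a text-conversion step, sketched below under those assumptions; the file name, random seed, and helper names are illustrative rather than taken from the original pipeline.

```python
# Sketch of the data preparation described above. Column names ("ingredients",
# "directions") match the attributes listed in this card; "full_dataset.csv"
# and the seed are assumptions.
import ast
from datasets import load_dataset

PREFIX = "generate recipe directions from ingredients: "

def to_text(example):
    # RecipeNLG stores lists as strings, e.g. '["1 lb chicken", "2 cups rice"]'
    ingredients = ast.literal_eval(example["ingredients"])
    directions = ast.literal_eval(example["directions"])
    return {
        "input_text": PREFIX + " ".join(ingredients),
        "target_text": " ".join(directions),
    }

dataset = load_dataset("csv", data_files="full_dataset.csv")["train"]
splits = dataset.train_test_split(test_size=0.2, seed=42)            # 80% train / 20% held out
held_out = splits["test"].train_test_split(test_size=0.5, seed=42)   # split held-out into val/test

train = splits["train"].shuffle(seed=42).select(range(178_491)).map(to_text)  # ~10% of the train split
validation = held_out["train"].map(to_text)
test = held_out["test"].map(to_text)
```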

Training Procedure

Training Procedure

  • Framework: Hugging Face Transformers (Trainer API; a configuration sketch follows the Performance list below)
  • Epochs: 1 (subset training; full training planned for 3 epochs)
  • Effective Batch Size: 32 (8 per device, 4 gradient accumulation steps)
  • Learning Rate: 2e-5
  • Optimizer: AdamW (Trainer default)
  • Mixed Precision: FP16 (fp16=True)
  • Training Time: ~2.3 hours estimated for the subset (1 epoch); full dataset estimated at ~68 hours per epoch without optimization
  • Compute: Single 12 GB NVIDIA GPU (CUDA-enabled)

Evaluation

  • Metrics: Loss (to be filled post-training)
  • Validation Loss: [TBD after training]
  • Test Loss: [TBD after evaluation]
  • Method: Evaluated with Trainer.evaluate() on the validation and test splits
  • Qualitative: Generated directions checked for coherence with the input ingredients (e.g., a chicken-and-rice input should yield relevant steps)

Performance

  • Results: [TBD; e.g., "Validation Loss: X.XX, Test Loss: Y.YY after 1 epoch on subset"]
  • Strengths: Expected to generate plausible directions for common ingredient combinations
  • Limitations: Limited training on a subset may reduce generalization; sporadic data mismatches may affect output quality; FP16 quantization might slightly alter precision vs. FP32
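
The hyperparameters and evaluation method above correspond to a Trainer setup roughly like the following. This is a minimal sketch, not the exact training script: it assumes the train/validation/test splits from the preprocessing sketch are in scope, the tokenization lengths mirror the inference example, and the output_dir name is illustrative.

```python
# Minimal sketch of a Trainer configuration matching the stated hyperparameters.
# `train`, `validation`, and `test` are the text datasets from the preprocessing
# sketch above; output_dir is an assumed name.
from transformers import T5ForConditionalGeneration, T5Tokenizer, Trainer, TrainingArguments

tokenizer = T5Tokenizer.from_pretrained("t5-base")   # base checkpoint (~220M parameters)
model = T5ForConditionalGeneration.from_pretrained("t5-base")

def tokenize(batch):
    model_inputs = tokenizer(batch["input_text"], max_length=128,
                             truncation=True, padding="max_length")
    labels = tokenizer(batch["target_text"], max_length=256,
                       truncation=True, padding="max_length")
    # Mask padding in the labels so it does not contribute to the loss
    model_inputs["labels"] = [
        [tok if tok != tokenizer.pad_token_id else -100 for tok in seq]
        for seq in labels["input_ids"]
    ]
    return model_inputs

train_tok = train.map(tokenize, batched=True)
val_tok = validation.map(tokenize, batched=True)
test_tok = test.map(tokenize, batched=True)

args = TrainingArguments(
    output_dir="./t5_recipe_finetuned",
    num_train_epochs=1,                 # subset run; 3 epochs planned for full training
    per_device_train_batch_size=8,
    gradient_accumulation_steps=4,      # effective batch size 32
    learning_rate=2e-5,
    fp16=True,                          # mixed-precision training
)

trainer = Trainer(model=model, args=args,
                  train_dataset=train_tok, eval_dataset=val_tok)
trainer.train()

val_metrics = trainer.evaluate()                        # validation loss
test_metrics = trainer.evaluate(eval_dataset=test_tok)  # test loss
```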

Usage

Installation

```bash
pip install transformers torch datasets
```

Inference Example

```python
from transformers import T5Tokenizer, T5ForConditionalGeneration
import torch

# Load the FP16 checkpoint and move it to the GPU
model_path = "./t5_recipe_finetuned_fp16"
tokenizer = T5Tokenizer.from_pretrained(model_path)
model = T5ForConditionalGeneration.from_pretrained(model_path).to("cuda").half()

# Build the prefixed input string from a list of ingredients
ingredients = ["1 lb chicken breast", "2 cups rice", "1 onion", "2 tbsp soy sauce"]
input_text = "generate recipe directions from ingredients: " + " ".join(ingredients)
input_ids = tokenizer(input_text, return_tensors="pt", max_length=128, truncation=True).input_ids.to("cuda")

# Generate directions with beam search
model.eval()
with torch.no_grad():
    output_ids = model.generate(input_ids, max_length=256, num_beams=4,
                                early_stopping=True, no_repeat_ngram_size=2)
directions = tokenizer.decode(output_ids[0], skip_special_tokens=True)
print(directions)
```

Saved Model

  • Location: ./t5_recipe_finetuned_fp16
  • Size: ~425 MB (FP16 weights)

Limitations and Biases

  • Data Quality: Some RecipeNLG entries have mismatched ingredients and directions, which can lead to nonsensical outputs.
  • Scope: Trained only on English recipes; may not handle non-English inputs or less common cuisines well.
  • Bias: Reflects biases in RecipeNLG (e.g., the dominance of Western cuisine).
  • Quantization: FP16 may introduce minor numerical differences vs. FP32, though this is mitigated by training in FP16.

Ethical Considerations

  • Use: Should not replace professional culinary expertise without validation.
  • Safety: Generated directions are not guaranteed to be safe or accurate (e.g., cooking times and temperatures).

Contact

  • Author: [Your Name/Group Name]
  • Support: [Your Email/GitHub, if applicable]

Citation

If you use this model, please cite:

  • RecipeNLG dataset: [Add citation if available]
  • T5: Raffel et al., "Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer" (2020)