Model Card: T5-Base Fine-Tuned for Recipe Direction Generation (FP16)

Model Overview

Model Name: t5-base-recipe-finetuned-fp16
Model Type: Sequence-to-Sequence Transformer
Base Model: google/t5-base (220M parameters)
Precision: FP16 (half-precision floating-point weights, cast from FP32)
Task: Generate cooking directions from a list of ingredients

Intended Use

This model is designed to generate step-by-step cooking directions given a list of ingredients. It’s intended for:
Recipe creation assistance.
Educational purposes in culinary AI research.
Exploration of text-to-text generation in domain-specific tasks.
Primary Users: Home cooks, recipe developers, AI researchers.

Model Details

Architecture: T5 (Text-to-Text Transfer Transformer), an encoder-decoder Transformer with 12 layers each in the encoder and the decoder, a hidden size of 768, and 12 attention heads.
Input: Text string in the format "generate recipe directions from ingredients: ...".
Output: Text string containing cooking directions.
Precision: Weights converted to FP16 for reduced memory usage (~425 MB vs. ~850 MB in FP32) and faster inference on GPU.
Hardware: Fine-tuned and tested on a 12 GB NVIDIA GPU with CUDA.
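
The memory figures above follow from the bytes stored per parameter. A minimal sketch of that arithmetic, assuming the 220M parameter count quoted in this card and counting weights only (no activations, optimizer state, or file overhead); the helper name weight_size_mib is illustrative:

```python
# Rough weight-memory estimate from a parameter count.
# 220M is the parameter count quoted in this card; real checkpoints differ slightly.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2}

def weight_size_mib(num_params: int, dtype: str) -> float:
    """Return the size of the weights alone, in MiB (1 MiB = 2**20 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 2**20

num_params = 220_000_000
fp32_mib = weight_size_mib(num_params, "fp32")
fp16_mib = weight_size_mib(num_params, "fp16")
print(f"FP32: ~{fp32_mib:.0f} MiB, FP16: ~{fp16_mib:.0f} MiB")
```

The result lands in the same range as the sizes quoted above; the exact on-disk size depends on the true parameter count and the checkpoint format.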

Training Data

Dataset: RecipeNLG
Source: Publicly available recipe dataset (downloaded as CSV)
Size: 2,231,142 examples (original); subset of 178,491 used for training (10% of train split)
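
The subset size can be checked against the split sizes listed in this section. A small sketch of the arithmetic, assuming the 10% subset was taken by integer truncation (the card does not state the rounding rule):

```python
# Split sizes as reported in this card.
total_examples = 2_231_142
validation_size = 223_114
test_size = 223_115

# The full train split is whatever remains after validation and test.
train_full = total_examples - validation_size - test_size

# 10% of the train split, truncated to a whole number of examples (assumption).
train_subset = train_full // 10

print(train_full, train_subset)
```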

Splits:

Train: 178,491 examples (subset)
Validation: 223,114 examples
Test: 223,115 examples
Attributes: ingredients (list of ingredients), directions (list of steps)
Preprocessing: Converted stringified lists to text; input prefixed with "generate recipe directions from ingredients: ".
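
The preprocessing step could look roughly like the sketch below. It assumes each CSV cell holds a stringified Python/JSON-style list and that items are joined with single spaces, as in the inference example; the helper names and the sample row are hypothetical, not taken from the dataset.

```python
import ast
from typing import Dict, List

PREFIX = "generate recipe directions from ingredients: "

def parse_list(cell: str) -> List[str]:
    """Parse a stringified list such as '["a", "b"]' into a list of strings."""
    return [str(item).strip() for item in ast.literal_eval(cell)]

def build_example(ingredients_cell: str, directions_cell: str) -> Dict[str, str]:
    """Turn one CSV row into a prefixed input string and a target string."""
    ingredients = parse_list(ingredients_cell)
    directions = parse_list(directions_cell)
    return {
        "input_text": PREFIX + " ".join(ingredients),
        "target_text": " ".join(directions),
    }

# Hypothetical row, for illustration only.
row = {
    "ingredients": '["1 lb chicken breast", "2 cups rice"]',
    "directions": '["Cook the rice.", "Brown the chicken."]',
}
example = build_example(row["ingredients"], row["directions"])
print(example["input_text"])
print(example["target_text"])
```

ast.literal_eval is used here because it parses list literals without executing arbitrary code; a different separator (for example ", ") may be preferable if it matches how the training inputs were actually built.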

Training Procedure

Framework: Hugging Face Transformers

Hyperparameters:

Epochs: 2
Effective Batch Size: 32 (8 per device, 4 gradient accumulation steps)
Learning Rate: 2e-5
Optimizer: AdamW
Mixed Precision: FP16 (fp16=True)
Training Time: ~12 hours per epoch on the subset (estimated); a 3-epoch run on the full dataset is estimated at ~68 hours per epoch without optimization.
Compute: Single 12 GB GPU (NVIDIA, CUDA-enabled).
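
To make the batch-size arithmetic concrete, the sketch below collects the hyperparameters listed above under the keyword names that Hugging Face TrainingArguments uses, and derives the effective batch size and an approximate optimizer step count. The single-device count and the 178,491-example subset come from this card; the step count is an estimate, since the trainer's handling of the last partial batch may shift it slightly.

```python
import math

# Hyperparameters from this card, keyed by TrainingArguments keyword names.
hparams = {
    "num_train_epochs": 2,
    "per_device_train_batch_size": 8,
    "gradient_accumulation_steps": 4,
    "learning_rate": 2e-5,
    "fp16": True,  # mixed-precision training; requires a CUDA GPU
}
num_devices = 1           # single 12 GB GPU
train_examples = 178_491  # training subset size

# Examples consumed per optimizer update.
effective_batch = (
    hparams["per_device_train_batch_size"]
    * hparams["gradient_accumulation_steps"]
    * num_devices
)

# Approximate number of optimizer updates per epoch and in total.
steps_per_epoch = math.ceil(train_examples / effective_batch)
total_steps = steps_per_epoch * hparams["num_train_epochs"]

print(effective_batch, steps_per_epoch, total_steps)
```

These values could then be passed on as keyword arguments, for example Seq2SeqTrainingArguments(output_dir="...", **hparams), where the output directory is left to the user.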

Evaluation

Metrics: Loss (to be filled post-training)
Validation Loss: [TBD after training]
Test Loss: [TBD after evaluation]
Method: Evaluated using Trainer.evaluate() on validation and test splits.
Qualitative: Generated directions checked for coherence with input ingredients (e.g., chicken and rice input should yield relevant steps).
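
Once losses are available, they can also be reported as perplexity, which is the exponential of the mean cross-entropy loss. A minimal sketch, assuming the metrics dictionary returned by Trainer.evaluate() with its default "eval_loss" key (or "test_loss" when metric_key_prefix="test" is passed); the helper name is illustrative and no real results are shown:

```python
import math

def perplexity_from_metrics(metrics: dict, key: str = "eval_loss") -> float:
    """Convert a mean cross-entropy loss (in nats) to perplexity."""
    return math.exp(metrics[key])

# Sanity check: a loss of 0 corresponds to a perplexity of exactly 1.
print(perplexity_from_metrics({"eval_loss": 0.0}))
```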

Performance

Results: [TBD; e.g., "Validation Loss: X.XX, Test Loss: Y.YY after 1 epoch on subset"]
Strengths: Expected to generate plausible directions for common ingredient combinations.

Limitations:

Limited training on subset may reduce generalization.
Sporadic mismatches in the training data may affect output quality.
FP16 weights may produce small numerical differences in outputs compared with FP32.

Usage

Installation

pip install transformers torch datasets sentencepiece

Inference Example

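from transformers import T5Tokenizer, T5ForConditionalGeneration
import torch

# Path to the fine-tuned checkpoint (T5Tokenizer requires the sentencepiece package)
model_path = "./t5_recipe_finetuned_fp16"
tokenizer = T5Tokenizer.from_pretrained(model_path)
# Load the weights, move them to the GPU, and cast to half precision
model = T5ForConditionalGeneration.from_pretrained(model_path).to("cuda").half()

ingredients = ["1 lb chicken breast", "2 cups rice", "1 onion", "2 tbsp soy sauce"]
# Use the same task prefix that was applied during training
input_text = "generate recipe directions from ingredients: " + " ".join(ingredients)
# Tokenize and move both input_ids and attention_mask to the GPU
inputs = tokenizer(input_text, return_tensors="pt", max_length=128, truncation=True).to("cuda")

model.eval()
with torch.no_grad():
    # Beam search with 4 beams; no_repeat_ngram_size=2 blocks repeated bigrams
    output_ids = model.generate(**inputs, max_length=256, num_beams=4, early_stopping=True, no_repeat_ngram_size=2)
# Decode the best beam back to text, dropping padding and end-of-sequence tokens
directions = tokenizer.decode(output_ids[0], skip_special_tokens=True)
print(directions)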