
Model Card: T5-Base Fine-Tuned for Recipe Direction Generation (FP16)

Model Overview

  • Model Name: t5-base-recipe-finetuned-fp16
  • Model Type: Sequence-to-Sequence Transformer
  • Base Model: google/t5-base (220M parameters)
  • Quantization: FP16 (half-precision floating-point)
  • Task: Generate cooking directions from a list of ingredients

Intended Use

This model generates step-by-step cooking directions given a list of ingredients. It is intended for:

  • Recipe creation assistance.
  • Educational purposes in culinary AI research.
  • Exploration of text-to-text generation in domain-specific tasks.

Primary Users: Home cooks, recipe developers, AI researchers.

Model Details

  • Architecture: T5 (Text-to-Text Transfer Transformer), an encoder-decoder Transformer with 12 encoder and 12 decoder layers, a hidden size of 768, and 12 attention heads.
  • Input: Text string in the format "generate recipe directions from ingredients: ...".
  • Output: Text string containing cooking directions.
  • Quantization: Converted to FP16 for reduced memory usage (~425 MB vs. ~850 MB in FP32) and faster inference on GPU.
  • Hardware: Fine-tuned and tested on a 12 GB NVIDIA GPU with CUDA.
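
The FP16 conversion can be reproduced with a short script along these lines; this is a minimal sketch, and the checkpoint paths are illustrative placeholders rather than the exact ones used to produce this repository.

from transformers import T5ForConditionalGeneration, T5Tokenizer

fp32_path = "./t5_recipe_finetuned"       # fine-tuned FP32 checkpoint (placeholder path)
fp16_path = "./t5_recipe_finetuned_fp16"  # half-precision output directory (placeholder path)

model = T5ForConditionalGeneration.from_pretrained(fp32_path)
tokenizer = T5Tokenizer.from_pretrained(fp32_path)

model = model.half()              # cast all weights to FP16, roughly halving on-disk size
model.save_pretrained(fp16_path)
tokenizer.save_pretrained(fp16_path)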

Training Data

  • Dataset: RecipeNLG
  • Source: Publicly available recipe dataset (downloaded as CSV)
  • Size: 2,231,142 examples in the original dataset; a subset of 178,491 examples (10% of the train split) was used for fine-tuning
  • Attributes: ingredients (list of ingredients), directions (list of steps)
  • Preprocessing: Stringified lists converted to plain text; inputs prefixed with "generate recipe directions from ingredients: " (a preprocessing sketch follows this section).

Splits:

  • Train: 178,491 examples (subset used for fine-tuning)
  • Validation: 223,114 examples
  • Test: 223,115 examples
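
A minimal preprocessing sketch consistent with the description above, assuming the CSV columns hold stringified Python lists; the function and field names are illustrative, not the exact script used.

import ast

PREFIX = "generate recipe directions from ingredients: "

def preprocess(example):
    # RecipeNLG stores ingredients and directions as stringified Python lists.
    ingredients = ast.literal_eval(example["ingredients"])
    directions = ast.literal_eval(example["directions"])
    example["input_text"] = PREFIX + " ".join(ingredients)
    example["target_text"] = " ".join(directions)
    return example

# With a Hugging Face `datasets` Dataset, this can be applied per example:
# dataset = dataset.map(preprocess)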

Training Procedure

Framework: Hugging Face Transformers

Hyperparameters:

  • Epochs: 2
  • Effective Batch Size: 32 (8 per device, 4 gradient accumulation steps)
  • Learning Rate: 2e-5
  • Optimizer: AdamW
  • Mixed Precision: FP16 (fp16=True)
  • Training Time: ~12 hours estimated for one epoch on the subset; training on the full dataset (3 epochs) was estimated at ~68 hours per epoch without further optimization.
  • Compute: Single 12 GB GPU (NVIDIA, CUDA-enabled).
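
A sketch of a Trainer setup matching these hyperparameters; the output directory and the tokenized dataset variables (train_ds, val_ds) are placeholders, not the exact training script.

from transformers import (T5ForConditionalGeneration, T5Tokenizer,
                          DataCollatorForSeq2Seq, Seq2SeqTrainer, Seq2SeqTrainingArguments)

model = T5ForConditionalGeneration.from_pretrained("google/t5-base")
tokenizer = T5Tokenizer.from_pretrained("google/t5-base")

args = Seq2SeqTrainingArguments(
    output_dir="./t5_recipe_finetuned",  # placeholder path
    num_train_epochs=2,
    per_device_train_batch_size=8,
    gradient_accumulation_steps=4,       # effective batch size of 32
    learning_rate=2e-5,
    fp16=True,                           # mixed-precision training
    evaluation_strategy="epoch",
    save_strategy="epoch",
)

trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=train_ds,              # tokenized training subset (placeholder name)
    eval_dataset=val_ds,                 # tokenized validation split (placeholder name)
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()                          # AdamW is the Trainer's default optimizer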

Evaluation

  • Metrics: Loss (to be filled post-training)
  • Validation Loss: [TBD after training]
  • Test Loss: [TBD after evaluation]
  • Method: Evaluated using Trainer.evaluate() on validation and test splits.
  • Qualitative: Generated directions checked for coherence with input ingredients (e.g., chicken and rice input should yield relevant steps).
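
Loss on the held-out splits can be computed with Trainer.evaluate(); the variable names follow the training sketch above, and test_ds is a placeholder for the tokenized test split.

val_metrics = trainer.evaluate(eval_dataset=val_ds)
test_metrics = trainer.evaluate(eval_dataset=test_ds, metric_key_prefix="test")
print("Validation loss:", val_metrics["eval_loss"])
print("Test loss:", test_metrics["test_loss"])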

Performance

  • Results: [TBD; e.g., "Validation Loss: X.XX, Test Loss: Y.YY after 1 epoch on subset"]
  • Strengths: Expected to generate plausible directions for common ingredient combinations.

Limitations:

  • Limited training on subset may reduce generalization.
  • Sporadic data mismatches may affect output quality.
  • FP16 quantization might slightly alter precision vs. FP32.

Usage

Installation

pip install transformers torch datasets sentencepiece

Inference Example

from transformers import T5Tokenizer, T5ForConditionalGeneration
import torch

# Load the fine-tuned FP16 checkpoint; fall back to CPU/FP32 if no GPU is available.
model_path = "./t5_recipe_finetuned_fp16"
device = "cuda" if torch.cuda.is_available() else "cpu"
tokenizer = T5Tokenizer.from_pretrained(model_path)
model = T5ForConditionalGeneration.from_pretrained(model_path).to(device)
if device == "cuda":
    model = model.half()   # keep weights in FP16 on GPU
else:
    model = model.float()  # FP16 inference is not well supported on CPU

# Build the prefixed input string the model was trained on.
ingredients = ["1 lb chicken breast", "2 cups rice", "1 onion", "2 tbsp soy sauce"]
input_text = "generate recipe directions from ingredients: " + " ".join(ingredients)
input_ids = tokenizer(input_text, return_tensors="pt", max_length=128, truncation=True).input_ids.to(device)

# Generate directions with beam search.
model.eval()
with torch.no_grad():
    output_ids = model.generate(input_ids, max_length=256, num_beams=4, early_stopping=True, no_repeat_ngram_size=2)
directions = tokenizer.decode(output_ids[0], skip_special_tokens=True)
print(directions)
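
The example uses beam search (num_beams=4) with no_repeat_ngram_size=2 to reduce repeated phrases; these generate() arguments can be swapped for greedy decoding or sampling (do_sample=True) depending on the desired output style.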