# T5-Base Quantized Model for Movie Script Writing

This repository hosts a quantized version of the T5-Base model, fine-tuned for movie script writing. The model is optimized for efficient deployment while maintaining high accuracy, making it suitable for resource-constrained environments such as mobile and edge devices.

## Model Details

- **Model Architecture:** T5-Base
- **Task:** Movie Script Writing
- **Dataset:** bookcorpus
- **Quantization:** Float16 (FP16)
- **Fine-tuning Framework:** Hugging Face Transformers
- **Inference Framework:** PyTorch

## Usage

### Installation

```bash
pip install transformers torch
```

### Loading the Model

```python
from transformers import T5ForConditionalGeneration, T5Tokenizer
import torch

# Load the quantized model and tokenizer
quantized_model_path = "path/to/t5_finetuned_fp16"
model = T5ForConditionalGeneration.from_pretrained(quantized_model_path, torch_dtype=torch.float16)
tokenizer = T5Tokenizer.from_pretrained(quantized_model_path)

# Move the model to the available device once, at load time
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)
model.eval()


def generate_script(prompt):
    inputs = tokenizer(
        f"Generate a movie script: {prompt}",
        return_tensors="pt",
        truncation=True,
        padding="max_length",
        max_length=256,
    )
    inputs = {key: value.to(device) for key, value in inputs.items()}  # Move inputs to the model's device

    with torch.no_grad():
        outputs = model.generate(**inputs, max_length=256, num_return_sequences=1)

    return tokenizer.decode(outputs[0], skip_special_tokens=True)


# Test the script generator
prompt = "SCENE: EXT. DARK ALLEY - NIGHT"
print(generate_script(prompt))
```


## Performance Metrics

- **Accuracy:** 0.82  
- **Inference Speed:** Faster than the FP32 baseline due to FP16 quantization  
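
The speedup can be sanity-checked locally. Below is a rough latency benchmark sketch; the run count, prompt, and `max_length` are arbitrary choices, not values from the original evaluation:

```python
import time
import torch

def benchmark(model, tokenizer, prompt, runs=20):
    device = next(model.parameters()).device
    inputs = tokenizer(prompt, return_tensors="pt").to(device)

    # Warm-up run so one-time initialization is not timed
    with torch.no_grad():
        model.generate(**inputs, max_length=64)

    start = time.perf_counter()
    for _ in range(runs):
        with torch.no_grad():
            model.generate(**inputs, max_length=64)
    return (time.perf_counter() - start) / runs

# Example usage, reusing the model and tokenizer loaded above:
# avg = benchmark(model, tokenizer, "Generate a movie script: SCENE: EXT. DARK ALLEY - NIGHT")
# print(f"Average generation latency: {avg:.3f}s")
```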

## Fine-Tuning Details

### Dataset

The model was fine-tuned on the `bookcorpus` dataset, as listed in the Model Details section above.
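
For reference, the dataset can be pulled from the Hugging Face Hub with the `datasets` library; the inspection steps below are illustrative, not part of the original fine-tuning pipeline:

```python
from datasets import load_dataset

# Load the BookCorpus training split from the Hugging Face Hub
dataset = load_dataset("bookcorpus", split="train")

# Each record exposes a single "text" field
print(dataset[0]["text"])
print(f"Total records: {len(dataset):,}")
```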
### Training Configuration

- **Number of epochs:** 3  
- **Batch size:** 8  
- **Evaluation strategy:** Per epoch  
- **Learning rate:** 2e-5  
- **Optimizer:** AdamW  
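
A minimal sketch of a matching fine-tuning setup with the Hugging Face `Trainer` API is shown below. The BookCorpus slice, the reconstruct-the-text labeling, and the output path are illustrative assumptions, not the exact original pipeline; the hyperparameters mirror the list above, and `Trainer` uses AdamW as its default optimizer:

```python
from datasets import load_dataset
from transformers import (
    T5ForConditionalGeneration,
    T5Tokenizer,
    Trainer,
    TrainingArguments,
)

model = T5ForConditionalGeneration.from_pretrained("t5-base")
tokenizer = T5Tokenizer.from_pretrained("t5-base")

# Assumption: a small BookCorpus slice, tokenized so the model
# learns to reconstruct the text (inputs double as labels)
raw = load_dataset("bookcorpus", split="train[:1%]")

def tokenize(batch):
    enc = tokenizer(batch["text"], truncation=True, padding="max_length", max_length=256)
    enc["labels"] = enc["input_ids"].copy()
    return enc

data = raw.map(tokenize, batched=True, remove_columns=["text"]).train_test_split(test_size=0.1)

# Hyperparameters mirror the configuration listed above
training_args = TrainingArguments(
    output_dir="./t5_movie_script",
    num_train_epochs=3,
    per_device_train_batch_size=8,
    evaluation_strategy="epoch",
    learning_rate=2e-5,
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=data["train"],
    eval_dataset=data["test"],
)
trainer.train()
```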

### Quantization

The model is quantized using **Post-Training Quantization (PTQ)** with **Float16 (FP16)**, which reduces model size and improves inference efficiency while maintaining accuracy.
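
A minimal sketch of the FP16 post-training conversion, assuming the fine-tuned checkpoint lives at a hypothetical `./t5_movie_script` path:

```python
import torch
from transformers import T5ForConditionalGeneration

# Load the fine-tuned FP32 checkpoint
model = T5ForConditionalGeneration.from_pretrained("./t5_movie_script")

# Convert all weights to half precision (FP16 post-training quantization)
model = model.half()

# Save the quantized model for deployment
model.save_pretrained("./t5_finetuned_fp16")
```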

## Repository Structure

```
.
├── model/               # Contains the quantized model files
├── tokenizer_config/    # Tokenizer configuration and vocabulary files
├── model.safetensors    # Fine-tuned and quantized model weights
└── README.md            # Model documentation
```


## Limitations

- The model is optimized for English-language movie script generation; performance on other generation tasks may vary.
- While quantization improves speed, minor accuracy degradation may occur.
- Performance on out-of-distribution text (e.g., highly technical or domain-specific data) may be limited.

## Contributing

Contributions are welcome! Feel free to open an issue or submit a pull request if you have suggestions or improvements.