Model Details of TVL_GeneralLayerClassifier

Base Model

This model is fine-tuned from google-bert/bert-base-chinese.

Model Architecture

  • Type: BERT-based text classification model
  • Hidden Size: 768
  • Number of Layers: 12
  • Number of Attention Heads: 12
  • Intermediate Size: 3072
  • Max Sequence Length: 512
  • Vocabulary Size: 21,128
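
These values can be read directly from the published configuration; a quick sanity check (assuming the transformers library is installed):

from transformers import AutoConfig

config = AutoConfig.from_pretrained("scfengv/TVL_GeneralLayerClassifier")
print(config.hidden_size)              # 768
print(config.num_hidden_layers)        # 12
print(config.num_attention_heads)      # 12
print(config.intermediate_size)        # 3072
print(config.max_position_embeddings)  # 512
print(config.vocab_size)               # 21128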

Key Components

  1. Embeddings

    • Word Embeddings
    • Position Embeddings
    • Token Type Embeddings
    • Layer Normalization
  2. Encoder

    • 12 layers of:
      • Self-Attention Mechanism
      • Intermediate Dense Layer
      • Output Dense Layer
      • Layer Normalization
  3. Pooler

    • Dense layer with tanh activation over the [CLS] token (sentence representation)
  4. Classifier

    • Output layer with 4 classes
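
These components follow the standard BertForSequenceClassification layout; the pooler and classifier head can be inspected directly after loading the checkpoint:

from transformers import BertForSequenceClassification

model = BertForSequenceClassification.from_pretrained("scfengv/TVL_GeneralLayerClassifier")
print(model.bert.pooler)  # Dense(768 -> 768) + Tanh over the [CLS] token
print(model.classifier)   # Linear(in_features=768, out_features=4)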

Training Hyperparameters

The model was trained using the following hyperparameters:

  • Learning rate: 1e-05
  • Batch size: 32
  • Number of epochs: 10
  • Optimizer: Adam
  • Loss function: torch.nn.BCEWithLogitsLoss()
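
A minimal sketch of how these settings fit together in a PyTorch fine-tuning loop; the toy texts and multi-hot labels below are placeholders, not the actual TVL training data:

import torch
from torch.utils.data import DataLoader, TensorDataset
from transformers import BertForSequenceClassification, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("google-bert/bert-base-chinese")
model = BertForSequenceClassification.from_pretrained(
    "google-bert/bert-base-chinese",
    num_labels=4,
    problem_type="multi_label_classification",
)

# Placeholder data standing in for the TVL general-layer training set
texts = ["範例文字一", "範例文字二"]
labels = torch.tensor([[1., 0., 0., 0.], [0., 1., 1., 0.]])  # multi-hot, 4 classes
enc = tokenizer(texts, padding=True, truncation=True, max_length=512, return_tensors="pt")
loader = DataLoader(TensorDataset(enc["input_ids"], enc["attention_mask"], labels),
                    batch_size=32, shuffle=True)

optimizer = torch.optim.Adam(model.parameters(), lr=1e-5)
criterion = torch.nn.BCEWithLogitsLoss()

model.train()
for epoch in range(10):
    for input_ids, attention_mask, y in loader:
        optimizer.zero_grad()
        logits = model(input_ids=input_ids, attention_mask=attention_mask).logits
        loss = criterion(logits, y)
        loss.backward()
        optimizer.step()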

Training Infrastructure

  • Hardware Type: NVIDIA Quadro RTX8000
  • Library: PyTorch
  • Hours used: 2hr 56mins

Model Parameters

  • Total parameters: ~102M (estimated)
  • All parameters are in 32-bit floating point (F32) format
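
The parameter count and dtype can be verified from the loaded checkpoint (a rough check, not an official figure):

from transformers import BertForSequenceClassification

model = BertForSequenceClassification.from_pretrained("scfengv/TVL_GeneralLayerClassifier")
total = sum(p.numel() for p in model.parameters())
print(f"{total / 1e6:.1f}M parameters")  # ~102M
print(next(model.parameters()).dtype)    # torch.float32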

Input Processing

  • Uses BERT tokenization
  • Supports sequences up to 512 tokens
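
A short tokenization example (the sample sentence is illustrative only); Chinese BERT tokenizes largely at the character level:

from transformers import BertTokenizer

tokenizer = BertTokenizer.from_pretrained("scfengv/TVL_GeneralLayerClassifier")
enc = tokenizer("這是一個測試句子", truncation=True, max_length=512)
print(tokenizer.convert_ids_to_tokens(enc["input_ids"]))  # ['[CLS]', '這', '是', ..., '[SEP]']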

Output

  • 4-class multi-label classification
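
Each of the 4 logits is scored independently, so a text can belong to several classes at once. A common post-processing step (the 0.5 threshold is an assumption, not prescribed by this card) is:

import torch

probs = torch.tensor([[0.91, 0.08, 0.62, 0.30]])  # example sigmoid outputs for one text
print((probs > 0.5).int())                        # tensor([[1, 0, 1, 0]])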

Performance Metrics (validation)

  • Accuracy score: 0.952902
  • F1 score (Micro): 0.968717
  • F1 score (Macro): 0.970818
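
These are standard multi-label metrics; a sketch of how they can be computed with scikit-learn (the arrays below are hypothetical multi-hot labels, not the released evaluation data; note that accuracy_score on multi-label inputs is exact-match accuracy, and the card does not state which accuracy definition was used):

import numpy as np
from sklearn.metrics import accuracy_score, f1_score

y_true = np.array([[1, 0, 1, 0], [0, 1, 0, 0]])  # hypothetical ground truth
y_pred = np.array([[1, 0, 1, 0], [0, 1, 1, 0]])  # hypothetical thresholded predictions
print(accuracy_score(y_true, y_pred))             # exact-match accuracy
print(f1_score(y_true, y_pred, average="micro"))
print(f1_score(y_true, y_pred, average="macro"))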

Training Dataset

This model was trained on the scfengv/TVL-general-layer-dataset.
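
The dataset is hosted on the Hugging Face Hub and can presumably be loaded with the datasets library (split names are not stated here, so only the overall structure is printed):

from datasets import load_dataset

ds = load_dataset("scfengv/TVL-general-layer-dataset")
print(ds)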

Testing Dataset

Usage

import torch
from transformers import BertForSequenceClassification, BertTokenizer

model = BertForSequenceClassification.from_pretrained("scfengv/TVL_GeneralLayerClassifier")
tokenizer = BertTokenizer.from_pretrained("scfengv/TVL_GeneralLayerClassifier")
model.eval()  # disable dropout for inference

# Prepare your text
text = "Your text here" ## Please refer to Dataset
inputs = tokenizer(text, return_tensors = "pt", padding = True, truncation = True, max_length = 512)

# Make prediction
with torch.no_grad():
    outputs = model(**inputs)
    predictions = torch.sigmoid(outputs.logits)

# Print predictions
print(predictions)

Additional Notes

  • This model is specifically designed for TVL general layer classification tasks.

  • It is based on the Chinese BERT base model (google-bert/bert-base-chinese), so it is intended for Chinese-language text.
