---
pipeline_tag: text-classification
language:
  - multilingual
tags:
  - linktransformer
  - transformers
  - text-classification
  - tabular-classification
---

# 96abhishekarora/kn-eng-prop-m-nm

This model is part of the LinkTransformer ecosystem. While built on a standard Hugging Face transformer, this specific instance is tailored for text classification: it classifies input sentences or paragraphs into specific categories or labels, leveraging the power of transformer architectures.

The base model for this classifier is BERT, pretrained on multilingual text.

Labels are mapped to integers as follows:

{LABEL_MAP}

For best results, append ಆಸ್ತಿ ಮಾಲೀಕನ ಹೆಸರು (Kannada for "property owner's name") to the name.
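The suffixing step above can be sketched as a small preprocessing helper; the function name and sample input are hypothetical, not part of the LinkTransformer API:

```python
# Suggested suffix from the model card ("property owner's name" in Kannada).
SUFFIX = "ಆಸ್ತಿ ಮಾಲೀಕನ ಹೆಸರು"

def add_suffix(names):
    """Append the suggested Kannada suffix to each name before classification."""
    return [f"{name} {SUFFIX}" for name in names]

# Hypothetical sample input.
prepared = add_suffix(["John Doe"])
```

The prepared strings would then go into the column passed to the classifier.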

## Usage with LinkTransformer

After installing LinkTransformer:

```bash
pip install -U linktransformer
```

Employ the model for text classification tasks:

```python
import linktransformer as lt

# df is a pandas DataFrame containing the column to classify.
df_clf_output = lt.classify_rows(df, on=["col_of_interest"], model="96abhishekarora/kn-eng-prop-m-nm")
```

## Training

### Training your own LinkTransformer Classification Model

With the provided tools, you can train a custom classification model:

```python
from linktransformer import train_clf_model

best_model_path, best_metric, label_map = train_clf_model(
    data="path_to_dataset.csv",
    model="your-model-path-or-name",
    on=["col_of_interest"],
    label_col_name="label_column_name",
    lr=5e-5,
    batch_size=16,
    epochs=3
)
```
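The returned `label_map` maps label names to the integer ids the model predicts (as described above). A minimal sketch, with hypothetical label names, of inverting it to decode integer predictions:

```python
# Hypothetical label map of the kind train_clf_model returns: name -> integer id.
label_map = {"negative": 0, "positive": 1}

# Invert the mapping to turn integer predictions back into label names.
id_to_label = {idx: name for name, idx in label_map.items()}

predictions = [1, 0, 1]  # hypothetical integer predictions
decoded = [id_to_label[p] for p in predictions]
```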

## Evaluation Results

Evaluation is typically based on metrics like accuracy, F1-score, precision, and recall.
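As an illustration of how those metrics are computed for a binary classifier (the labels and predictions below are hypothetical, not results from this model):

```python
def binary_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall, and F1 for 0/1 labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)

    accuracy = correct / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return accuracy, precision, recall, f1

# Hypothetical gold labels and predictions.
acc, prec, rec, f1 = binary_metrics([1, 0, 1, 1], [1, 0, 0, 1])
```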

## Citing & Authors

```bibtex
@misc{arora2023linktransformer,
  title={LinkTransformer: A Unified Package for Record Linkage with Transformer Language Models},
  author={Abhishek Arora and Melissa Dell},
  year={2023},
  eprint={2309.00789},
  archivePrefix={arXiv},
  primaryClass={cs.CL}
}
```