Update README.md
README.md
CHANGED
@@ -1,77 +1,114 @@
---
library_name: transformers
license: mit
base_model: FacebookAI/xlm-roberta-large
tags:
- generated_from_trainer
model-index:
- name: multiclass-classifier-patents
  results: []
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# multiclass-classifier-patents

This model is a fine-tuned version of [FacebookAI/xlm-roberta-large](https://huggingface.co/FacebookAI/xlm-roberta-large).
It achieves the following results on the evaluation set:
- Loss: 0.0067
- F1 Micro: 0.7001
- Precision Micro: 0.8337
- Recall Micro: 0.6034
- Exact Match F1: 0.5296
- Exact Match Precision: 0.5296
- Exact Match Recall: 0.5296
- Any Match F1: 0.9079
- Any Match Precision: 0.9079
- Any Match Recall: 0.9079

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 64
- eval_batch_size: 64
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 10
- mixed_precision_training: Native AMP

### Training results

| Training Loss | Epoch | Step | Validation Loss | F1 Micro | Precision Micro | Recall Micro | Exact Match F1 | Exact Match Precision | Exact Match Recall | Any Match F1 | Any Match Precision | Any Match Recall |
|:-------------:|:-----:|:-----:|:---------------:|:--------:|:---------------:|:------------:|:--------------:|:---------------------:|:------------------:|:------------:|:-------------------:|:----------------:|
| 0.01 | 1.0 | 1292 | 0.0083 | 0.5977 | 0.8265 | 0.4681 | 0.4300 | 0.4300 | 0.4300 | 0.7675 | 0.7675 | 0.7675 |
| 0.0077 | 2.0 | 2584 | 0.0074 | 0.6595 | 0.8326 | 0.5460 | 0.4879 | 0.4879 | 0.4879 | 0.8636 | 0.8636 | 0.8636 |
| 0.007 | 3.0 | 3876 | 0.0071 | 0.6829 | 0.8173 | 0.5864 | 0.5035 | 0.5035 | 0.5035 | 0.8958 | 0.8958 | 0.8958 |
| 0.0063 | 4.0 | 5168 | 0.0069 | 0.6883 | 0.8317 | 0.5871 | 0.5140 | 0.5140 | 0.5140 | 0.8956 | 0.8956 | 0.8956 |
| 0.0058 | 5.0 | 6460 | 0.0068 | 0.6957 | 0.8337 | 0.5969 | 0.5182 | 0.5182 | 0.5182 | 0.9058 | 0.9058 | 0.9058 |
| 0.0053 | 6.0 | 7752 | 0.0069 | 0.6999 | 0.8366 | 0.6017 | 0.5271 | 0.5271 | 0.5271 | 0.9082 | 0.9082 | 0.9082 |
| 0.0048 | 7.0 | 9044 | 0.0069 | 0.7046 | 0.8159 | 0.6201 | 0.5225 | 0.5225 | 0.5225 | 0.9185 | 0.9185 | 0.9185 |
| 0.0046 | 8.0 | 10336 | 0.0069 | 0.7069 | 0.8100 | 0.6271 | 0.5241 | 0.5241 | 0.5241 | 0.9196 | 0.9196 | 0.9196 |
| 0.0042 | 9.0 | 11628 | 0.0070 | 0.7064 | 0.8208 | 0.6200 | 0.5282 | 0.5282 | 0.5282 | 0.9174 | 0.9174 | 0.9174 |
| 0.004 | 10.0 | 12920 | 0.0070 | 0.7064 | 0.8184 | 0.6214 | 0.5276 | 0.5276 | 0.5276 | 0.9177 | 0.9177 | 0.9177 |

---
language:
- en
base_model:
- FacebookAI/xlm-roberta-large
pipeline_tag: text-classification
library_name: transformers
---

# Patent Classification Model

### Model Description

**multilabel_patent_classifier** is a fine-tuned [XLM-RoBERTa-large](https://huggingface.co/FacebookAI/xlm-roberta-large) model trained on British patent class information for 1855-1883, made available [here](http://walkerhanlon.com/data_resources/british_patent_classification_database.zip).

It has been trained to recognize the 146 patent classes defined by the British Patent Office, which are listed [here](https://huggingface.co/matthewleechen/multiclass-classifier-patents/edit/main/BPO_classes.csv).

We take the original xlm-roberta-large [weights](https://huggingface.co/FacebookAI/xlm-roberta-large/blob/main/pytorch_model.bin) and fine-tune on our custom dataset for 10 epochs with a learning rate of 2e-05 and a batch size of 64.
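
For reference, the snippet below sketches a `TrainingArguments` configuration matching these hyperparameters; the output directory is a placeholder and any option not stated in this card (such as the per-device batch split) is an assumption, not the authors' exact setup.

```python
from transformers import TrainingArguments

# illustrative configuration only; output_dir is a placeholder and
# unstated options are assumptions
training_args = TrainingArguments(
    output_dir="multilabel_patent_classifier",
    learning_rate=2e-5,
    per_device_train_batch_size=64,   # the card reports a batch size of 64
    per_device_eval_batch_size=64,
    num_train_epochs=10,
    lr_scheduler_type="linear",
    seed=42,
    fp16=True,                        # mixed precision (Native AMP)
)
```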

### Usage

This model can be used with the Hugging Face Transformers pipeline API for text classification:

```python
from transformers import pipeline, AutoModelForSequenceClassification, AutoTokenizer

model_name = "matthewleechen/multilabel_patent_classifier"

# load the fine-tuned multi-label classifier and its tokenizer
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

pipe = pipeline(
    task="text-classification",
    model=model,
    tokenizer=tokenizer,
    device=0,     # GPU 0; set device=-1 to run on CPU
    top_k=None,   # return scores for every class (replaces the deprecated return_all_scores=True)
)
```
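
A minimal usage sketch follows; the patent title below is an invented example rather than a row from the dataset, and the 0.5 cut-off mirrors the threshold described under Training Procedure.

```python
# hypothetical input title, for illustration only
titles = ["Improvements in apparatus for spinning cotton"]

# with top_k=None the pipeline returns one list of {"label", "score"} dicts per input
results = pipe(titles)

# keep every class whose score clears the 0.5 threshold
predicted = [entry["label"] for entry in results[0] if entry["score"] > 0.5]
print(predicted)
```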

### Training Data

Our training data consists of patent titles, each labelled with a binary (0/1) tag for every patent class. Labels were generated by the British Patent Office between 1855-1883, and our patent titles were extracted from the front pages of our specification texts using a patent title NER [model](https://huggingface.co/matthewleechen/patent_titles_ner).
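
Purely for illustration (the title, field names, and class index below are invented, not rows from the dataset), each training example pairs a title string with a 146-dimensional 0/1 label vector:

```python
NUM_CLASSES = 146  # number of British Patent Office classes

# hypothetical example of the multi-label format described above
example = {
    "text": "Improvements in apparatus for spinning cotton",  # invented title
    "labels": [0.0] * NUM_CLASSES,
}
example["labels"][12] = 1.0  # arbitrary class index marked positive, for illustration only
```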

### Training Procedure

We follow the standard multi-label classification setup with the HuggingFace Trainer API, but replace the default `BCEWithLogitsLoss` with a [focal loss](https://arxiv.org/pdf/1708.02002) (α=1, γ=2) to address class imbalance. Both during evaluation and at inference, we apply a sigmoid to each logit and use a 0.5 threshold to determine the positive labels for each class.
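
As a rough sketch of this setup (not the authors' actual training code; the class name, argument names, and defaults are assumptions), one way to swap the loss is to subclass `Trainer` and override `compute_loss` with a binary focal loss:

```python
import torch
import torch.nn.functional as F
from transformers import Trainer


class FocalLossTrainer(Trainer):
    """Illustrative Trainer that replaces BCEWithLogitsLoss with a binary focal loss (alpha=1, gamma=2)."""

    def __init__(self, *args, alpha=1.0, gamma=2.0, **kwargs):
        super().__init__(*args, **kwargs)
        self.alpha = alpha
        self.gamma = gamma

    def compute_loss(self, model, inputs, return_outputs=False, **kwargs):
        labels = inputs.pop("labels").float()
        outputs = model(**inputs)
        logits = outputs.logits

        # element-wise binary cross-entropy, one term per (example, class) pair
        bce = F.binary_cross_entropy_with_logits(logits, labels, reduction="none")
        # p_t is the model's probability for the true outcome of each label
        p_t = torch.exp(-bce)
        # the (1 - p_t)^gamma factor down-weights easy, well-classified labels
        loss = (self.alpha * (1.0 - p_t) ** self.gamma * bce).mean()

        return (loss, outputs) if return_outputs else loss
```

At evaluation and inference time, predictions then follow the thresholding described above: `torch.sigmoid(logits) > 0.5` per class.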

### Evaluation

We compute precision, recall, and F1 for each class (with a 0.5 sigmoid threshold), as well as exact match (scored only when the ground-truth and predicted class sets are identical) and any match (scored when there is any overlap between ground-truth and predicted classes) percentages.
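
A minimal sketch of how the two aggregate match metrics can be computed from binary prediction matrices (the function name and array layout are our own, not the card's evaluation code):

```python
import numpy as np

def match_rates(y_true: np.ndarray, y_pred: np.ndarray):
    """y_true, y_pred: (n_samples, n_classes) arrays of 0/1 labels."""
    exact = (y_true == y_pred).all(axis=1).mean()            # predicted set identical to ground truth
    any_match = ((y_true * y_pred).sum(axis=1) > 0).mean()   # at least one class in common
    return float(exact), float(any_match)
```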

These scores are aggregated for the test set below.

<table>
  <thead>
    <tr>
      <th>Metric Type</th>
      <th>Precision (Micro)</th>
      <th>Recall (Micro)</th>
      <th>F1 (Micro)</th>
      <th>Exact Match</th>
      <th>Any Match</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>Micro Average</td>
      <td>83.4%</td>
      <td>60.3%</td>
      <td>70.0%</td>
      <td>52.9%</td>
      <td>90.8%</td>
    </tr>
  </tbody>
</table>

## References

```bibtex
@misc{hanlon2016,
  title  = {{British Patent Technology Classification Database: 1855–1882}},
  author = {Hanlon, Walker},
  year   = {2016},
  url    = {http://www.econ.ucla.edu/whanlon/}
}

@misc{lin2018focallossdenseobject,
  title         = {Focal Loss for Dense Object Detection},
  author        = {Tsung-Yi Lin and Priya Goyal and Ross Girshick and Kaiming He and Piotr Dollár},
  year          = {2018},
  eprint        = {1708.02002},
  archivePrefix = {arXiv},
  primaryClass  = {cs.CV},
  url           = {https://arxiv.org/abs/1708.02002}
}
```

## Citation

If you use our model in your research, please cite our accompanying paper as follows:

```bibtex
@article{bct2025,
  title   = {300 Years of British Patents},
  author  = {Enrico Berkes and Matthew Lee Chen and Matteo Tranchero},
  journal = {arXiv preprint arXiv:2401.12345},
  year    = {2025},
  url     = {https://arxiv.org/abs/2401.12345}
}
```