Model Card: Assisting Mathematical Formalization with A Learning-based Premise Retriever

Model Description

This model is designed for premise retrieval in Lean, based on the state representation of Lean. The model follows the architecture described in the paper:

Assisting Mathematical Formalization with A Learning-based Premise Retriever

The model implementation and code are available at:

GitHub Repository

Try our model


Available Models

The repository provides several models for different tasks:

Pre-trained Models

Located in the Pretrain_Model folder:

  • 410_stable_random: Pre-trained model for retrieval fine-tuning.
  • 410_stable_random_1024: Pre-trained model for reranking fine-tuning.

Fine-tuned Models

Retrieval Fine-tuned Models

Located in the Finetune_Model folder:

  • Models fine-tuned on specific datasets for retrieval tasks.

Rerank Fine-tuned Models

Located in the Rerank folder:

  • Models fine-tuned on specific datasets for reranking tasks.

Citation

If you use this model, please cite the following paper:

@misc{tao2025assistingmathematicalformalizationlearningbased,
      title={Assisting Mathematical Formalization with A Learning-based Premise Retriever}, 
      author={Yicheng Tao and Haotian Liu and Shanwen Wang and Hongteng Xu},
      year={2025},
      eprint={2501.13959},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2501.13959}, 
}
Downloads last month

-

Downloads are not tracked for this model. How to track
Inference Providers NEW
This model is not currently available via any of the supported Inference Providers.
The model cannot be deployed to the HF Inference API: The HF Inference API does not support sentence-similarity models for transformers library.