Executing Arithmetic: Fine-Tuning Large Language Models as Turing Machines

This repository contains the models and datasets used in the paper "Executing Arithmetic: Fine-Tuning Large Language Models as Turing Machines".

Models

The ckpt folder contains 16 LoRA adapters that were fine-tuned for this research:

  • 6 Basic Executors
  • 3 Executor Composers
  • 7 Aligners

The base model used for fine-tuning all of the above is LLaMA 3.1-8B.

Datasets

The datasets used for evaluating all models can be found in the datasets/raw folder.

Usage

Please refer to the GitHub page for details.
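As a starting point, an adapter from the ckpt folder can be attached to the base model with the Hugging Face peft library. The sketch below is illustrative only: the base model identifier and the adapter directory name are assumptions, and the actual prompting format used by the executors is documented in the GitHub repository.

```python
def load_executor(base_model_id: str, adapter_path: str):
    """Load the Llama 3.1-8B base model and attach one LoRA adapter.

    Sketch only -- `base_model_id` and `adapter_path` are placeholders;
    substitute the actual adapter directory from the ckpt folder.
    Imports are deferred so this file can be read without the packages installed.
    """
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import PeftModel

    model = AutoModelForCausalLM.from_pretrained(base_model_id)
    # PeftModel.from_pretrained wraps the base model with the LoRA weights
    model = PeftModel.from_pretrained(model, adapter_path)
    tokenizer = AutoTokenizer.from_pretrained(base_model_id)
    return model, tokenizer


# Example (hypothetical paths):
# model, tokenizer = load_executor("meta-llama/Llama-3.1-8B", "ckpt/add_executor")
```

Which adapter to load depends on the task: basic executors handle single operations, composers chain them, and aligners map natural-language arithmetic questions into the executors' input format (see the paper and the GitHub page for the exact prompt conventions).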

Citation

If you use CAEF for your research, please cite our paper:

@misc{lai2024executing,
      title={Executing Arithmetic: Fine-Tuning Large Language Models as Turing Machines}, 
      author={Junyu Lai and Jiahe Xu and Yao Yang and Yunpeng Huang and Chun Cao and Jingwei Xu},
      year={2024},
      eprint={2410.07896},
      archivePrefix={arXiv},
      primaryClass={cs.AI},
      url={https://arxiv.org/abs/2410.07896}, 
}