---
license: apache-2.0
pipeline_tag: tabular-regression
---

# TabPFNMix Regressor

TabPFNMix regressor is a tabular foundation model that is pre-trained on purely synthetic datasets sampled from a mix of random regressors.

## Architecture

TabPFNMix is based on a 12-layer encoder-decoder Transformer with 37M parameters. We use a pre-training strategy that incorporates in-context learning, similar to that used by TabPFN and TabForestPFN.

## Usage

To use the TabPFNMix regressor, install AutoGluon by running:

```sh
pip install autogluon
```

A minimal example showing how to perform fine-tuning and inference using the TabPFNMix regressor:

```python
import pandas as pd
from autogluon.tabular import TabularPredictor

if __name__ == '__main__':
    # Load the sample dataset and subsample the training split for speed
    train_data = pd.read_csv('https://autogluon.s3.amazonaws.com/datasets/Inc/train.csv')
    subsample_size = 5000
    if subsample_size is not None and subsample_size < len(train_data):
        train_data = train_data.sample(n=subsample_size, random_state=0)
    test_data = pd.read_csv('https://autogluon.s3.amazonaws.com/datasets/Inc/test.csv')

    # Point AutoGluon at the pre-trained TabPFNMix weights on the Hugging Face Hub,
    # fine-tuning a single ensemble member for up to 30 epochs
    tabpfnmix_default = {
        "model_path_classifier": "autogluon/tabpfn-mix-1.0-classifier",
        "model_path_regressor": "autogluon/tabpfn-mix-1.0-regressor",
        "n_ensembles": 1,
        "max_epochs": 30,
    }

    hyperparameters = {
        "TABPFNMIX": [
            tabpfnmix_default,
        ],
    }

    # Predict the numeric "age" column as a regression target
    label = "age"
    problem_type = "regression"

    predictor = TabularPredictor(
        label=label,
        problem_type=problem_type,
    )
    predictor = predictor.fit(
        train_data=train_data,
        hyperparameters=hyperparameters,
        verbosity=3,
    )

    # Evaluate the fitted predictor on the held-out test data
    predictor.leaderboard(test_data, display=True)
```

## Citation

If you find TabPFNMix useful for your research, please consider citing the associated papers:

```
@article{erickson2020autogluon,
  title={AutoGluon-Tabular: Robust and Accurate AutoML for Structured Data},
  author={Erickson, Nick and Mueller, Jonas and Shirkov, Alexander and Zhang, Hang and Larroy, Pedro and Li, Mu and Smola, Alexander},
  journal={arXiv preprint arXiv:2003.06505},
  year={2020}
}

@article{hollmann2022tabpfn,
  title={TabPFN: A Transformer That Solves Small Tabular Classification Problems in a Second},
  author={Hollmann, Noah and M{\"u}ller, Samuel and Eggensperger, Katharina and Hutter, Frank},
  journal={arXiv preprint arXiv:2207.01848},
  year={2022}
}

@article{breejen2024context,
  title={Why In-Context Learning Transformers are Tabular Data Classifiers},
  author={Breejen, Felix den and Bae, Sangmin and Cha, Stephen and Yun, Se-Young},
  journal={arXiv preprint arXiv:2405.13396},
  year={2024}
}
```

## License

This project is licensed under the Apache-2.0 License.
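As a rough sanity check on the parameter budget quoted in the Architecture section above, the sketch below counts the parameters of a generic 12-layer Transformer stack in PyTorch. The dimensions (`d_model=512`, `nhead=8`, `dim_feedforward=2048`) are assumptions chosen for illustration, not published TabPFNMix hyperparameters, and the actual model arranges its layers as an encoder-decoder rather than a plain encoder stack:

```python
# Illustrative only: parameter count of a generic 12-layer Transformer stack.
# The dimensions are assumptions, not the published TabPFNMix hyperparameters,
# and TabPFNMix itself is an encoder-decoder, not a plain encoder.
import torch.nn as nn

layer = nn.TransformerEncoderLayer(d_model=512, nhead=8, dim_feedforward=2048)
stack = nn.TransformerEncoder(layer, num_layers=12)

n_params = sum(p.numel() for p in stack.parameters())
print(f"{n_params / 1e6:.1f}M parameters")  # ~37.8M, close to the stated 37M
```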
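Once `fit` has completed in the usage example above, predictions and metrics for new data are available through the standard `TabularPredictor` API. A minimal sketch continuing that example (for regression, AutoGluon scores with root mean squared error by default):

```python
# Continuing the usage example above: the fitted predictor can score new rows.
# `predict` ignores the label column if it happens to be present.
predictions = predictor.predict(test_data)
print(predictions.head())

# Compute evaluation metrics on the held-out test data
scores = predictor.evaluate(test_data)
print(scores)
```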