
Dependencies

Here we provide the package versions needed to run the FusOn-pLM code. The project was run in Docker containers. We provide the pip list output from inside each container, as well as the images used for the containers.

pip installs

The following dependencies were used for all training and benchmarking except the puncta benchmarks. Note that after cloning the repository, you will need to run pip install -e . from the repository root (outside the fuson_plm directory) to install the fuson_plm package.
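After the editable install, a quick check can confirm that the package is registered with pip and importable. The following is a minimal sketch, not part of the FusOn-pLM code; the helper name check_install is hypothetical, and the distribution name fuson-plm and module name fuson_plm are taken from the pip list in this README.

```python
import importlib.util
from importlib import metadata


def check_install(dist_name="fuson-plm", module_name="fuson_plm"):
    """Return (version, importable) for a package.

    version is None when pip has no record of the distribution;
    importable is False when Python cannot locate the module.
    """
    try:
        version = metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        version = None
    importable = importlib.util.find_spec(module_name) is not None
    return version, importable


if __name__ == "__main__":
    version, importable = check_install()
    print(f"fuson-plm version: {version}, importable: {importable}")
```

If the version is reported as None, the install was most likely run from the wrong directory.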

Package Version Editable project location


absl-py 1.4.0
aiohttp 3.8.4
aiosignal 1.3.1
apex 0.1
argon2-cffi 21.3.0
argon2-cffi-bindings 21.2.0
asttokens 2.2.1
astunparse 1.6.3
async-timeout 4.0.2
attrs 23.1.0
audioread 3.0.0
backcall 0.2.0
beautifulsoup4 4.12.2
bio 1.7.1
biopython 1.84
biothings-client 0.3.1
bleach 6.0.0
blis 0.7.10
cachetools 5.3.1
catalogue 2.0.9
certifi 2023.7.22
cffi 1.15.1
charset-normalizer 3.2.0
click 8.1.5
cloudpickle 2.2.1
cmake 3.27.1
comm 0.1.4
confection 0.1.1
contourpy 1.1.0
cubinlinker 0.3.0+2.g7c3675e
cuda-python 12.1.0rc5+1.g994d8d0
cudf 23.6.0
cugraph 23.6.0
cugraph-dgl 23.6.0
cugraph-service-client 23.6.0
cugraph-service-server 23.6.0
cuml 23.6.0
cupy-cuda12x 12.1.0
cycler 0.11.0
cymem 2.0.7
Cython 3.0.0
dask 2023.3.2
dask-cuda 23.6.0
dask-cudf 23.6.0
debugpy 1.6.7
decorator 5.1.1
defusedxml 0.7.1
distributed 2023.3.2.1
dm-tree 0.1.8
docker-pycreds 0.4.0
einops 0.6.1
exceptiongroup 1.1.2
execnet 2.0.2
executing 1.2.0
expecttest 0.1.3
fair-esm 2.0.0
fastjsonschema 2.18.0
fastrlock 0.8.1
filelock 3.12.2
flash-attn 2.0.4
fonttools 4.42.0
frozenlist 1.4.0
fsspec 2023.6.0
fuson-plm 1.0 /workspace/FusOn-pLM
gast 0.5.4
gdown 5.2.0
gitdb 4.0.11
GitPython 3.1.43
google-auth 2.22.0
google-auth-oauthlib 0.4.6
gprofiler-official 1.0.0
graphsurgeon 0.4.6
grpcio 1.56.2
huggingface-hub 0.25.2
hypothesis 5.35.1
idna 3.4
importlib-metadata 6.8.0
iniconfig 2.0.0
intel-openmp 2021.4.0
ipykernel 6.25.0
ipython 8.14.0
ipython-genutils 0.2.0
jedi 0.19.0
Jinja2 3.1.2
joblib 1.3.1
json5 0.9.14
jsonschema 4.18.6
jsonschema-specifications 2023.7.1
jupyter_client 8.3.0
jupyter_core 5.3.1
jupyter-tensorboard 0.2.0
jupyterlab 2.3.2
jupyterlab-pygments 0.2.2
jupyterlab-server 1.2.0
jupytext 1.15.0
kiwisolver 1.4.4
langcodes 3.3.0
librosa 0.9.2
lightning-utilities 0.11.8
llvmlite 0.40.1
locket 1.0.0
Markdown 3.4.4
markdown-it-py 3.0.0
MarkupSafe 2.1.3
matplotlib 3.7.2
matplotlib-inline 0.1.6
mdit-py-plugins 0.4.0
mdurl 0.1.2
mistune 3.0.1
mkl 2021.1.1
mkl-devel 2021.1.1
mkl-include 2021.1.1
mock 5.1.0
mpmath 1.3.0
msgpack 1.0.5
multidict 6.0.4
murmurhash 1.0.9
mygene 3.2.2
nbclient 0.8.0
nbconvert 7.7.3
nbformat 5.9.2
nest-asyncio 1.5.7
networkx 2.6.3
ninja 1.11.1
notebook 6.4.10
numba 0.57.1+1.gc785c8f1f
numpy 1.22.2
nvidia-cublas-cu12 12.4.5.8
nvidia-cuda-cupti-cu12 12.4.127
nvidia-cuda-nvrtc-cu12 12.4.127
nvidia-cuda-runtime-cu12 12.4.127
nvidia-cudnn-cu12 9.1.0.70
nvidia-cufft-cu12 11.2.1.3
nvidia-curand-cu12 10.3.5.147
nvidia-cusolver-cu12 11.6.1.9
nvidia-cusparse-cu12 12.3.1.170
nvidia-dali-cuda120 1.28.0
nvidia-nccl-cu12 2.21.5
nvidia-nvjitlink-cu12 12.4.127
nvidia-nvtx-cu12 12.4.127
nvidia-pyindex 1.0.9
nvtx 0.2.5
oauthlib 3.2.2
onnx 1.14.0
opencv 4.7.0
packaging 23.1
pandas 1.5.2
pandocfilters 1.5.0
parso 0.8.3
partd 1.4.0
pathy 0.10.2
pexpect 4.8.0
pickleshare 0.7.5
Pillow 9.2.0
pip 23.2.1
platformdirs 3.10.0
pluggy 1.2.0
ply 3.11
polygraphy 0.47.1
pooch 1.7.0
preshed 3.0.8
prettytable 3.8.0
prometheus-client 0.17.1
prompt-toolkit 3.0.39
protobuf 4.21.12
psutil 5.9.4
ptxcompiler 0.8.1+1.g4a94326
ptyprocess 0.7.0
pure-eval 0.2.2
py3Dmol 2.4.0
pyarrow 11.0.0
pyasn1 0.5.0
pyasn1-modules 0.3.0
pybind11 2.11.1
pycocotools 2.0+nv0.7.3
pycparser 2.21
pydantic 1.10.12
Pygments 2.16.1
pylibcugraph 23.6.0
pylibcugraphops 23.6.0
pylibraft 23.6.0
pynndescent 0.5.13
pynvml 11.4.1
pyparsing 3.0.9
PySocks 1.7.1
pytest 7.4.0
pytest-flakefinder 1.1.0
pytest-rerunfailures 12.0
pytest-shard 0.1.2
pytest-xdist 3.3.1
python-dateutil 2.8.2
python-hostlist 1.23.0
pytorch-lightning 2.4.0
pytorch-quantization 2.1.2
pytz 2023.3
PyYAML 6.0.1
pyzmq 25.1.0
raft-dask 23.6.0
referencing 0.30.2
regex 2023.6.3
requests 2.31.0
requests-oauthlib 1.3.1
resampy 0.4.2
rmm 23.6.0
rpds-py 0.9.2
rsa 4.9
safetensors 0.4.5
scikit-learn 1.2.0
scipy 1.11.1
seaborn 0.13.2
Send2Trash 1.8.2
sentencepiece 0.2.0
sentry-sdk 2.16.0
setproctitle 1.3.3
setuptools 68.0.0
six 1.16.0
smart-open 6.3.0
smmap 5.0.1
sortedcontainers 2.4.0
soundfile 0.12.1
soupsieve 2.4.1
spacy 3.6.0
spacy-legacy 3.0.12
spacy-loggers 1.0.4
sphinx-glpi-theme 0.3
srsly 2.4.7
stack-data 0.6.2
sympy 1.13.1
tabulate 0.9.0
tbb 2021.10.0
tblib 2.0.0
tensorboard 2.9.0
tensorboard-data-server 0.6.1
tensorboard-plugin-wit 1.8.1
tensorrt 8.6.1
terminado 0.17.1
thinc 8.1.10
threadpoolctl 3.2.0
thriftpy2 0.4.16
tinycss2 1.2.1
tokenizers 0.20.1
toml 0.10.2
tomli 2.0.1
toolz 0.12.0
torch 2.5.0
torch-tensorrt 2.0.0.dev0
torchdata 0.7.0a0
torchmetrics 1.5.0
torchtext 0.16.0a0
torchvision 0.16.0a0
tornado 6.3.2
tqdm 4.65.0
traitlets 5.9.0
transformer-engine 0.11.0+3f01b4f
transformers 4.45.2
treelite 3.2.0
treelite-runtime 3.2.0
triton 3.1.0
typer 0.9.0
types-dataclasses 0.6.6
typing_extensions 4.12.2
ucx-py 0.32.0
uff 0.6.9
umap-learn 0.5.6
urllib3 1.26.16
wandb 0.18.3
wasabi 1.1.2
wcwidth 0.2.6
webencodings 0.5.1
Werkzeug 2.3.6
wheel 0.41.1
xdoctest 1.0.2
xgboost 1.7.5
yarl 1.9.2
zict 3.0.0
zipp 3.16.2

The following packages and versions were used for the puncta benchmarks. A different environment was required to run ProtT5.

Package Version Editable project location


absl-py 2.1.0
aiohttp 3.9.3
aiosignal 1.3.1
annotated-types 0.6.0
anyio 4.8.0
apex 0.1
argon2-cffi 23.1.0
argon2-cffi-bindings 21.2.0
asttokens 2.4.1
astunparse 1.6.3
async-timeout 4.0.3
attrs 23.2.0
audioread 3.0.1
beautifulsoup4 4.12.3
bio 1.7.1
biopython 1.85
biothings_client 0.4.1
bleach 6.1.0
blis 0.7.11
cachetools 5.3.3
catalogue 2.0.10
certifi 2024.2.2
cffi 1.16.0
charset-normalizer 3.3.2
click 8.1.7
cloudpathlib 0.16.0
cloudpickle 3.0.0
cmake 3.29.0.1
comm 0.2.2
confection 0.1.4
contourpy 1.2.1
cuda-python 12.4.0rc7+3.ge75c8a9.dirty
cudf 24.2.0
cudnn 1.1.2
cugraph 24.2.0
cugraph-dgl 24.2.0
cugraph-service-client 24.2.0
cugraph-service-server 24.2.0
cuml 24.2.0
cupy-cuda12x 13.0.0
cycler 0.12.1
cymem 2.0.8
Cython 3.0.10
dask 2024.1.1
dask-cuda 24.2.0
dask-cudf 24.2.0
debugpy 1.8.1
decorator 5.1.1
defusedxml 0.7.1
distributed 2024.1.1
dm-tree 0.1.8
docker-pycreds 0.4.0
einops 0.7.0
exceptiongroup 1.2.0
execnet 2.0.2
executing 2.0.1
expecttest 0.1.3
fair-esm 2.0.0
fastjsonschema 2.19.1
fastrlock 0.8.2
filelock 3.13.3
flash-attn 2.4.2
fonttools 4.51.0
frozenlist 1.4.1
fsspec 2024.2.0
fuson-plm 1.0 /workspace/FusOn-pLM
gast 0.5.4
gdown 5.2.0
gitdb 4.0.12
GitPython 3.1.44
google-auth 2.29.0
google-auth-oauthlib 0.4.6
gprofiler-official 1.0.0
graphsurgeon 0.4.6
grpcio 1.62.1
h11 0.14.0
httpcore 1.0.7
httpx 0.28.1
huggingface-hub 0.27.1
hypothesis 5.35.1
idna 3.6
igraph 0.11.4
importlib_metadata 7.0.2
iniconfig 2.0.0
intel-openmp 2021.4.0
ipykernel 6.29.4
ipython 8.21.0
ipython-genutils 0.2.0
jedi 0.19.1
Jinja2 3.1.3
joblib 1.3.2
json5 0.9.24
jsonschema 4.21.1
jsonschema-specifications 2023.12.1
jupyter_client 8.6.1
jupyter_core 5.7.2
jupyter-tensorboard 0.2.0
jupyterlab 2.3.2
jupyterlab_pygments 0.3.0
jupyterlab-server 1.2.0
jupytext 1.16.1
kiwisolver 1.4.5
langcodes 3.3.0
lark 1.1.9
lazy_loader 0.4
librosa 0.10.1
lightning-thunder 0.1.0
lightning-utilities 0.11.2
llvmlite 0.42.0
locket 1.0.0
looseversion 1.3.0
Markdown 3.6
markdown-it-py 3.0.0
MarkupSafe 2.1.5
matplotlib 3.8.4
matplotlib-inline 0.1.6
mdit-py-plugins 0.4.0
mdurl 0.1.2
mistune 3.0.2
mkl 2021.1.1
mkl-devel 2021.1.1
mkl-include 2021.1.1
mock 5.1.0
mpmath 1.3.0
msgpack 1.0.8
multidict 6.0.5
murmurhash 1.0.10
mygene 3.2.2
nbclient 0.10.0
nbconvert 7.16.3
nbformat 5.10.4
nest-asyncio 1.6.0
networkx 2.6.3
ninja 1.11.1.1
notebook 6.4.10
numba 0.59.0+1.g20ae2b56c
numpy 1.24.4
nvfuser 0.1.6a0+a684e2a
nvidia-dali-cuda120 1.36.0
nvidia-nvimgcodec-cu12 0.2.0.7
nvidia-pyindex 1.0.9
nvtx 0.2.5
oauthlib 3.2.2
onnx 1.16.0
opencv 4.7.0
opt-einsum 3.3.0
optree 0.11.0
packaging 23.2
pandas 1.5.3
pandocfilters 1.5.1
parso 0.8.4
partd 1.4.1
pexpect 4.9.0
pillow 10.2.0
pip 24.0
platformdirs 4.2.0
pluggy 1.4.0
ply 3.11
polygraphy 0.49.8
pooch 1.8.1
preshed 3.0.9
prettytable 3.10.0
prometheus_client 0.20.0
prompt-toolkit 3.0.43
protobuf 4.24.4
psutil 5.9.4
ptyprocess 0.7.0
pure-eval 0.2.2
py3Dmol 2.4.2
pyarrow 14.0.1
pyasn1 0.6.0
pyasn1_modules 0.4.0
pybind11 2.12.0
pybind11_global 2.12.0
pycocotools 2.0+nv0.8.0
pycparser 2.22
pydantic 2.6.4
pydantic_core 2.16.3
Pygments 2.17.2
pylibcugraph 24.2.0
pylibcugraphops 24.2.0
pylibraft 24.2.0
pynndescent 0.5.13
pynvjitlink 0.1.13
pynvml 11.4.1
pyparsing 3.1.2
PySocks 1.7.1
pytest 8.1.1
pytest-flakefinder 1.1.0
pytest-rerunfailures 14.0
pytest-shard 0.1.2
pytest-xdist 3.5.0
python-dateutil 2.9.0.post0
python-hostlist 1.23.0
pytorch-lightning 2.5.0.post0
pytorch-quantization 2.1.2
pytorch-triton 3.0.0+a9bc1a364
pytz 2024.1
PyYAML 6.0.1
pyzmq 25.1.2
raft-dask 24.2.0
rapids-dask-dependency 24.2.0a0
referencing 0.34.0
regex 2023.12.25
requests 2.31.0
requests-oauthlib 2.0.0
rich 13.7.1
rmm 24.2.0
rpds-py 0.18.0
rsa 4.9
safetensors 0.5.2
scikit-learn 1.2.0
scipy 1.12.0
seaborn 0.13.2
Send2Trash 1.8.2
sentencepiece 0.2.0
sentry-sdk 2.20.0
setproctitle 1.3.4
setuptools 68.2.2
six 1.16.0
smart-open 6.4.0
smmap 5.0.2
sniffio 1.3.1
sortedcontainers 2.4.0
soundfile 0.12.1
soupsieve 2.5
soxr 0.3.7
spacy 3.7.4
spacy-legacy 3.0.12
spacy-loggers 1.0.5
sphinx_glpi_theme 0.6
srsly 2.4.8
stack-data 0.6.3
sympy 1.12
tabulate 0.9.0
tbb 2021.12.0
tblib 3.0.0
tensorboard 2.9.0
tensorboard-data-server 0.6.1
tensorboard-plugin-wit 1.8.1
tensorrt 8.6.3
terminado 0.18.1
texttable 1.7.0
thinc 8.2.3
threadpoolctl 3.3.0
thriftpy2 0.4.17
tinycss2 1.2.1
tokenizers 0.21.0
toml 0.10.2
tomli 2.0.1
toolz 0.12.1
torch 2.3.0a0+6ddf5cf85e.nv24.4
torch-tensorrt 2.3.0a0
torchdata 0.7.1a0
torchmetrics 1.6.1
torchtext 0.17.0a0
torchvision 0.18.0a0
tornado 6.4
tqdm 4.66.2
traitlets 5.9.0
transformer-engine 1.5.0+6a9edc3
transformers 4.48.0
treelite 4.0.0
typer 0.9.4
types-dataclasses 0.6.6
typing_extensions 4.10.0
ucx-py 0.36.0
uff 0.6.9
umap-learn 0.5.7
urllib3 1.26.18
wandb 0.19.4
wasabi 1.1.2
wcwidth 0.2.13
weasel 0.3.4
webencodings 0.5.1
Werkzeug 3.0.2
wheel 0.43.0
xdoctest 1.0.2
xgboost 1.7.5
yarl 1.9.4
zict 3.0.0
zipp 3.17.0
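To compare a local environment against either list above, the pinned versions can be checked against what is installed. The following is a minimal sketch, not part of the FusOn-pLM code. It assumes the list is available as pip list-style text with one package per line; the function names parse_pip_list and find_mismatches are hypothetical, and the sample text is a short excerpt of the first list.

```python
import re
from importlib import metadata


def normalize(name):
    # pip treats "-", "_" and "." in project names as equivalent, ignoring case.
    return re.sub(r"[-_.]+", "-", name).lower()


def parse_pip_list(text):
    """Parse lines of the form "name version [location]" into {name: version}."""
    pinned = {}
    for line in text.splitlines():
        parts = line.split()
        # Skip blank lines, the header row, and a dashed separator row.
        if len(parts) < 2 or parts[0] == "Package" or set(parts[0]) == {"-"}:
            continue
        pinned[parts[0]] = parts[1]
    return pinned


def find_mismatches(pinned, installed=None):
    """Return {name: (pinned_version, installed_version_or_None)} where they differ."""
    if installed is None:
        # Read the versions installed in the current environment.
        installed = {}
        for dist in metadata.distributions():
            name = dist.metadata["Name"]
            if name:
                installed[normalize(name)] = dist.version
    mismatches = {}
    for name, wanted in pinned.items():
        found = installed.get(normalize(name))
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


# Excerpt of the first list, used here only as sample input.
SAMPLE = """Package Version Editable project location
------- ------- -------------------------
fair-esm 2.0.0
fuson-plm 1.0 /workspace/FusOn-pLM
torch 2.5.0
"""

if __name__ == "__main__":
    for name, (wanted, found) in find_mismatches(parse_pip_list(SAMPLE)).items():
        print(f"{name}: listed {wanted}, installed {found}")
```

Many of the listed packages ship with the NVIDIA base images and carry local version tags (for example torch 2.3.0a0+6ddf5cf85e.nv24.4), so mismatches are expected outside the containers. The versions most likely to matter are those the FusOn-pLM code uses directly.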

Docker

The following image was used for Container 1 (all code except puncta benchmark):

nvcr.io/nvidia/pytorch:23.08-py3

The following image was used for Container 2 (puncta benchmark):

nvcr.io/nvidia/pytorch:24.04-py3
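This README does not give the exact commands used to start the containers. The following is a minimal sketch that builds a docker run command for either image. The flags (--gpus all, -it, --rm, -v, -w) and the host path are assumptions, and the mount target /workspace/FusOn-pLM is inferred from the editable project location in the pip lists above. The helper name docker_run_command is hypothetical.

```python
import shlex

# Image tags as listed in this README.
IMAGES = {
    1: "nvcr.io/nvidia/pytorch:23.08-py3",  # all code except the puncta benchmark
    2: "nvcr.io/nvidia/pytorch:24.04-py3",  # puncta benchmark
}

# Assumption: the repository is mounted where the pip lists show the editable install.
CONTAINER_REPO = "/workspace/FusOn-pLM"


def docker_run_command(container, host_repo):
    """Build a docker run argument list for Container 1 or 2.

    host_repo is the path to the cloned repository on the host (an assumption).
    """
    return [
        "docker", "run",
        "--gpus", "all",   # expose all GPUs to the container
        "-it", "--rm",     # interactive session, removed on exit
        "-v", f"{host_repo}:{CONTAINER_REPO}",
        "-w", CONTAINER_REPO,
        IMAGES[container],
    ]


if __name__ == "__main__":
    # Print the command rather than running it, so it can be reviewed first.
    print(shlex.join(docker_run_command(1, "/path/to/FusOn-pLM")))
```

The printed command can be pasted into a shell, or the list can be passed to subprocess.run once the flags have been adapted to the local setup.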