Best way to run on Apple Silicon
Are you planning to add Apple Silicon support?
Are there any workarounds to be able to run it on Apple Silicon in the short term?
While AI21 does not have any near-term plans to add Apple Silicon support, there are folks in the community working towards that: https://github.com/ml-explore/mlx
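For anyone who wants to experiment, the mlx-lm package built on top of MLX exposes a simple load/generate API. Whether Jamba can actually be converted and run this way is the open question, and the model id below is just an illustrative example, not a Jamba build:

```python
# Illustrative mlx-lm usage on Apple Silicon (not Jamba-specific --
# Jamba support in MLX is exactly what's still being worked on).
from mlx_lm import load, generate

# Any MLX-converted model from the mlx-community org works here;
# this model id is only an example.
model, tokenizer = load("mlx-community/Mistral-7B-Instruct-v0.3-4bit")

# Inference runs on the Apple GPU via MLX.
print(generate(model, tokenizer, prompt="Hello!", max_tokens=64))
```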
@AI21Nick Are there any good workarounds for me to run it locally at this time?
@hudsongouge while I'm not aware of any workarounds, HF Transformers can be used to load Jamba Mini. However, it will require more than 80GB of GPU memory (2x GPUs) because it runs in BF16: https://huggingface.co/ai21labs/AI21-Jamba-Mini-1.6#run-the-model-with-transformers
But if you scroll a bit further down the model card, where Transformers is mentioned, it explains how to run in 8-bit on a single 80GB GPU.
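For reference, the 8-bit load described there looks roughly like this. The `llm_int8_skip_modules` setting is my reading of the model card (it keeps the Mamba layers unquantized), so double-check the card itself:

```python
# Sketch of the 8-bit load per the Jamba Mini model card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "ai21labs/AI21-Jamba-Mini-1.6"

# 8-bit roughly halves the BF16 footprint, which is what lets the
# model fit on a single 80GB GPU instead of two.
quant_config = BitsAndBytesConfig(
    load_in_8bit=True,
    llm_int8_skip_modules=["mamba"],  # per the model card; verify there
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",                # place layers on available GPUs
    quantization_config=quant_config,
)
```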
I mention HF Transformers because there are methods out there to load it on Apple Silicon, such as this one: https://medium.com/@faizififita1/huggingface-installation-on-apple-silicon-2022-m1-pro-max-ultra-m2-9c449b9b4c14. But there is still no guarantee that Jamba can run this way, and it would still need at least 80GB of memory just to run in 8-bit.
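If Transformers does install cleanly on Apple Silicon, the load itself would look something like the sketch below, targeting the MPS backend instead of CUDA. Note that bitsandbytes quantization is CUDA-centric, so on a Mac you'd likely be loading full-precision weights; treat this as a sketch, not a confirmed recipe:

```python
# Hypothetical Apple Silicon load via PyTorch's MPS backend.
# Whether all of Jamba's ops are implemented on MPS is the open question.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ai21labs/AI21-Jamba-Mini-1.6"
device = "mps" if torch.backends.mps.is_available() else "cpu"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # fp16; bf16 support on MPS varies by PyTorch version
).to(device)
```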
@AI21Nick Thanks for the help. I have 36GB and can generally run models of this size at 4-bit. I'll have to see if I can find a workaround to run it.
@AI21Nick Transformers works and runs great on Apple Silicon. However, certain models such as Jamba have dependencies and features that are not yet supported on certain platforms.
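For example, the MPS backend itself can be present and working, and PyTorch can even route unsupported ops to the CPU, but that still doesn't cover custom dependencies like Jamba's Mamba kernels. A quick way to check what you have (the env var is a standard PyTorch knob, not Jamba-specific):

```python
# Probe MPS support; PYTORCH_ENABLE_MPS_FALLBACK must be set before
# torch is imported so unsupported ops fall back to the CPU.
import os
os.environ["PYTORCH_ENABLE_MPS_FALLBACK"] = "1"

import torch
print(torch.backends.mps.is_built())      # PyTorch compiled with MPS support?
print(torch.backends.mps.is_available())  # Apple GPU usable at runtime?
```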
@hudsongouge you are correct; that's why I said it may be possible, but there are still some compatibility issues to work through. I attempted it recently but didn't have any luck.