make xformers an optional dependency

#6
by NyxKrage - opened

This adapts the LlamaMLP from the Llama modeling code in transformers so that it splits the fused w12 weight during the forward pass, and uses it as a fallback when xformers is not available on the system.

This enables the model to be used on macOS, for example.
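
For readers who want a picture of the approach, here is a minimal sketch of what such a fallback could look like. It is not the code from this PR. The names `FallbackMLP`, `build_mlp`, and `HAS_XFORMERS` are hypothetical, and the sketch assumes that `w12` stacks the gate projection on top of the up projection along the output dimension, that `w3` is the down projection, and that no biases are used.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Optional dependency: fall back to plain PyTorch if xformers is missing.
try:
    from xformers.ops import SwiGLU
    HAS_XFORMERS = True
except ImportError:
    SwiGLU = None
    HAS_XFORMERS = False


class FallbackMLP(nn.Module):
    """Hypothetical SwiGLU MLP that keeps the fused w12 / w3 parameter names."""

    def __init__(self, hidden_size: int, intermediate_size: int):
        super().__init__()
        # Assumption: w12 holds the gate and up projections stacked row-wise.
        self.w12 = nn.Linear(hidden_size, 2 * intermediate_size, bias=False)
        self.w3 = nn.Linear(intermediate_size, hidden_size, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Split the fused weight into its two halves at forward time.
        w1, w2 = self.w12.weight.chunk(2, dim=0)
        gate = F.linear(x, w1)
        up = F.linear(x, w2)
        return self.w3(F.silu(gate) * up)


def build_mlp(hidden_size: int, intermediate_size: int) -> nn.Module:
    """Pick the xformers module when available, otherwise the fallback."""
    if HAS_XFORMERS:
        return SwiGLU(hidden_size, intermediate_size, hidden_size, bias=False)
    return FallbackMLP(hidden_size, intermediate_size)
```

Keeping the `w12` and `w3` parameter names in the fallback is what would let an existing checkpoint load without renaming keys, under the layout assumption stated above.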

Ready to merge
This branch is ready to get merged automatically.
