Towards Best Practices for Open Datasets for LLM Training
Paper • 2501.08365
float16. However, there's some precision loss somewhere and generation doesn't work in float16 mode yet. I'm looking into this and will keep you posted! Or take a look at this issue if you'd like to help: https://github.com/huggingface/swift-transformers/issues/95

Mamba is now available in transformers. Thanks to @tridao and @albertgu for this brilliant model! 🚀 and the amazing mamba-ssm kernels powering this!