PSA: HF transformers implementation open sourced (with Trainer support)
Hi everyone,
First I'd like to thank the Sesame team for this amazing model.
I want to share with the community something I've been working on: a re-implementation of the model for HuggingFace transformers, fully compatible with Trainer.
It supports decoder training amortization as presented in the CSM blog post. It also supports generation.
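For anyone unfamiliar with the amortization trick: as I understand the CSM blog post, the audio decoder is trained on only a random 1/16 subset of the audio frames, while the zeroth codebook is trained on every frame. Below is a minimal sketch of that frame sampling in plain Python. The function name, its parameters, and the 160-frame clip are hypothetical and only for illustration; they are not taken from the csm-hf code, which may do this differently.

```python
import random


def sample_decoder_frames(num_frames, fraction=1 / 16, seed=None):
    """Pick a random subset of frame indices for the decoder loss.

    Hypothetical helper for illustration; not part of csm-hf.
    """
    rng = random.Random(seed)
    # Keep at least one frame so the decoder always gets a training signal.
    keep = max(1, round(num_frames * fraction))
    # Sample without replacement, then sort to preserve temporal order.
    return sorted(rng.sample(range(num_frames), keep))


# Hypothetical clip with 160 audio frames: 1/16 of them go to the decoder.
frames = sample_decoder_frames(160, seed=0)
print(len(frames))
```

In a real training step, the backbone loss would still cover every frame, and only the decoder loss would be restricted to the sampled indices.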
It is Apache 2.0 licensed, and the code can be found here: https://github.com/thomasgauthier/csm-hf
The converted pretrained model weights are hosted at https://huggingface.co/thomasgauthier/csm-1b-hf
Looking forward to seeing what the community will do with this.
This is great work. This is a true implementation of an HF model, @thomasgauthier! Will be recommending and using this. Thanks!
Hope that more people build on top of your framework :)