PSA: HF transformers implementation open sourced (with Trainer support)

#39
by thomasgauthier - opened

Hi everyone,

First I'd like to thank the Sesame team for this amazing model.

I want to share with the community something I've been working on: a re-implementation of the model for HuggingFace transformers, fully compatible with Trainer.

It supports the decoder training amortization presented in the CSM blog post. It also supports generation.
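
For anyone unfamiliar with the amortization idea, here is a rough sketch in plain Python. It is not the code from the repo. It assumes the scheme works the way I understand the blog post: the zeroth codebook is trained on every frame, while the decoder loss is computed only on a random subset of frames. The 1/16 default, the function names `sample_decoder_frames` and `amortized_loss`, and summing the two loss terms are all assumptions made for illustration.

```python
import random


def sample_decoder_frames(num_frames, fraction=1 / 16, seed=None):
    """Pick the frame indices on which the decoder loss is computed.

    `fraction` is an assumed default, not a value read from the repo.
    """
    if num_frames <= 0:
        raise ValueError("num_frames must be positive")
    rng = random.Random(seed)
    # Always keep at least one frame so the decoder receives a gradient.
    k = max(1, round(num_frames * fraction))
    return sorted(rng.sample(range(num_frames), k))


def amortized_loss(backbone_losses, decoder_losses, frame_ids):
    """Combine a per-frame backbone loss with a subsampled decoder loss.

    backbone_losses: one value per frame (zeroth codebook, all frames).
    decoder_losses:  one value per frame (remaining codebooks).
    frame_ids:       indices returned by sample_decoder_frames.
    Adding the two means is an assumption for this sketch.
    """
    backbone = sum(backbone_losses) / len(backbone_losses)
    decoder = sum(decoder_losses[i] for i in frame_ids) / len(frame_ids)
    return backbone + decoder
```

In a real training loop the decoder forward pass would only be run on the sampled frames, which is where the memory and compute savings would come from. The sketch above only shows the selection and averaging logic.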

It is Apache 2.0 licensed. The code can be found here: https://github.com/thomasgauthier/csm-hf
The converted pretrained model weights are hosted at https://huggingface.co/thomasgauthier/csm-1b-hf

Looking forward to seeing what the community will do with this.

thomasgauthier changed discussion title from PSA: HF transformers implementation open sourced to PSA: HF transformers implementation open sourced (with Trainer support)

This is great work. This is a true implementation of an HF model, @thomasgauthier! Will be recommending and using this. Thanks!

Hope that more people build on top of your framework :)
