⚠Warning⚠ These are experimental weights. They may not reach practical performance.
To use these weights, you must also manually rewrite or replace the model file.
The model file is available here:
https://github.com/lucidrains/BS-RoFormer
BS-RoFormer has received its first architectural update in a while.
The 0.5.x update introduced a mechanism called "Value Residual Learning" (https://arxiv.org/abs/2410.17897).
The paper argues that this mechanism reduces the over-concentration of attention and further mitigates the vanishing gradient problem.
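
As a rough illustration of the idea, the sketch below mixes the value vectors of a later attention layer with the value vectors computed in the first layer before applying attention. This is a minimal pure-Python toy under stated assumptions, not the code used in BS-RoFormer: the function names, the single-head layout, and the fixed `mix` weight are all hypothetical, and the actual implementation may mix the values differently (for example with a learned weight).

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def attention(q, k, v):
    """Plain single-head scaled dot-product attention."""
    scale = math.sqrt(len(q[0]))
    k_t = [list(col) for col in zip(*k)]
    scores = matmul(q, k_t)
    weights = [softmax([s / scale for s in row]) for row in scores]
    return matmul(weights, v)


def mix_values(v_current, v_first, mix):
    """Blend this layer's values with the first layer's values.

    mix = 1.0 keeps only the current values, mix = 0.0 keeps only the
    first layer's values. The fixed scalar is an assumption of this sketch.
    """
    return [
        [mix * c + (1.0 - mix) * f for c, f in zip(row_c, row_f)]
        for row_c, row_f in zip(v_current, v_first)
    ]


def attention_with_value_residual(q, k, v, v_first=None, mix=0.5):
    """Attention with an optional value residual from the first layer.

    Returns the attention output and the values that were actually used.
    The first layer passes v_first=None and keeps the returned values so
    that later layers can reuse them.
    """
    if v_first is not None:
        v = mix_values(v, v_first, mix)
    return attention(q, k, v), v


# Toy usage: two tokens, two feature dimensions.
q = [[1.0, 0.0], [0.0, 1.0]]
k = [[1.0, 0.0], [0.0, 1.0]]
v_layer1 = [[1.0, 2.0], [3.0, 4.0]]
v_layer2 = [[0.0, 1.0], [1.0, 0.0]]

out1, first_values = attention_with_value_residual(q, k, v_layer1)
out2, _ = attention_with_value_residual(q, k, v_layer2, v_first=first_values)
```

In this sketch only the values are carried forward from the first layer. Queries and keys are still computed per layer, so the attention pattern itself is unchanged and only what gets averaged differs.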