fla-hub/gsa-1.3B-100B
Tags: Text Generation, Safetensors, cerebras/SlimPajama-627B, English, fla, gsa
arXiv: 2409.07146
License: MIT
Model from the paper "Gated Slot Attention for Efficient Linear-Time Sequence Modeling".
Downloads last month: 27
Safetensors
Model size: 1.38B params
Tensor type: BF16
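As a rough illustration of what the listed size and tensor type imply, the sketch below estimates the storage needed for the raw weights. It assumes the standard width of 2 bytes per BF16 value; the function name `weight_bytes` and the lookup table are hypothetical helpers, and the estimate excludes activations, recurrent state, and framework overhead.

```python
# Bytes per parameter for common tensor types (standard widths).
BYTES_PER_PARAM = {"F32": 4, "BF16": 2, "F16": 2}


def weight_bytes(num_params: float, tensor_type: str) -> float:
    """Approximate storage for the raw weights only (hypothetical helper)."""
    return num_params * BYTES_PER_PARAM[tensor_type]


# 1.38B parameters stored as BF16, as listed on this page.
size = weight_bytes(1.38e9, "BF16")
print(f"{size / 1e9:.2f} GB, {size / 2**30:.2f} GiB")  # 2.76 GB, 2.57 GiB
```

Actual memory use at inference time will be higher than this figure.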
Inference: the serverless Inference API does not yet support fla models for the Text Generation pipeline type.
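Since hosted inference is unavailable, the model would have to be run locally. The following is a minimal sketch, not a documented recipe from this page: it assumes the `flash-linear-attention` package (imported as `fla`) is installed and registers its model classes with `transformers`, and that a CUDA GPU is available. The `generate` wrapper and its parameters are hypothetical.

```python
MODEL_ID = "fla-hub/gsa-1.3B-100B"


def generate(prompt: str, max_new_tokens: int = 32) -> str:
    """Load the checkpoint and continue `prompt` (hypothetical helper)."""
    # Imports are deferred so that defining this function needs no heavy dependencies.
    import torch
    import fla  # noqa: F401  (assumption: importing fla registers its models with transformers)
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    # The page lists BF16 as the tensor type, so load the weights in bfloat16.
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype=torch.bfloat16).cuda()
    inputs = tokenizer(prompt, return_tensors="pt").to("cuda")
    with torch.no_grad():
        output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate("Some prompt")` downloads the checkpoint on first use; the model is reloaded on every call, so a longer-lived script would want to cache the tokenizer and model.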
Dataset used to train fla-hub/gsa-1.3B-100B: cerebras/SlimPajama-627B (updated Jul 7, 2023)
Collection including fla-hub/gsa-1.3B-100B: GSA (3 items, updated Nov 10)