Appreciate the model drop!

#6
by Nitral-AI - opened

But why is it only 4k? It's 2025, man, those are rookie numbers.

Language Technologies Unit @ Barcelona Supercomputing Center org
edited 1 day ago

We understand the demand for longer context windows, and our roadmap includes several possible approaches to increasing it. Extending the context length involves trade-offs in training efficiency, memory usage, and model performance, so we are working out how to do it as efficiently as possible.

If you need a longer context right now, consider using our instructed Salamandra-7b; it might be more suitable for your use case.
