Appreciate the model drop!

#1
by Nitral-AI - opened

But why is it only 4k? It's 2024, man, those are rookie numbers.

IBM Granite org

Hi @Nitral-AI , thanks for your interest in these new Granite models. You're right that 4k is a short context window by today's standards. Long context is coming soon (expected to be ready by the end of the year).

Training longer context into the models is sequential: before we can train at longer context lengths, we first need to train at shorter ones. These 3.0 short-context models are already useful for a number of use cases, so we opted to release them as-is with 4k context so that users whose use case fits can start with them now. For long context (128k), keep an eye out for the upcoming 3.1 drop!

@gabegoodhart

Code version too, I hope?

keep an eye out for the upcoming 3.1 drop!

IBM Granite org

@Nurb4000 Yes! These models are already trained with code-centric capabilities, but due to their short context windows, we don't recommend moving off of granite-code (yet!).
