Difference between RoBERTa-base and RoBERTa?
Hi all,
I'm currently conducting some NLP research and I'm trying to understand the difference between RoBERTa (https://huggingface.co/docs/transformers/model_doc/roberta) and RoBERTa-base (https://huggingface.co/roberta-base).
I've read several pages online, but it's still not very clear. It seems as though RoBERTa-base is just a RoBERTa model with the default configuration?
Could someone advise please?
Thanks!
Hi!
RoBERTa (https://huggingface.co/docs/transformers/model_doc/roberta) is the architecture, while RoBERTa-base (https://huggingface.co/roberta-base) is one particular checkpoint using this architecture.
An architecture plus a checkpoint constitutes a "model" (the term "model" is a bit ambiguous).
A good doc for this is in the course: https://huggingface.co/course/chapter1/4?fw=pt#architectures-vs-checkpoints
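To make the distinction concrete, here is a small sketch using the transformers library: instantiating the RoBERTa architecture from its default configuration gives you randomly initialized weights, while loading the roberta-base checkpoint gives you the same architecture with pretrained weights.

```python
from transformers import RobertaConfig, RobertaModel

# Architecture only: build RoBERTa from its default configuration.
# The weights are randomly initialized, so this model is untrained.
config = RobertaConfig()
model = RobertaModel(config)
print(config.hidden_size, config.num_hidden_layers)  # 768 12

# Architecture + checkpoint: the same architecture, but with the
# pretrained weights of the "roberta-base" checkpoint loaded in.
pretrained = RobertaModel.from_pretrained("roberta-base")
```

Note that roberta-base happens to match the default RobertaConfig (12 layers, hidden size 768), which is why the two look so similar in the docs; the difference is whether the weights are pretrained or random.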
Hope this helps!