Why do small language models underperform? Studying Language Model Saturation via the Softmax Bottleneck Paper • 2404.07647 • Published Apr 11 • 4
Recent models: last 100 repos, sorted by creation date Collection The last 100 repos I have created. Sorted by creation date descending, so the most recently created repos appear at the top. • 121 items • Updated Jan 31 • 506