Writeup: https://dudeperf3ct.github.io/projects/train_llm_part2/
Repo: https://github.com/dudeperf3ct/minicode-llm/tree/main/codellm_pretrain/torch_titan
This repository contains checkpoints saved every 5k steps during a pretraining run over 9.8B tokens of tokyotech-llm/swallow-code-v2.
The linked repo contains detailed steps on how to run evaluation using PyTorch DCP checkpoints.
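For readers who want to inspect a checkpoint outside the training setup, the following is a minimal sketch, not the procedure documented in the repo. It assumes the checkpoints are stored in torchtitan-style `step-<N>` folders (an assumption about the layout) and uses PyTorch's `dcp_to_torch_save` utility to consolidate one sharded DCP checkpoint into a single file. The helper names `list_step_checkpoints` and `convert_to_torch_save` are hypothetical.

```python
import re
from pathlib import Path

# Assumed folder naming (torchtitan-style): step-5000, step-10000, ...
STEP_DIR = re.compile(r"^step-(\d+)$")


def list_step_checkpoints(root):
    """Return (step, path) pairs for step-N subfolders of root, sorted by step."""
    found = []
    for child in Path(root).iterdir():
        match = STEP_DIR.match(child.name)
        if child.is_dir() and match:
            found.append((int(match.group(1)), child))
    return sorted(found, key=lambda pair: pair[0])


def convert_to_torch_save(dcp_dir, out_file):
    """Consolidate a sharded DCP checkpoint into a single torch.save file."""
    # Imported lazily so that listing checkpoints works without PyTorch installed.
    from torch.distributed.checkpoint.format_utils import dcp_to_torch_save

    dcp_to_torch_save(str(dcp_dir), str(out_file))
```

A typical use would be to call `list_step_checkpoints` on the downloaded checkpoint folder, pick the last entry, and pass its path to `convert_to_torch_save`. The resulting file can then be read with `torch.load`. The keys inside the state dict depend on how the training run saved it, so check the repo's evaluation instructions before relying on a particular structure.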
Base model: dudeperf3ct/codellm-tokenizer