
Update README.md

#1
by katek - opened
Files changed (1)
  1. README.md +3 -4
README.md CHANGED
@@ -566,8 +566,7 @@ language:
 
 # Refact-1.6B
 
-Finally, the model we started training with our blog post
-[Applying Recent Innovations](https://refact.ai/blog/2023/applying-recent-innovations-to-train-model/) is ready 🎉
+Finally, the model we started training with our [blog post](https://refact.ai/blog/2023/applying-recent-innovations-to-train-model/) is ready 🎉
 
 After fine-tuning on generated data, it beats Replit 3b, Stability Code 3b and many other models. It almost beats
 StarCoder ten times the size!
@@ -614,7 +613,7 @@ Filtering is the key to success of this model:
 The text to code proportion was 50:50, model trained for 1.2T tokens.
 
 We don't release the base model, because its Fill-in-the-Middle (FIM) capability likes to repeat itself too much, so
-its practical use is limited. But if you still want it, write us a message on discord.
+its practical use is limited. But if you still want it, write us a message on Discord.
 
 
 # Finetuning
@@ -633,7 +632,7 @@ The former is likely finished, so the model tries to come up with a suggestion t
 You are likely to have half-written code as you work on it, there is no single addition that can repair it
 fully.
 
-In practice, model needs to have a tendency to stop after a couple of lines added, and sometimes don't write
+In practice, model needs to have a tendency to stop after a couple of lines are added, and sometimes don't write
 anything at all. We found that just giving it empty completions, single line completions, multiline
 completions that end with a smaller text indent or at least a newline -- makes it much more usable. This data
 was used as the rest 85% of the finetune dataset.
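
For reference, the Fill-in-the-Middle mode discussed in the second hunk is driven entirely by the prompt format. Below is a minimal sketch of querying the released checkpoint through `transformers`; the repo id, the `<fim_prefix>`/`<fim_suffix>`/`<fim_middle>` token names and the need for `trust_remote_code=True` are assumptions for illustration, not taken from the diff itself.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed repo id; replace with the actual checkpoint name if it differs.
checkpoint = "smallcloudai/Refact-1_6B-fim"

tokenizer = AutoTokenizer.from_pretrained(checkpoint, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(checkpoint, trust_remote_code=True)

# StarCoder-style FIM prompt (assumed token names): give the prefix and the
# suffix, and let the model generate the missing middle part.
prompt = (
    "<fim_prefix>def print_hello_world():\n    \"\"\"<fim_suffix>\n"
    "    print(\"Hello world!\")<fim_middle>"
)

inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=40, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=False))
```

Greedy decoding and a small `max_new_tokens` are used only to keep the sketch short and deterministic.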
 
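The completion shaping described in the last hunk (keep empty completions, single-line completions, and multiline completions that end at a smaller indent or at least at a newline) reads like a simple filter over sampled completions. A rough sketch under that reading follows; the helper names and the exact indent comparison are assumptions, not the actual data pipeline.

```python
def indent_of(line: str) -> int:
    # Count leading spaces/tabs; good enough for a heuristic.
    return len(line) - len(line.lstrip(" \t"))


def keep_completion(completion: str) -> bool:
    """Heuristic sketch of the 'usable completion' filter described above."""
    if completion == "":
        return True          # empty completion: teaches the model it may add nothing
    lines = completion.splitlines()
    if len(lines) == 1:
        return True          # single-line completion
    if completion.endswith("\n"):
        return True          # multiline, but ends cleanly at a newline
    # Multiline without a trailing newline: keep it only if the last line is
    # indented less than the first, i.e. the completion closes its block.
    return indent_of(lines[-1]) < indent_of(lines[0])


keep_completion("")                              # True: empty
keep_completion("return x")                      # True: single line
keep_completion("if x:\n    y()\nreturn y\n")    # True: ends at a newline
keep_completion("if x:\n    y()\n    z(")        # False: dangles at a deeper indent
```

The indent comparison is what encodes "stop when the block closes"; the other branches cover the empty and single-line cases from the text.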