crumb committed on
Commit adc8d42 · 1 Parent(s): 4c073a8

Update README.md

Files changed (1)
  1. README.md +0 -6
README.md CHANGED
@@ -65,12 +65,6 @@ Nearly every base model that isn't finetuned for a specific task was trained on
 
 ```
 
-"Instruct" models have these special tokens:
-
-```
-<prompt> your prompt goes here <output> the model outputs a result here.
-```
-
 Some applications where I can imagine these being useful are: warm-starting very small encoder-decoder models, fitting a new scaling law that takes into account smaller models, or having a "fuzzy wrapper" around an API. They could also be usable on their own (for classification or other tasks) when finetuned on more specific datasets. I don't expect the 3.3m models to be useful for any task whatsoever. Every model was trained on a single GPU, either an RTX2060, RTX3060, or T4.
 
 I'd, uh, appreciate help in evaluating all these models, probably with lm harness!!
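For context on the lines removed above: the instruct checkpoints wrap a request in `<prompt>` and `<output>` special tokens. Below is a minimal sketch of using that format, assuming the checkpoints load as standard Hugging Face causal LMs and the tokenizer already contains both tokens; the model id is a placeholder, not a real repository name.

```python
# Hedged sketch of the <prompt>/<output> format documented in the removed lines.
# Assumptions: the checkpoint loads via transformers' Auto classes and its
# tokenizer already includes the special tokens; the model id is a placeholder.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "your-username/your-instruct-model"  # placeholder, not a real repo
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Wrap the request in the special tokens and let the model continue after <output>.
text = "<prompt> summarize: the quick brown fox jumps over the lazy dog <output>"
inputs = tokenizer(text, return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(out[0], skip_special_tokens=False))
```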
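On the evaluation request: a hedged sketch of one way to run EleutherAI's lm-evaluation-harness (the "lm harness" mentioned above) against one of these checkpoints, assuming it loads as a standard Hugging Face causal LM and that lm-eval >= 0.4 is installed; the model id and task list are placeholders, not recommendations.

```python
# Hedged sketch: evaluating one checkpoint with lm-evaluation-harness
# (pip install lm-eval). Assumes the checkpoint is a standard HF causal LM;
# the model id and tasks below are placeholders.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=your-username/your-model",  # placeholder id
    tasks=["lambada_openai", "hellaswag"],
    batch_size=8,
    device="cuda:0",
)
print(results["results"])  # per-task metrics
```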