can we have some more details?

#1
by Tom-Neverwinter - opened

What are the changes? Would love to hear more about this version.

Hmm sorry, I don't have exact details for this release. Let me see...

  • Finetuned with more care in retaining intelligence
  • Removed examples written after 2022 (Filtered out 30% of the dataset)
  • Removed examples containing slop like "shivers" (Filtered out 20% afterwards) (I overlooked "chills" so you might see a lot of it)
  • Removed examples shorter than 512 tokens (Filtered out another 20%)
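
For anyone who wants to try a similar cleanup, the three filters above could be sketched roughly as follows. This is only an illustration, not the actual script used for this release: the `Example` record, the `written` date field, and the whitespace-based `count_tokens` are all assumptions (a real run would use the model's own tokenizer), and the retention figure assumes each percentage applies to what was left after the previous step.

```python
from dataclasses import dataclass
from datetime import date

CUTOFF_YEAR = 2022          # examples written after this year are dropped
SLOP_TERMS = ("shivers",)   # the post names only "shivers"; "chills" was overlooked
MIN_TOKENS = 512            # examples shorter than this are dropped


@dataclass
class Example:
    """Hypothetical record layout; the real dataset schema is not given."""
    text: str
    written: date


def count_tokens(text: str) -> int:
    # Assumption: whitespace splitting stands in for the real tokenizer.
    return len(text.split())


def keep(ex: Example) -> bool:
    """Return True if the example survives all three filters."""
    if ex.written.year > CUTOFF_YEAR:
        return False
    lowered = ex.text.lower()
    if any(term in lowered for term in SLOP_TERMS):
        return False
    return count_tokens(ex.text) >= MIN_TOKENS


def filter_dataset(examples):
    return [ex for ex in examples if keep(ex)]


def cumulative_retention(fractions_removed):
    """Fraction of the dataset left if each cut applies to the remainder."""
    kept = 1.0
    for removed in fractions_removed:
        kept *= 1.0 - removed
    return kept


if __name__ == "__main__":
    # 30%, then 20%, then another 20% removed, read as sequential cuts
    print(f"{cumulative_retention([0.30, 0.20, 0.20]):.1%} of the data remains")
```

Read that way, a bit under half of the original examples would survive, which fits the remark about the pruning being major. Adding "chills" to `SLOP_TERMS` would cover the term that slipped through.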

I'm pretty sure the major pruning skewed the genre stats, but I haven't checked.

Can this model be improved by extending its maximum context to 16-32?