Victor Hall

panopstor

AI & ML interests

data engineering, production infrastructure, product development

Recent Activity

Organizations

Panopstor LLC

panopstor's activity

replied to appliedml42's post about 2 months ago

Repeating some random samples of the original training data during training would be the typical answer, but unfortunately the original data is not readily available. Instead, you might pick a publicly available dataset that is likely to cover some of the same problem space and sample from it for X% of your training samples, where X might be anywhere from 5 to 50. Even a small rate may have a significant positive effect.

Your problem is "forgetting", and it is expected when training at a small scale and also with parameter-efficient methods. Parameter-efficient methods like LoRA are probably best thought of as cost-effective specialization, as opposed to full unfrozen fine-tuning. Some loss of general performance is expected.

Nevertheless, repeating some of the original dataset, or at least some sort of instruction-following data, may still help you balance the trade-off between fine-tuning for your specific task (performance on rotten-tomatoes-related tasks) and forgetting (performance on IFEval).
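To make the mixing idea concrete, here is a rough sketch in plain Python. The function name `mix_with_replay`, its parameters, and the assumption that both datasets are simple Python lists are mine for illustration only; a real training pipeline would likely do this inside its own data loader.

```python
import random

def mix_with_replay(task_samples, replay_pool, replay_fraction=0.1, seed=0):
    """Return a shuffled list where roughly `replay_fraction` of the
    samples come from `replay_pool` (hypothetical helper, not a library API)."""
    if not 0.0 <= replay_fraction < 1.0:
        raise ValueError("replay_fraction must be in [0, 1)")
    rng = random.Random(seed)
    # Solve n_replay / (n_task + n_replay) = replay_fraction for n_replay.
    n_replay = round(len(task_samples) * replay_fraction / (1.0 - replay_fraction))
    if n_replay <= len(replay_pool):
        # Pool is large enough: draw without replacement.
        replay = rng.sample(replay_pool, n_replay)
    else:
        # Pool is too small: fall back to drawing with replacement.
        replay = rng.choices(replay_pool, k=n_replay)
    mixed = list(task_samples) + replay
    rng.shuffle(mixed)
    return mixed

# Toy stand-ins for the task data and a public instruction-following dataset.
task = [("task", i) for i in range(900)]
pool = [("replay", i) for i in range(5000)]
mixed = mix_with_replay(task, pool, replay_fraction=0.1)
```

With 900 task samples and a 10% replay rate this adds 100 replay samples, so all of the task data is kept and the replay data is layered on top. Sweeping `replay_fraction` over a few values in the 5-50% range and checking both task and IFEval scores would be one way to find a workable balance.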

Another option might be to create additional synthetic samples with a larger model (e.g., Claude, Llama 70B) based on a given rotten-tomatoes dataset sample, and mix those in as some percentage of your training set. For example, ask ChatGPT to craft an instruction-response pair given a rotten-tomatoes data sample. Even just a few hundred (a small percentage compared to the size of the rotten-tomatoes dataset) might help quite a bit.
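A minimal sketch of that synthetic-data step might look like the following. Everything here is an assumption for illustration: the prompt wording, the helper names, the `text`/`label` field names (with 1 meaning a positive review), and especially `generate`, which stands in for whatever client call you use to reach the larger model.

```python
def build_synthesis_prompt(review_text, label):
    """Build a prompt asking a larger model for one instruction-response pair.
    Assumes label 1 = positive and anything else = negative."""
    sentiment = "positive" if label == 1 else "negative"
    return (
        "Below is a movie review and its sentiment label.\n"
        f"Review: {review_text}\n"
        f"Sentiment: {sentiment}\n\n"
        "Write one instruction a user might give about this review, "
        "followed by a good response. Use the format:\n"
        "INSTRUCTION: <text>\nRESPONSE: <text>"
    )

def synthesize_pairs(samples, generate, max_pairs=300):
    """Call `generate(prompt) -> str` for up to `max_pairs` samples.
    `generate` is a hypothetical callable wrapping your model API of choice."""
    pairs = []
    for sample in samples[:max_pairs]:
        prompt = build_synthesis_prompt(sample["text"], sample["label"])
        pairs.append({"source": sample, "raw_output": generate(prompt)})
    return pairs

# Stub generator so the sketch runs without any external service.
def fake_generate(prompt):
    return "INSTRUCTION: Summarize the review.\nRESPONSE: It is favorable."

demo = [{"text": "a charming, funny film", "label": 1}]
pairs = synthesize_pairs(demo, fake_generate)
```

The raw outputs would still need parsing and a quick quality check before being mixed into the training set at whatever percentage you settle on.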

New activity in panopstor/auraflow_catreject 6 months ago
New activity in xtuner/llava-llama-3-8b-v1_1-transformers 9 months ago

Controlling length

#1 opened 9 months ago by panopstor
New activity in microsoft/kosmos-2-patch14-224 11 months ago