Task vector optimization checkpoint, ready for merging.

Trained on MFANN for 12,000 steps. However, due to a slightly higher training loss, I'm going to merge this model with the last version and retrain. The goal was to use DARE-TIES to reduce the parameters used per task vector. This model will now be merged with the last pre-DARE model using TIES alone, and will subsequently be retrained.
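For readers unfamiliar with the two merge methods named above, here is a minimal sketch of the ideas on toy flat weight lists. It is not the pipeline used for this checkpoint: the function names (`task_vector`, `dare`, `ties_trim`, `ties_merge`), the toy weights, and the `density` and `drop_rate` values are all assumptions made for illustration. DARE randomly drops entries of a task vector and rescales the survivors, while TIES trims each task vector to its largest-magnitude entries, elects a sign per parameter, and averages only the entries that agree with that sign.

```python
import random


def task_vector(finetuned, base):
    """Task vector: fine-tuned weights minus base weights."""
    return [f - b for f, b in zip(finetuned, base)]


def dare(delta, drop_rate, rng):
    """DARE sketch: drop each entry with probability drop_rate,
    then rescale survivors by 1 / (1 - drop_rate)."""
    keep = 1.0 - drop_rate
    return [d / keep if rng.random() < keep else 0.0 for d in delta]


def ties_trim(delta, density):
    """TIES trim step: keep roughly the top `density` fraction of entries
    by magnitude (ties at the threshold are all kept), zero the rest."""
    k = max(1, round(len(delta) * density))
    threshold = sorted((abs(d) for d in delta), reverse=True)[k - 1]
    return [d if abs(d) >= threshold else 0.0 for d in delta]


def ties_merge(deltas, density):
    """TIES sketch: trim, elect a sign per parameter from the summed
    values, then average only the entries that agree with that sign."""
    trimmed = [ties_trim(d, density) for d in deltas]
    merged = []
    for column in zip(*trimmed):
        positive = sum(column) >= 0  # elected sign for this parameter
        agreeing = [v for v in column if v != 0.0 and (v > 0) == positive]
        merged.append(sum(agreeing) / len(agreeing) if agreeing else 0.0)
    return merged


# Toy weights (hypothetical values, four parameters per "model").
base = [1.0, 1.0, 1.0, 1.0]
model_a = [1.5, 0.8, 1.01, 1.3]
model_b = [1.4, 1.3, 0.98, 0.9]

deltas = [task_vector(model_a, base), task_vector(model_b, base)]

# TIES alone: merge the two task vectors and add the result to the base.
merged_delta = ties_merge(deltas, density=0.5)
merged_weights = [b + d for b, d in zip(base, merged_delta)]

# DARE as a sparsifying step on one task vector before merging.
sparse_a = dare(deltas[0], drop_rate=0.5, rng=random.Random(0))

print(merged_weights)
print(sparse_a)
```

In a DARE-TIES style merge, each task vector would first go through `dare` and the sparsified vectors would then be combined with the sign election step. Real merges operate per tensor on full checkpoints and usually apply per-model weights, which this sketch leaves out.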

Model size: 2.78B params (Safetensors, tensor type F32)

Model tree for netcat420/MFANN3bv0.16.11: 3 merges
