Knowledge Continuity Regularized Network

Trainer Hyperparameters:

  • lr = 5e-05
  • per_device_batch_size = 8
  • gradient_accumulation_steps = 2
  • weight_decay = 1e-09
  • seed = 42
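The trainer settings above can be collected into a plain config mapping; the names below are illustrative (the card does not state which trainer API was used), but the values are taken directly from the list, and the effective batch size follows from per-device batch size × gradient accumulation:

```python
# Trainer hyperparameters from the card, as a plain dict.
# Key names are an assumption; values are from the card.
trainer_config = {
    "learning_rate": 5e-05,
    "per_device_batch_size": 8,
    "gradient_accumulation_steps": 2,
    "weight_decay": 1e-09,
    "seed": 42,
}

# Effective (per-device) batch size = 8 * 2 = 16.
effective_batch_size = (
    trainer_config["per_device_batch_size"]
    * trainer_config["gradient_accumulation_steps"]
)
```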

Regularization Hyperparameters:

  • numerical stability denominator constant = 0.01
  • lambda = 10.0
  • alpha = 2.0
  • beta = 1.0
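The card gives the regularization constants but not the functional form of the knowledge-continuity penalty. The sketch below is therefore a hypothetical illustration of how such constants typically combine: `alpha` and `beta` are assumed to be exponents on a layer-to-layer difference and its normalizer, `lambda` the overall weight, and the 0.01 constant the denominator stabilizer. Only the numeric values come from the card.

```python
# Hypothetical sketch -- NOT the authors' definition of the penalty.
EPS = 0.01      # numerical stability denominator constant (from the card)
LAMBDA = 10.0   # overall regularization weight (from the card)
ALPHA = 2.0     # assumed exponent on the layer-to-layer difference
BETA = 1.0      # assumed exponent on the normalizing term

def continuity_penalty(layer_norms):
    """Penalize abrupt changes between consecutive per-layer activation
    norms. `layer_norms` is a list of scalars, one per layer."""
    total = 0.0
    for prev, curr in zip(layer_norms, layer_norms[1:]):
        total += abs(curr - prev) ** ALPHA / (abs(prev) ** BETA + EPS)
    return LAMBDA * total
```

A constant sequence of layer norms incurs zero penalty; the penalty grows with the squared jump between consecutive layers.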

Extended Logs:

epoch   eval_loss   eval_accuracy
 1.0     20.618       0.909
 2.0     31.373       0.895
 3.0     22.366       0.908
 4.0     25.879       0.905
 5.0     24.415       0.919
 6.0     21.310       0.914
 7.0     19.569       0.910
 8.0     22.761       0.918
 9.0     21.308       0.916
10.0     15.565       0.926
11.0     23.619       0.923
12.0     20.479       0.898
13.0     17.252       0.924
14.0     18.431       0.918
15.0     17.298       0.925
16.0     19.347       0.928
17.0     20.198       0.928
18.0     28.221       0.927
19.0     24.462       0.925
20.0     14.300       0.927
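As a quick sanity check on the log above, the following snippet (values copied verbatim from the table) locates the checkpoint with the lowest eval loss and the one with the highest eval accuracy:

```python
# (epoch, eval_loss, eval_accuracy) rows copied from the extended log.
log = [
    (1, 20.618, 0.909), (2, 31.373, 0.895), (3, 22.366, 0.908),
    (4, 25.879, 0.905), (5, 24.415, 0.919), (6, 21.310, 0.914),
    (7, 19.569, 0.910), (8, 22.761, 0.918), (9, 21.308, 0.916),
    (10, 15.565, 0.926), (11, 23.619, 0.923), (12, 20.479, 0.898),
    (13, 17.252, 0.924), (14, 18.431, 0.918), (15, 17.298, 0.925),
    (16, 19.347, 0.928), (17, 20.198, 0.928), (18, 28.221, 0.927),
    (19, 24.462, 0.925), (20, 14.300, 0.927),
]

best_loss_epoch = min(log, key=lambda r: r[1])[0]  # lowest eval_loss
best_acc_epoch = max(log, key=lambda r: r[2])[0]   # first highest accuracy
# Epoch 20 has the lowest eval_loss (14.300); accuracy peaks at 0.928,
# first reached at epoch 16.
```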