DeepSeek R1 Medical Reasoning

This model was fine-tuned for medical reasoning with Unsloth and Hugging Face's TRL library. Using Unsloth made training about 2x faster.

Model Details

  • Fine-tuning task: Medical reasoning with step-by-step chain-of-thought explanations
  • Training dataset: Medical reasoning dataset (500 examples)
  • Training metrics:
    • Final loss: 1.3269
    • Training runtime: 2191.2041 seconds
    • Total FLOPs: 4.01e+16
    • Epochs completed: 1.896
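The training metrics above can be turned into a few easier-to-read figures with simple arithmetic. The sketch below is only illustrative: the helper name `derive_metrics` is hypothetical, the FLOPs figure is taken at face value as the trainer's own estimate, and the examples-seen number assumes all 500 dataset examples were used for training (no held-out split), which the card does not confirm.

```python
# Values copied from the training metrics listed above.
TRAIN_RUNTIME_S = 2191.2041   # training runtime in seconds
TOTAL_FLOPS = 4.01e16         # total FLOPs as reported (an estimate)
EPOCHS_COMPLETED = 1.896      # fractional epochs completed
DATASET_SIZE = 500            # assumption: every example was used for training


def derive_metrics(runtime_s, total_flops, epochs, dataset_size):
    """Derive human-readable figures from the raw training metrics.

    Hypothetical helper for illustration; not part of Unsloth or TRL.
    """
    return {
        # Wall-clock training time in minutes.
        "runtime_min": runtime_s / 60,
        # Average throughput over the whole run, in TFLOP/s.
        "avg_tflops_per_s": total_flops / runtime_s / 1e12,
        # Approximate number of training examples processed.
        "examples_seen": epochs * dataset_size,
    }


metrics = derive_metrics(
    TRAIN_RUNTIME_S, TOTAL_FLOPS, EPOCHS_COMPLETED, DATASET_SIZE
)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Under these assumptions the run lasted roughly 36.5 minutes, averaged about 18.3 TFLOP/s, and processed on the order of 948 examples, i.e. a little under two passes over the dataset.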
Model size: 4.74B params (Safetensors)
Tensor types: FP16, F32, U8