Fine-tuned DeepSeek R1 Model for Medical Reasoning

This model is a fine-tuned version of DeepSeek-R1-Distill-Llama-8B, the Llama-based 8B distillation of DeepSeek R1, specialized for medical reasoning and clinical decision-making.

Training Details

Base Model: unsloth/DeepSeek-R1-Distill-Llama-8B
Training Data: Medical reasoning dataset (FreedomIntelligence/medical-o1-reasoning-SFT)
Fine-tuning Method: LoRA (Low-Rank Adaptation)
Training Parameters:
- Batch Size: 2
- Learning Rate: 2e-4
- Epochs: 1
- Max Sequence Length: 2048
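
As a rough illustration of how the parameters above might be wired into a LoRA run, the sketch below stores them in a small dataclass, estimates the number of optimizer updates, and builds a `peft` `LoraConfig`. Only the batch size, learning rate, epoch count, and sequence length come from this card. The LoRA rank, alpha, dropout, target modules, the dataset size, and the gradient accumulation factor are all assumptions chosen for illustration, and `TrainingParams`, `optimizer_steps`, and `build_lora_config` are hypothetical helper names, not the author's training script.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingParams:
    # Values taken from the Training Parameters list in this card.
    batch_size: int = 2
    learning_rate: float = 2e-4
    epochs: int = 1
    max_seq_length: int = 2048


def optimizer_steps(params: TrainingParams, num_examples: int, grad_accum: int = 1) -> int:
    """Rough estimate of optimizer updates for the whole run.

    num_examples and grad_accum are hypothetical inputs: the card does not
    state the split size or any gradient accumulation setting, and real
    trainers may round partial batches differently.
    """
    examples_per_step = params.batch_size * grad_accum
    return params.epochs * math.ceil(num_examples / examples_per_step)


def build_lora_config(rank: int = 16, alpha: int = 16):
    """Build a peft LoraConfig. Rank, alpha, dropout and targets are assumed."""
    from peft import LoraConfig  # lazy import; needs `pip install peft`

    return LoraConfig(
        r=rank,
        lora_alpha=alpha,
        lora_dropout=0.0,
        # Attention projection names used by Llama-family models.
        target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
        task_type="CAUSAL_LM",
    )


params = TrainingParams()
# 1000 is a made-up dataset size used only to exercise the helper.
print(optimizer_steps(params, num_examples=1000))  # prints 500
```

With a batch size of 2 and a single epoch, the number of updates is about half the number of training examples unless gradient accumulation is used.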

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the fine-tuned weights and the matching tokenizer from the Hugging Face Hub.
model = AutoModelForCausalLM.from_pretrained("Vedant101/fine-tune-deep-seek-r1")
tokenizer = AutoTokenizer.from_pretrained("Vedant101/fine-tune-deep-seek-r1")
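
Loading the model is only the first step. A minimal sketch of generating an answer to a clinical question might look like the following. The prompt template in `build_prompt` is an assumption, since this card does not document the format used during fine-tuning, and `build_prompt` and `generate_answer` are hypothetical helper names. The `max_new_tokens` default is likewise arbitrary.

```python
def build_prompt(question: str) -> str:
    """Wrap a clinical question in a simple instruction template (assumed format)."""
    return (
        "Below is a medical question. Reason step by step, then answer.\n\n"
        f"### Question:\n{question.strip()}\n\n"
        "### Response:\n"
    )


def generate_answer(question: str, max_new_tokens: int = 512) -> str:
    """Load the model from the Hub and generate a completion for one question."""
    # Lazy import so build_prompt can be used without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    repo_id = "Vedant101/fine-tune-deep-seek-r1"
    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    model = AutoModelForCausalLM.from_pretrained(repo_id)

    inputs = tokenizer(build_prompt(question), return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Drop the prompt tokens so only the newly generated text is returned.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

Calling `generate_answer("What are common causes of microcytic anemia?")` downloads the weights on first use, so it needs enough memory for an 8B-parameter model. Reloading the model on every call is wasteful, so a longer-lived script would load it once and reuse it.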

