---
license: apache-2.0
datasets:
- eltorio/ROCO-radiology
language:
- en
- fr
base_model:
- HuggingFaceM4/Idefics3-8B-Llama3
pipeline_tag: image-to-text
---

# IDEFICS3_ROCO

![Stage](https://img.shields.io/badge/stage-early%20development-yellow)![License](https://img.shields.io/badge/license-Apache%202.0-blue)![Contributors Welcome](https://img.shields.io/badge/contributors-welcome-brightgreen)[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/#fileId=https://huggingface.co/eltorio/IDEFICS3_ROCO/blob/main/ROCO-idefics3.ipynb)

## A Fine-tuned Radiology-focused Model based on Hugging Face's Idefics3 Model

This repository contains a fine-tuned version of the Hugging Face [Idefics3-8B-Llama3](https://huggingface.co/HuggingFaceM4/Idefics3-8B-Llama3) model, which is itself built on top of Meta Llama 3.1 8B. Our model, `IDEFICS3_ROCO`, has been fine-tuned on the [Radiology Objects in Context (ROCO)](https://huggingface.co/datasets/eltorio/ROCO-radiology) dataset, a large-scale multimodal medical imaging collection.

### Model Information

* **Base Model:** Idefics3-8B-Llama3
* **Fine-tuning Dataset:** Radiology Objects in Context (ROCO)
* **License:** Apache-2.0
* **Current Status:** Fine-tuning is currently halted at checkpoint 2350 of 12,267 (in branch `bug-restart`) because of the limits of the free Colab T4 GPU tier. Contributions to complete the fine-tuning process are welcome!
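
As a starting point for trying the partially trained weights, here is a minimal sketch. It assumes that this repository stores a PEFT adapter on top of the base model and that the adapter files sit at the root of the chosen branch; neither is confirmed by this card. The function name `load_model` and the `float16` choice are illustrative only.

```python
BASE_MODEL_ID = "HuggingFaceM4/Idefics3-8B-Llama3"  # base model named in this card
ADAPTER_REPO_ID = "eltorio/IDEFICS3_ROCO"           # this repository


def load_model(adapter_revision: str = "main"):
    """Load the base model and attach the adapter (hypothetical helper).

    Assumption: the repository holds a PEFT adapter at the root of
    `adapter_revision`. Heavy dependencies are imported lazily so that the
    module can be imported without them installed.
    """
    import torch
    from peft import PeftModel
    from transformers import AutoProcessor, Idefics3ForConditionalGeneration

    # The processor handles both image preprocessing and tokenization.
    processor = AutoProcessor.from_pretrained(BASE_MODEL_ID)

    # float16 is an illustrative choice; an 8B model in half precision
    # needs roughly 16 GB for the weights alone.
    base = Idefics3ForConditionalGeneration.from_pretrained(
        BASE_MODEL_ID, torch_dtype=torch.float16
    )

    # Attach the fine-tuned adapter weights to the base model.
    model = PeftModel.from_pretrained(
        base, ADAPTER_REPO_ID, revision=adapter_revision
    )
    return processor, model
```

Calling `load_model("bug-restart")` would target the branch mentioned above, again assuming the adapter is stored there in a loadable layout.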

### Training Progress Status

* Current checkpoint: 2350/12267 (~19% completed) (in branch bug-restart)
* Estimated remaining GPU time: ~57 hours
* Hardware requirements: T4-class GPU with 16GB of VRAM or more
* Last update: November 8, 2024
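
The figures above can be cross-checked with a few lines of arithmetic. The sketch below uses only the numbers in this list; the seconds-per-step value is derived from the 57-hour estimate and is not a measured benchmark.

```python
CURRENT_STEP = 2350      # checkpoint reached so far
TOTAL_STEPS = 12_267     # total steps planned
REMAINING_HOURS = 57     # estimate quoted in this card

completed = CURRENT_STEP / TOTAL_STEPS
remaining_steps = TOTAL_STEPS - CURRENT_STEP
# Pace implied by the estimate, not an independent measurement.
seconds_per_step = REMAINING_HOURS * 3600 / remaining_steps

print(f"completed: {completed:.1%}")                      # completed: 19.2%
print(f"remaining steps: {remaining_steps}")              # remaining steps: 9917
print(f"implied pace: {seconds_per_step:.1f} s/step")     # implied pace: 20.7 s/step
```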

### Fine-tuning Code

The fine-tuning code is available as a Jupyter Notebook in the [ROCO-radiology dataset repository](https://huggingface.co/datasets/eltorio/ROCO-radiology) on Hugging Face:

* [ROCO-idefics3.ipynb](https://huggingface.co/eltorio/IDEFICS3_ROCO/blob/main/ROCO-idefics3.ipynb)

The [Jupyter Notebook](https://colab.research.google.com/#fileId=https%3A//huggingface.co/eltorio/IDEFICS3_ROCO/blob/main/ROCO-idefics3.ipynb) [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/#fileId=https://huggingface.co/eltorio/IDEFICS3_ROCO/blob/main/ROCO-idefics3.ipynb) contains the code to fine-tune the Idefics3-8B-Llama3 model on the ROCO dataset. The fine-tuning process is currently halted at checkpoint 2350 (out of 12,267, in branch bug-restart) due to the limitations of the free Colab T4 GPU. Contributions to complete the fine-tuning process are welcome!

### Contributions Welcome

If you have the resources to complete the fine-tuning process, we would appreciate your contribution. Please fork this repository, finish the fine-tuning process, and submit a pull request with your updates.

### Citation

If you use this model in your work, please cite the original Idefics3 model and our fine-tuned model:

* [Idefics3-8B-Llama3](https://huggingface.co/HuggingFaceM4/Idefics3-8B-Llama3)
* [IDEFICS3_ROCO](https://huggingface.co/eltorio/IDEFICS3_ROCO)

### Contribution Guide

1. **Technical Requirements**
   * Access to a powerful GPU (T4, V100, A100 or equivalent)
   * Python environment with PyTorch
   * Disk space: ~100GB

2. **Getting Started**
   * Fork the repository
   * Resume from checkpoint 2350/12267 (in branch bug-restart)
   * Follow instructions in [ROCO-idefics3.ipynb](https://huggingface.co/eltorio/IDEFICS3_ROCO/blob/main/ROCO-idefics3.ipynb) [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/#fileId=https://huggingface.co/eltorio/IDEFICS3_ROCO/blob/main/ROCO-idefics3.ipynb)

3. **Contact**
   * For questions: [open a discussion](https://huggingface.co/eltorio/IDEFICS3_ROCO/discussions)
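
The "Getting Started" steps can be sketched in Python as follows. This is an illustration, not the notebook's own procedure: it assumes the `bug-restart` branch stores checkpoints in directories named `checkpoint-<step>` (the Transformers `Trainer` convention), and the helper names `fetch_branch` and `latest_checkpoint` are hypothetical.

```python
import re
from pathlib import Path
from typing import Optional

REPO_ID = "eltorio/IDEFICS3_ROCO"
BRANCH = "bug-restart"  # branch named in this card

_CHECKPOINT_RE = re.compile(r"^checkpoint-(\d+)$")


def latest_checkpoint(root: Path) -> Optional[Path]:
    """Return the `checkpoint-<step>` directory with the highest step, if any.

    Assumption: checkpoints follow the `checkpoint-<step>` naming scheme.
    """
    best_step, best_dir = -1, None
    for child in root.iterdir():
        match = _CHECKPOINT_RE.match(child.name)
        if child.is_dir() and match and int(match.group(1)) > best_step:
            best_step, best_dir = int(match.group(1)), child
    return best_dir


def fetch_branch(local_dir: str = "IDEFICS3_ROCO") -> Path:
    """Download the `bug-restart` branch of the repository to `local_dir`."""
    # Imported lazily so the helper above works without huggingface_hub.
    from huggingface_hub import snapshot_download

    return Path(
        snapshot_download(repo_id=REPO_ID, revision=BRANCH, local_dir=local_dir)
    )
```

If the notebook trains with the Transformers `Trainer`, the path returned by `latest_checkpoint(fetch_branch())` could then be passed as `trainer.train(resume_from_checkpoint=str(path))`.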

### Docker Image

An AI training Docker image is available for this model. It includes all the dependencies needed to run the fine-tuning process and is available on Docker Hub:

```bash
docker run --user=42420:42420 -it sctg/roco-idefics3:latest /start.sh hf_TOKEN
```

The Dockerfile is available in the [IDEFICS3_ROCO repository](https://huggingface.co/eltorio/IDEFICS3_ROCO/blob/main/Dockerfile).

### Acknowledgments

This work was made possible by the [Hugging Face Transformers](https://huggingface.co/) library and the [ROCO-radiology dataset](https://huggingface.co/datasets/eltorio/ROCO-radiology).