---
license: mit
---
# Official ICC model

The official checkpoint of the ICC model, introduced in [ICC: Quantifying Image Caption Concreteness for Multimodal Dataset Curation](https://arxiv.org/abs/2403.01306).

[Project Page](https://moranyanuka.github.io/icc/)

## Usage

The ICC model assigns a concreteness score to an image caption (or to any sentence). In the example below, the more concrete caption receives the higher score.


### Running the model

<details>
<summary> Click to expand </summary>

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Use a GPU when one is available, otherwise fall back to the CPU
device = "cuda" if torch.cuda.is_available() else "cpu"

tokenizer = AutoTokenizer.from_pretrained("moranyanuka/icc")
model = AutoModelForSequenceClassification.from_pretrained("moranyanuka/icc").to(device)

captions = ["a great method of quantifying concreteness", "a man with a white shirt"]
# Pad to the longest caption in the batch and truncate overly long inputs
text_ids = tokenizer(captions, padding=True, return_tensors="pt", truncation=True).to(device)
with torch.inference_mode():
    icc_scores = model(**text_ids)["logits"]  # one score per caption

# tensor([[0.0339], [1.0068]])
```
</details>
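Since the model is intended for dataset curation, a natural next step is to rank or filter captions by their score. The following is a minimal sketch of that idea, not the authors' curation pipeline: the helper names `rank_by_concreteness` and `filter_by_concreteness` and the `0.5` threshold are hypothetical, chosen only for illustration. It assumes that a higher score means a more concrete caption, as the example output above suggests.

```python
from typing import List, Sequence, Tuple


def rank_by_concreteness(
    captions: Sequence[str], scores: Sequence[float]
) -> List[Tuple[str, float]]:
    """Pair each caption with its score, highest score first."""
    if len(captions) != len(scores):
        raise ValueError("captions and scores must have the same length")
    return sorted(zip(captions, scores), key=lambda pair: pair[1], reverse=True)


def filter_by_concreteness(
    captions: Sequence[str], scores: Sequence[float], threshold: float = 0.5
) -> List[str]:
    """Keep captions scoring at least `threshold` (an assumed, untuned cutoff)."""
    if len(captions) != len(scores):
        raise ValueError("captions and scores must have the same length")
    return [caption for caption, score in zip(captions, scores) if score >= threshold]


# Scores copied from the example output above. With a loaded model they could
# be obtained as a flat list via `icc_scores.squeeze(-1).tolist()`.
captions = ["a great method of quantifying concreteness", "a man with a white shirt"]
scores = [0.0339, 1.0068]

print(filter_by_concreteness(captions, scores, threshold=0.5))
# ['a man with a white shirt']
```

The threshold would need to be tuned for a given dataset; the value here only separates the two example captions.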



## Citation
```
@misc{yanuka2024icc,
      title={ICC: Quantifying Image Caption Concreteness for Multimodal Dataset Curation}, 
      author={Moran Yanuka and Morris Alper and Hadar Averbuch-Elor and Raja Giryes},
      year={2024},
      eprint={2403.01306},
      archivePrefix={arXiv},
      primaryClass={cs.LG}
}
```