---
library_name: transformers
license: mit
datasets:
- grascii/gregg-preanniversary-words
pipeline_tag: image-to-text
tags:
- gregg
- shorthand
- stenography
---

# Gregg Vision v0.2.1

Gregg Vision v0.2.1 generates a [Grascii](https://github.com/grascii/grascii) representation of a Gregg Shorthand form.

- **Model type:** Vision Encoder Text Decoder
- **License:** MIT
- **Repository:** [GitHub](https://github.com/grascii/gregg-vision-v0.2.1)
- **Demo:** [Grascii Search Space](https://huggingface.co/spaces/grascii/search)

## Uses

Use the code below to get started with the model.

```python
from transformers import AutoModelForVision2Seq, AutoImageProcessor, AutoTokenizer
from PIL import Image
import numpy as np


model_id = "grascii/gregg-vision-v0.2.1"
model = AutoModelForVision2Seq.from_pretrained(model_id)
processor = AutoImageProcessor.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)


def generate_grascii(image: Image.Image) -> str:
    # convert the image to a single channel (grayscale)
    grayscale = image.convert("L")

    # wrap the grayscale image in a batch of one for the processor
    images = np.array([grayscale])

    # preprocess the batch into model-ready pixel values
    pixel_values = processor(images, return_tensors="pt").pixel_values

    # generate grascii token ids
    ids = model.generate(pixel_values, max_new_tokens=12)[0]

    # decode the ids and return the grascii string
    return tokenizer.decode(ids, skip_special_tokens=True)
```
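
For example, continuing from the snippet above (a minimal usage sketch; `shorthand_form.png` is a hypothetical path standing in for any image of a single shorthand form):

```python
# load an image containing one shorthand form (hypothetical example path)
image = Image.open("shorthand_form.png")

# print the model's grascii transcription of the form
print(generate_grascii(image))
```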

Note: As of `transformers` v4.47.0, the model is incompatible with `pipeline` due to the model's single-channel image input.

|
65 |
|
|
|
77 |
|
78 |
### Training Hardware
|
79 |
|
80 |
+
Gregg Vision v0.2.1 was trained using 1xT4.
|