---
library_name: transformers
license: mit
datasets:
- grascii/gregg-preanniversary-words
pipeline_tag: image-to-text
tags:
- gregg
- shorthand
- stenography
---

# Gregg Vision v0.2.1

Gregg Vision v0.2.1 generates a [Grascii](https://github.com/grascii/grascii) representation of a Gregg Shorthand form.

- **Model type:** Vision Encoder Text Decoder
- **License:** MIT
- **Repository:** [GitHub](https://github.com/grascii/gregg-vision-v0.2.1)
- **Demo:** [Grascii Search Space](https://huggingface.co/spaces/grascii/search)

## Uses

Use the code below to get started with the model.

```python
from transformers import AutoModelForVision2Seq, AutoImageProcessor, AutoTokenizer
from PIL import Image
import numpy as np


model_id = "grascii/gregg-vision-v0.2.1"
model = AutoModelForVision2Seq.from_pretrained(model_id)
processor = AutoImageProcessor.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)


def generate_grascii(image: Image.Image) -> str:
    # convert the image to a single channel (grayscale)
    grayscale = image.convert("L")

    # wrap the grayscale image in a batch of one for the processor
    images = np.array([grayscale])

    # preprocess the batch into model-ready pixel values
    pixel_values = processor(images, return_tensors="pt").pixel_values

    # generate grascii token ids
    ids = model.generate(pixel_values, max_new_tokens=12)[0]

    # decode the ids and return the grascii string
    return tokenizer.decode(ids, skip_special_tokens=True)
```
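
For example, continuing from the snippet above (a minimal usage sketch; `shorthand_form.png` is a hypothetical path standing in for any image of a single shorthand form):

```python
# load an image containing one shorthand form (hypothetical example path)
image = Image.open("shorthand_form.png")

# print the model's grascii transcription of the form
print(generate_grascii(image))
```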

Note: As of `transformers` v4.47.0, the model is incompatible with `pipeline` due to the model's single-channel image input.

|
65 |
|
|
|
77 |
|
78 |
### Training Hardware
|
79 |
|
80 |
+
Gregg Vision v0.2.1 was trained using 1xT4.
|