bytetriper
/

vit-mae-r

bytetriper committed on Jun 11

Commit

a6e101e

•

1 Parent(s): d798e90

Update README.md

Files changed (1)

README.md +57 -3

@@ -1,3 +1,57 @@
----
-license: apache-2.0
----

+---
+license: apache-2.0
+language:
+- en
+pipeline_tag: image-to-image
+---
+# Model Card for VIT-MAE-r
+VIT-MAE-r is a fine-tuned version of MAE for image reconstruction. We release a version fine-tuned from [MAE-Large](https://huggingface.co/facebook/vit-mae-large).
+## Model Details
+VIT-MAE-r has already been converted to the Hugging Face format and can be loaded directly with the `from_pretrained` method.
+### Model Sources
+<!-- Provide the basic links for the model. -->
+- **Repository:** [More Information Needed]
+- **Paper [optional]:** [LM4LV: A Frozen Large Language Model for Low-level Vision Tasks](https://arxiv.org/abs/2405.15734v1)
+- **Source model:** [MAE-Large](https://huggingface.co/facebook/vit-mae-large)
+## How to Get Started with the Model
+Use the code below to get started with the model.
+```python
+from transformers import AutoImageProcessor, AutoModelForPreTraining
+model = AutoModelForPreTraining.from_pretrained("bytetriper/vit-mae-r")
+```
+## Evaluation
+This model achieves an rFID of 1.24 on the ImageNet validation set, evaluated using the standard TensorFlow evaluation tool provided by [Guided-Diffusion](https://github.com/openai/guided-diffusion/tree/main/evaluations).
+## Citation [optional]
+<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
+**BibTeX:**
+@article{zheng2024lm4lv,
+  title={LM4LV: A Frozen Large Language Model for Low-level Vision Tasks},
+  author={Zheng, Boyang and Gu, Jinjin and Li, Shijun and Dong, Chao},
+  journal={arXiv preprint arXiv:2405.15734},
+  year={2024}
+}
+## Model Card Authors [optional]
+Boyang Zheng
+## Model Card Contact
+[email protected]
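
The getting-started snippet in the README only loads the model. A minimal sketch of a full reconstruction pass might look like the following. Several things here are assumptions rather than facts from the model card: that the checkpoint resolves to `ViTMAEForPreTraining` (which provides `unpatchify`), that preprocessing matches the source model's processor at `facebook/vit-mae-large`, that reconstruction should run with `mask_ratio=0.0`, and that the decoder predicts normalized pixel values. The helper names `denormalize` and `reconstruct` are hypothetical.

```python
def denormalize(x, mean, std):
    """Invert per-channel normalization: maps (x - mean) / std back to x."""
    return x * std + mean


def reconstruct(image_path, model_id="bytetriper/vit-mae-r",
                processor_id="facebook/vit-mae-large"):
    """Run one image through the autoencoder and return the result as a PIL image."""
    # Heavy dependencies are imported lazily so `denormalize` stays usable without them.
    import torch
    from PIL import Image
    from transformers import AutoImageProcessor, AutoModelForPreTraining

    # Assumption: the source model's preprocessing also applies to this checkpoint.
    processor = AutoImageProcessor.from_pretrained(processor_id)
    # Assumption: no patches should be masked when reconstructing a full image.
    model = AutoModelForPreTraining.from_pretrained(model_id, mask_ratio=0.0)
    model.eval()

    image = Image.open(image_path).convert("RGB")
    inputs = processor(images=image, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)

    # logits: (batch, num_patches, patch_size**2 * 3) -> (batch, 3, height, width)
    pixels = model.unpatchify(outputs.logits)

    # Assumption: predictions live in the processor's normalized pixel space.
    mean = torch.tensor(processor.image_mean).view(1, 3, 1, 1)
    std = torch.tensor(processor.image_std).view(1, 3, 1, 1)
    pixels = denormalize(pixels, mean, std).clamp(0, 1)

    # Convert the first batch item to an H x W x 3 uint8 array for PIL.
    array = (pixels[0].permute(1, 2, 0) * 255).round().to(torch.uint8).numpy()
    return Image.fromarray(array)
```

With `torch`, `Pillow`, and `transformers` installed, `reconstruct("example.jpg").save("recon.png")` would write the reconstruction to disk, where `example.jpg` is a placeholder path. If the output looks washed out or blocky, the normalization or masking assumptions above are the first things to check against the repository's config files.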