Update app.py
app.py CHANGED
@@ -52,10 +52,10 @@ gradio.Interface(fn=classify,
                  layout='horizontal',
                  title='Indoor Scene Recognition',
                  description='A smart and easy-to-use indoor scene classifier. Start by uploading an input image of an indoor scene. The outputs are the top five indoor scene classes that best describe your input image.',
-                 article='''<h2>Additional Information</h2><p style='text-align: justify'>This indoor scene classifier employs the <b><a href='https://huggingface.co/google/vit-base-patch16-224-in21k' target='_blank'>google/vit-base-patch16-224-in21k</a></b>, a <b>
-                 <p style='text-align: justify'>For further research on the
+                 article='''<h2>Additional Information</h2><p style='text-align: justify'>This indoor scene classifier employs <b><a href='https://huggingface.co/google/vit-base-patch16-224-in21k' target='_blank'>google/vit-base-patch16-224-in21k</a></b>, a <b>Vision Transformer (ViT)</b> model pre-trained on the <b><a href='https://github.com/Alibaba-MIIL/ImageNet21K' target='_blank'>ImageNet-21k</a></b> data set (14 million images, 21,843 classes) at a resolution of 224x224 pixels and first introduced in the paper <b><a href='https://arxiv.org/abs/2010.11929' target='_blank'>An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale</a></b> by Dosovitskiy et al. It was then fine-tuned on the <b><a href='https://www.kaggle.com/itsahmad/indoor-scenes-cvpr-2019' target='_blank'>MIT Indoor Scenes</a></b> data set from Kaggle. The source model used in this Space is <b><a href='https://huggingface.co/vincentclaes/mit-indoor-scenes' target='_blank'>vincentclaes/mit-indoor-scenes</a></b>.</p>
+                 <p style='text-align: justify'>For further research on the Vision Transformer, see the original GitHub repository at <b><a href='https://github.com/google-research/vision_transformer' target='_blank'>this link</a></b>.</p>
                  <h2>Disclaimer</h2>
-                 <p style='text-align: justify'>The team releasing the
+                 <p style='text-align: justify'>The team releasing the Vision Transformer did not write a model card for it. Hence, the model card in the Hugging Face Models library was written by the Hugging Face team.</p>
                  <h2>Limitations</h2>
                  <p style='text-align: justify'>The model was trained only on 67 classes (indoor scenes). Hence, the model should perform better if the input indoor scene image belongs to one of the target classes it was trained on.</p>''',
                  allow_flagging='never').launch()
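Only lines 52-61 of app.py appear in this diff. For context, the following is a minimal sketch of what the full script plausibly looks like, assuming the classify function wraps the transformers image-classification pipeline around the vincentclaes/mit-indoor-scenes checkpoint linked in the article text. The pipeline call and the inputs/outputs arguments are assumptions, not the Space's exact code; the gradio 2.x-style API is implied by the layout= and allow_flagging= arguments visible in the hunk.

import gradio
from transformers import pipeline

# Assumed: the fine-tuned ViT checkpoint named in the article text above.
classifier = pipeline('image-classification', model='vincentclaes/mit-indoor-scenes')

def classify(image):
    # The pipeline returns the top five {'label', 'score'} pairs by default,
    # matching the "top five indoor scene classes" promised in the description.
    predictions = classifier(image)
    return {p['label']: p['score'] for p in predictions}

gradio.Interface(fn=classify,
                 inputs=gradio.inputs.Image(type='pil'),
                 outputs=gradio.outputs.Label(num_top_classes=5),
                 layout='horizontal',
                 title='Indoor Scene Recognition',
                 description='A smart and easy-to-use indoor scene classifier. Start by uploading an input image of an indoor scene. The outputs are the top five indoor scene classes that best describe your input image.',
                 article='''...''',  # the full HTML article added in the + lines above
                 allow_flagging='never').launch()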