roborovski committed on
Commit 434bfb2
1 Parent(s): d1b25b9

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +11 -13
README.md CHANGED
@@ -3,17 +3,14 @@ language:
  - en
  ---
 
- # 🌔 moondream1
 
- 1.6B parameter model built by [@vikhyatk](https://x.com/vikhyatk) using SigLIP, Phi-1.5 and the LLaVa training dataset.
- The model is release for research purposes only, commercial use is not allowed.
-
- Try it out on [Huggingface Spaces](https://huggingface.co/spaces/vikhyatk/moondream1)!
 
  **Usage**
 
- ```
- pip install transformers timm einops
  ```
 
  ```python
@@ -35,10 +32,11 @@ print(tokenizer.decode(outputs[0]))
 
  ## Examples
 
- | Image | Examples |
  | --- | --- |
- | ![](assets/demo-1.jpg) | **What is the title of this book?**<br>The Little Book of Deep Learning<br><br>**What can you tell me about this book?**<br>The book in the image is titled "The Little Book of Deep Learning." It appears to be a guide or manual that provides information and instructions on the subject of deep learning. The book is described as being small and thin, making it easy to carry and read. It is set in a room with a view outside, suggesting that the person holding the book might be in a living or working space with a view. The content of the book is likely to be complex, as the title implies a focus on advanced concepts in deep learning.<br><br>**Where is the person standing?**<br> The person is standing on a balcony or deck, which is outside. |
- | ![](assets/demo-2.jpg) | **What type of food is the girl holding?**<br>The girl is holding a hamburger.<br><br>**What color is the woman's hair?**<br>The woman's hair is white.<br><br>**What is the girl doing in the image?**<br>The girl in the image is eating a hamburger. |
- | ![](assets/demo-3.jpg) | **What color is the train?**<br>The train is red.<br><br>**What is the bus doing in the image?**<br> The bus in the image is driving down a city street, passing through an intersection.<br><br>**What kind of public transportation is in the image?**<br>The image features a red train or trolley as the public transportation option. |
- | ![](assets/demo-4.jpg) | **What is the girl looking at?**<br>The girl is looking at her reflection in the mirror.<br><br>**What color is the woman's hair?**<br>The woman's hair is brown.<br><br>**What is the woman doing in the mirror?**<br>The woman is adjusting her hair in the mirror. |
- | ![](assets/demo-5.jpg) | **What might be the reason for the dog's aggressive behavior?**<br>The dog's aggressive behavior, including its growling, baring its teeth, and showing its sharp teeth, could be due to various reasons. It is possible that the dog is feeling threatened, scared, or defensive about something in its environment, such as another animal, a loud noise, or a person nearby. Alternatively, the dog might be reacting to a command it received from its owner, or it could be experiencing discomfort or pain, which is causing it to display aggressive behavior. It is essential to observe the dog's body language and context to determine the underlying cause of its aggressive behavior and respond accordingly to address its needs or concerns.<br><br>**What kind of dog is in the picture?**<br>The dog in the picture is a small, black and white chihuahua. |
 
  - en
  ---
 
+ # SuperPrompt
 
+ A T5 model fine-tuned on the SuperPrompt dataset for upsampling text prompts to more detailed descriptions. Meant to be used as a pre-generation step for text-to-image models that benefit from more detailed prompts.
 
  **Usage**
 
+ ```bash
+ pip install transformers
  ```
 
  ```python
 
 
  ## Examples
 
+ Sample prompts taken from the [DrawBench](https://imagen.research.google/) dataset.
+
+ | Prompt | Upsampled |
  | --- | --- |
+ | A storefront with 'Hello World' written on it. | The storefront of "Hello World" is painted in vibrant shades of orange and yellow, with the words "Hello World" written in bold, cursive letters across the front. The storefront is surrounded by a bustling cityscape, with skyscrapers and neon signs lining the walls. The storefront is surrounded by a sea of people, all gazing up at the sky with a sense of excitement and excitement.
+ | An old photograph of a 1920s airship shaped like a pig, floating over a wheat field. | In a surreal and whimsical scene, a vintage airplane shaped like a pig glides effortlessly over a vast wheat field. The pig's eyes are closed, and its tail swishes back and forth as it floats effortlessly over the landscape. The sun casts a warm glow on the scene, illuminating the intricate details of the pig's fur and the surrounding landscape. The airship's engines hum softly, a testament to the power of imagination and imagination.
+ | A painting by Grant Wood of an astronaut couple, american gothic style. | In this surreal masterpiece, Grant Wood depicts a lone astronaut, dressed in a sleek silver spacesuit, gazing out at the vast expanse of space. The astronaut's eyes are closed, and his hands are clasped tightly around his body. The scene is set against a backdrop of swirling, distorted clouds, reminiscent of the American gothic style. The painting is a true masterpiece, capturing the essence of the human spirit and the beauty of the universe.
+ | A sheep to the right of a wine glass. | A majestic white sheep with a wagging tail stands to the right of a sparkling wine glass, her long ears twitching as she gazes intently at the glass. The sun is setting in the background, casting a warm orange glow on the scene. The scene is set in a cozy living room, with a fireplace and a wooden table in the background.