mattb512 committed on
Commit
a3a3fdb
•
2 Parent(s): 1db133b e71c8dc

Merge pull request #3 from TRI-ML/master

Files changed (4)
  1. .gitignore +2 -0
  2. README.md +34 -4
  3. interactive_demo.py +3 -3
  4. serve/gradio_web_server.py +5 -6
.gitignore CHANGED
@@ -103,6 +103,8 @@ celerybeat.pid
103
 
104
  # Logs
105
  serve_images/
 
 
106
 
107
  # Environments
108
  .env
 
103
 
104
  # Logs
105
  serve_images/
106
+ *conv.json
107
+ *controller.log*
108
 
109
  # Environments
110
  .env
README.md CHANGED
@@ -7,7 +7,8 @@ app_file: serve/gradio_web_server.py
7
 
8
  # VLM Demo
9
 
10
- > *VLM Demo*: Lightweight repo for chatting with models loaded into *VLM Bench*.
 
11
 
12
  ---
13
 
@@ -30,15 +31,21 @@ installed in the current environment. Installation instructions can be found
30
  The main script to run is `interactive_demo.py`, while the implementation of
31
  the Gradio Controller (`serve/gradio_controller.py`) and Gradio Web Server
32
  (`serve/gradio_web_server.py`) are within `serve`. All of this code is heavily
33
- adapted from the [LLaVA Github Repo:](https://github.com/haotian-liu/LLaVA/blob/main/llava/serve/).
34
  More details on how this code was modified from the original LLaVA repo is provided in the
35
  relevant source files.
36
 
37
- To run the demo, run the following commands:
38
 
39
  + Start Gradio Controller: `python -m serve.controller --host 0.0.0.0 --port 10000`
40
  + Start Gradio Web Server: `python -m serve.gradio_web_server --controller http://localhost:10000 --model-list-mode reload --share`
41
- + Run interactive demo: `CUDA_VISIBLE_DEVICES=0 python -m interactive_demo --port 40000 --model_dir <PATH TO MODEL CKPT>`
42
 
43
  When running the demo, the following parameters are adjustable:
44
  + Temperature
@@ -55,6 +62,29 @@ prompt.
55
  + True/False Question Answering: Selecting this option is best when the user wants a True/False answer to a specific question provided in the
56
  prompt.
57
 
58
 
59
  ## Contributing
60
 
 
7
 
8
  # VLM Demo
9
 
10
+ > *VLM Demo*: Lightweight repo for chatting with VLMs supported by our
11
+ [VLM Evaluation Suite](https://github.com/TRI-ML/vlm-evaluation/tree/main).
12
 
13
  ---
14
 
 
31
  The main script to run is `interactive_demo.py`, while the implementation of
32
  the Gradio Controller (`serve/gradio_controller.py`) and Gradio Web Server
33
  (`serve/gradio_web_server.py`) are within `serve`. All of this code is heavily
34
+ adapted from the [LLaVA Github Repo](https://github.com/haotian-liu/LLaVA/blob/main/llava/serve/).
35
  More details on how this code was modified from the original LLaVA repo is provided in the
36
  relevant source files.
37
 
38
+ To run the demo, first run the following commands in separate terminals:
39
 
40
  + Start Gradio Controller: `python -m serve.controller --host 0.0.0.0 --port 10000`
41
  + Start Gradio Web Server: `python -m serve.gradio_web_server --controller http://localhost:10000 --model-list-mode reload --share`
42
+
43
+ To run the interactive demo, you can specify a model to chat with via a `model_dir` or `model_id` as follows:
44
+
45
+ + `python -m interactive_demo --port 40000 --model_id <MODEL_ID>` OR
46
+ + `python -m interactive_demo --port 40000 --model_dir <MODEL_DIR>`
47
+
48
+ If you want to chat with multiple models simultaneously, you can launch the `interactive_demo` script in different terminals.
49
 
50
  When running the demo, the following parameters are adjustable:
51
  + Temperature
 
62
  + True/False Question Answering: Selecting this option is best when the user wants a True/False answer to a specific question provided in the
63
  prompt.
64
 
65
+ ## Example
66
+
67
+ To chat with the LLaVA 1.5 (7B) and Prism 7B models in an interactive GUI, run the following commands in separate terminals.
68
+
69
+ Launch gradio controller:
70
+
71
+ `python -m serve.controller --host 0.0.0.0 --port 10000`
72
+
73
+ Launch web server:
74
+
75
+ `python -m serve.gradio_web_server --controller http://localhost:10000 --model-list-mode reload --share`
76
+
77
+ Now we can launch an interactive demo corresponding to each of the models we want to chat with. For Prism models, you
78
+ only need to specify a `model_id`, while for LLaVA and InstructBLIP, you need to additionally specify a `model_family`
79
+ and `model_dir`. Note that for each model, a different port must be specified.
80
+
81
+ Launch interactive demo for Prism 7B Model:
82
+
83
+ `python -m interactive_demo --port 40000 --model_id prism-dinosiglip+7b`
84
+
85
+ Launch interactive demo for LLaVA 1.5 7B Model:
86
+
87
+ `python -m interactive_demo --port 40001 --model_family llava-v15 --model_id llava-v1.5-7b --model_dir liuhaotian/llava-v1.5-7b`
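
As a rough sketch (not part of this repo), the launch commands from the example above could be assembled in Python before being run in separate terminals. The helper names `demo_cmd` and `check_unique_ports` are hypothetical; the flags and values are the ones shown in this README, and nothing here is verified against the actual argument parser of `interactive_demo.py`.

```python
import shlex

# Commands copied from the README example; the helpers below are hypothetical.
CONTROLLER_CMD = ["python", "-m", "serve.controller", "--host", "0.0.0.0", "--port", "10000"]
WEB_SERVER_CMD = [
    "python", "-m", "serve.gradio_web_server",
    "--controller", "http://localhost:10000",
    "--model-list-mode", "reload", "--share",
]


def demo_cmd(port, model_id=None, model_family=None, model_dir=None):
    """Build one `interactive_demo` command from the flags shown in the README."""
    if model_id is None and model_dir is None:
        raise ValueError("specify at least one of model_id or model_dir")
    cmd = ["python", "-m", "interactive_demo", "--port", str(port)]
    for flag, value in (("--model_family", model_family),
                        ("--model_id", model_id),
                        ("--model_dir", model_dir)):
        if value is not None:
            cmd += [flag, value]
    return cmd


def check_unique_ports(ports):
    """Raise if two demo workers would share a port."""
    if len(set(ports)) != len(ports):
        raise ValueError(f"each model needs its own port, got {ports}")


if __name__ == "__main__":
    workers = [
        {"port": 40000, "model_id": "prism-dinosiglip+7b"},
        {"port": 40001, "model_family": "llava-v15", "model_id": "llava-v1.5-7b",
         "model_dir": "liuhaotian/llava-v1.5-7b"},
    ]
    check_unique_ports([w["port"] for w in workers])
    for cmd in [CONTROLLER_CMD, WEB_SERVER_CMD] + [demo_cmd(**w) for w in workers]:
        print(shlex.join(cmd))  # run each printed command in its own terminal
```

The sketch only prints the commands rather than spawning processes, since each one is a long-running server that the README expects to live in its own terminal.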
88
 
89
  ## Contributing
90
 
interactive_demo.py CHANGED
@@ -1,7 +1,7 @@
1
  """
2
  interactive_demo.py
3
 
4
- Entry point for all VLM-Bench interactive demos; specify model and get a gradio UI where you can chat with it!
5
 
6
  This file is heavily adapted from the script used to serve models in the LLaVa repo:
7
  https://github.com/haotian-liu/LLaVA/blob/main/llava/serve/model_worker.py. It is
@@ -30,8 +30,8 @@ from llava.mm_utils import load_image_from_base64
30
  from llava.utils import server_error_msg
31
  from torchvision.transforms import Compose
32
 
33
- from vlbench.models import load_vlm
34
- from vlbench.overwatch import initialize_overwatch
35
  from serve import INTERACTION_MODES_MAP, MODEL_ID_TO_NAME
36
 
37
  GB = 1 << 30
 
1
  """
2
  interactive_demo.py
3
 
4
+ Entry point for all VLM-Evaluation interactive demos; specify model and get a gradio UI where you can chat with it!
5
 
6
  This file is heavily adapted from the script used to serve models in the LLaVa repo:
7
  https://github.com/haotian-liu/LLaVA/blob/main/llava/serve/model_worker.py. It is
 
30
  from llava.utils import server_error_msg
31
  from torchvision.transforms import Compose
32
 
33
+ from vlm_eval.models import load_vlm
34
+ from vlm_eval.overwatch import initialize_overwatch
35
  from serve import INTERACTION_MODES_MAP, MODEL_ID_TO_NAME
36
 
37
  GB = 1 << 30
serve/gradio_web_server.py CHANGED
@@ -1,7 +1,7 @@
1
  """
2
  gradio_web_server.py
3
 
4
- Entry point for all VLM-Bench interactive demos; specify model and get a gradio UI where you can chat with it!
5
 
6
  This file is copied from the script used to define the gradio web server in the LLaVa codebase:
7
  https://github.com/haotian-liu/LLaVA/blob/main/llava/serve/gradio_web_server.py with only very minor
@@ -244,9 +244,9 @@ def http_bot(state, model_selector, interaction_mode, temperature, max_new_token
244
 
245
  title_markdown = """
246
  # Prismatic VLMs: Investigating the Design Space of Visually-Conditioned Language Models
247
- [[Project Page](TODO)] [[Code](TODO)]
248
- [[Models](TODO)]
249
- | 📚 [[Paper](TODO)]
250
  """
251
 
252
  tos_markdown = """
@@ -254,8 +254,7 @@ tos_markdown = """
254
  By using this service, users are required to agree to the following terms:
255
  The service is a research preview intended for non-commercial use only. It only provides limited safety measures and may
256
  generate offensive content. It must not be used for any illegal, harmful, violent, racist, or sexual purposes. The
257
- service may collect user dialogue data for future research. Please click the "Flag" button if you get any
258
- inappropriate answer! We will collect those to keep improving our moderator. For an optimal experience,
259
  please use desktop computers for this demo, as mobile devices may compromise its quality. This website
260
  is heavily inspired by the website released by [LLaVA](https://github.com/haotian-liu/LLaVA).
261
  """
 
1
  """
2
  gradio_web_server.py
3
 
4
+ Entry point for all VLM-Evaluation interactive demos; specify model and get a gradio UI where you can chat with it!
5
 
6
  This file is copied from the script used to define the gradio web server in the LLaVa codebase:
7
  https://github.com/haotian-liu/LLaVA/blob/main/llava/serve/gradio_web_server.py with only very minor
 
244
 
245
  title_markdown = """
246
  # Prismatic VLMs: Investigating the Design Space of Visually-Conditioned Language Models
247
+ [[Training Code](https://github.com/TRI-ML/prismatic-vlms)]
248
+ [[Evaluation Code](https://github.com/TRI-ML/vlm-evaluation)]
249
+ | 📚 [[Paper](https://arxiv.org/abs/2402.07865)]
250
  """
251
 
252
  tos_markdown = """
 
254
  By using this service, users are required to agree to the following terms:
255
  The service is a research preview intended for non-commercial use only. It only provides limited safety measures and may
256
  generate offensive content. It must not be used for any illegal, harmful, violent, racist, or sexual purposes. The
257
+ service may collect user dialogue data for future research. For an optimal experience,
 
258
  please use desktop computers for this demo, as mobile devices may compromise its quality. This website
259
  is heavily inspired by the website released by [LLaVA](https://github.com/haotian-liu/LLaVA).
260
  """