Commit a743d61 (1 parent: 8a0053d)
update submit

Files changed:
- app.py (+42, -2)
- src/about.py (+62, -43)
app.py CHANGED

@@ -437,8 +437,48 @@ with demo:
             gr.HTML(TABLE_TEXT)
             gr.Markdown(LLM_BENCHMARKS_TEXT2, elem_classes="markdown-text")
         with gr.TabItem("📤 Submit here!", elem_id="submit-model-tab", id=2):
-
-
+            with gr.Column():
+                with gr.Row():
+                    gr.Markdown(EVALUATION_QUEUE_TEXT, elem_classes="markdown-text")
+
+                with gr.Column():
+                    with gr.Accordion(
+                        f"✅ Finished Evaluations ({len(finished_eval_queue_df)})",
+                        open=False,
+                    ):
+                        with gr.Row():
+                            finished_eval_table = gr.components.Dataframe(
+                                value=finished_eval_queue_df,
+                                headers=EVAL_COLS,
+                                datatype=EVAL_TYPES,
+                                row_count=5,
+                            )
+                    with gr.Accordion(
+                        f"🔄 Running Evaluation Queue ({len(running_eval_queue_df)})",
+                        open=False,
+                    ):
+                        with gr.Row():
+                            running_eval_table = gr.components.Dataframe(
+                                value=running_eval_queue_df,
+                                headers=EVAL_COLS,
+                                datatype=EVAL_TYPES,
+                                row_count=5,
+                            )
+
+                    with gr.Accordion(
+                        f"⏳ Pending Evaluation Queue ({len(pending_eval_queue_df)})",
+                        open=False,
+                    ):
+                        with gr.Row():
+                            pending_eval_table = gr.components.Dataframe(
+                                value=pending_eval_queue_df,
+                                headers=EVAL_COLS,
+                                datatype=EVAL_TYPES,
+                                row_count=5,
+                            )
+                with gr.Row():
+                    # 1. Submit your modelinfos here!
+                    gr.Markdown("✨ Submit your modelinfos here!")
             with gr.Row():
                 model_name = gr.Textbox(label="Model Name")
                 revision_commit = gr.Textbox(label="Revision commit")
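Note on the hunk above: the three queue accordions read from `finished_eval_queue_df`, `running_eval_queue_df` and `pending_eval_queue_df`, which are built elsewhere in the Space, and each accordion uses `open=False` so the Submit tab stays compact even with long queues. Below is a minimal standalone sketch of the same Accordion + Dataframe pattern; the column names, datatypes and queue rows are placeholders, since the real `EVAL_COLS`, `EVAL_TYPES` and queue contents are not part of this hunk.

```python
# Illustrative sketch only; columns, datatypes and rows are placeholders.
import gradio as gr
import pandas as pd

EVAL_COLS = ["model", "revision", "status"]   # assumed column set
EVAL_TYPES = ["str", "str", "str"]            # assumed Gradio datatypes

pending_eval_queue_df = pd.DataFrame(
    [["org/some-vlm", "main", "PENDING"]], columns=EVAL_COLS
)

with gr.Blocks() as demo:
    with gr.Accordion(
        f"⏳ Pending Evaluation Queue ({len(pending_eval_queue_df)})",
        open=False,
    ):
        with gr.Row():
            pending_eval_table = gr.components.Dataframe(
                value=pending_eval_queue_df,
                headers=EVAL_COLS,
                datatype=EVAL_TYPES,
                row_count=5,
            )

if __name__ == "__main__":
    demo.launch()
```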
src/about.py CHANGED

@@ -360,38 +360,47 @@ Planning
 """
 
 EVALUATION_QUEUE_TEXT = """
-
-Models added here will be automatically evaluated on the FlagEval cluster.
+Submit here! feature deployment content
 
-
-1. If you choose to evaluate via API call, you need to provide the Model interface, name and corresponding API key.
-2. If you choose to do open source model evaluation directly through huggingface, you don't need to fill in the Model online api url and Model online api key.
+Evaluation Queue for the FlagEval VLM Leaderboard
 
-
+Models added here will be automatically evaluated on the FlagEval cluster.
 
-
-1. model_name: Name of the model to use
-2. api_key: API access key
-3. api_base: Base URL for the API service
+Currently, we offer two methods for model evaluation, including API calls and private deployments:
 
-
+1. If you choose to evaluate via API call, you need to provide the Model interface, name and corresponding API key.
+2. If you choose to do open source model evaluation directly through huggingface, you don't need to fill in the Model online api url and Model online api key.
 
-
+## Open API model Integration Documentation
 
-
+For models accessed via API calls (such as OpenAI API, Anthropic API, etc.), the integration process is straightforward and only requires providing necessary configuration information.
 
-
-
-
-3. Set up the initialization and inference pipeline
+1. `model_name`: Name of the model to use
+2. `api_key`: API access key
+3. `api_base`: Base URL for the API service
 
-
+---
+
+## Adding a Custom Model to the Platform
+
+This guide explains how to integrate your custom model into the platform by implementing a model adapter and `run.sh` script. We'll use the Qwen-VL implementation as a reference example.
+
+### Overview
+
+To add your custom model, you need to:
+
+1. Create a custom dataset class
+2. Implement a model adapter class
+3. Set up the initialization and inference pipeline
+
+### Step-by-Step Implementation
 
-Here is an example:[model_adapter.py](https://github.com/flageval-baai/FlagEvalMM/blob/main/model_zoo/vlm/qwen_vl/model_adapter.py)
+Here is an example: [Qwen-VL model_adapter.py](https://github.com/flageval-baai/FlagEvalMM/blob/main/model_zoo/vlm/qwen_vl/model_adapter.py)
 
-#### 1. Create Preprocess Custom Dataset Class
+#### 1. Create Preprocess Custom Dataset Class
+
+Inherit from `ServerDataset` to handle data loading:
 
-Inherit from `ServerDataset` to handle data loading:
 ```python
 # model_adapter.py
 class CustomDataset(ServerDataset):
@@ -411,8 +420,9 @@ class CustomDataset(ServerDataset):
         return question_id, img_path_idx, qs
 ```
 
-The function `get_data` returns a structure like this:
-
+The function `get_data` returns a structure like this:
+
+```json
 {
     "img_path": "A list where each element is an absolute path to an image that can be read directly using PIL, cv2, etc.",
     "question": "A string containing the question, where image positions are marked with <image1> <image2>",
@@ -421,11 +431,15 @@ The function get_data returns a structure like this:
 }
 ```
 
-
-
-
-
-
+---
+
+#### 2. Implement Model Adapter
+
+Inherit from `BaseModelAdapter` and implement the required methods:
+
+- `model_init`: is responsible for model initialization and serves as the entry point for model loading and setup.
+- `run_one_task`: implements the inference pipeline, handling data processing and result generation for a single evaluation task.
+
 ```python
 # model_adapter.py
 class ModelAdapter(BaseModelAdapter):
@@ -461,30 +475,35 @@ class ModelAdapter(BaseModelAdapter):
 Use the provided meta_info and rank parameters to manage result storage as needed.
 '''
 ```
-
-
-
-
-
-
+
+**Note:**
+
+`results` is a list of dictionaries.
+Each dictionary must contain two keys:
+
+```json
+{
+    "question_id": "identifies the specific question",
+    "answer": "contains the model's prediction/output"
+}
 ```
-After collecting all results, they are saved using `save_result()`
 
-
-run.sh is the entry script for launching model evaluation, used to set environment variables and start the evaluation program.
+After collecting all results, they are saved using `save_result()`.
 
-
+---
+
+#### 3. Launch Script (`run.sh`)
+
+`run.sh` is the entry script for launching model evaluation, used to set environment variables and start the evaluation program.
+
+```bash
 #!/bin/bash
 current_file="$0"
 current_dir="$(dirname "$current_file")"
 SERVER_IP=$1
 SERVER_PORT=$2
-PYTHONPATH=$current_dir:$PYTHONPATH python $current_dir/model_adapter.py \
-    --server_ip $SERVER_IP \
-    --server_port $SERVER_PORT \
-    "${@:3}"
+PYTHONPATH=$current_dir:$PYTHONPATH python $current_dir/model_adapter.py --server_ip $SERVER_IP --server_port $SERVER_PORT "${@:3}"
 ```
-
 """
 
 CITATION_BUTTON_LABEL = "Copy the following snippet to cite these results"
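The new "Open API model Integration Documentation" section asks submitters for only three values: `model_name`, `api_key` and `api_base`. As a rough sketch of how those values typically map onto an OpenAI-compatible endpoint (illustrative only; the evaluation cluster's actual client code is not shown in this commit, and the values below are placeholders):

```python
# Sketch: how model_name / api_key / api_base map onto an OpenAI-compatible client.
from openai import OpenAI

model_name = "my-vlm"                       # placeholder for the submitted model name
api_key = "sk-..."                          # placeholder API access key
api_base = "https://api.example.com/v1"     # placeholder base URL for the API service

client = OpenAI(api_key=api_key, base_url=api_base)
response = client.chat.completions.create(
    model=model_name,
    messages=[{"role": "user", "content": "Describe the image <image1>."}],
)
print(response.choices[0].message.content)
```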
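The "Adding a Custom Model to the Platform" section names the pieces a private deployment must provide: a `CustomDataset` built on `ServerDataset`, a `ModelAdapter` with `model_init` and `run_one_task`, a `results` list of `{"question_id", "answer"}` entries, and a final `save_result()` call. A skeleton tying those names together (import paths, constructor arguments and the `save_result` signature are assumptions; the linked Qwen-VL `model_adapter.py` is the authoritative reference):

```python
# Illustrative skeleton only; it reuses the names from the submission guide.
# Import paths and the calls marked "assumed" are not confirmed by this commit.
from flagevalmm.server import ServerDataset       # assumed import path
from flagevalmm.models import BaseModelAdapter    # assumed import path


class CustomDataset(ServerDataset):
    def __getitem__(self, index):
        # get_data returns a dict with "img_path", "question", "question_id", ...
        data = self.get_data(index)
        question_id = data["question_id"]
        img_path_idx = data["img_path"]
        qs = data["question"]
        return question_id, img_path_idx, qs


class ModelAdapter(BaseModelAdapter):
    def model_init(self, task_info):
        # Entry point for model loading and setup (argument name assumed).
        self.model = None  # replace with real model/processor loading

    def run_one_task(self, task_name, meta_info, rank=0):
        # Inference pipeline for a single evaluation task.
        results = []
        dataset = CustomDataset(task_name)  # constructor arguments assumed
        for question_id, img_path_idx, qs in dataset:
            answer = ""  # run inference on (img_path_idx, qs) here
            # Each entry must contain exactly the two keys required by the guide.
            results.append({"question_id": question_id, "answer": answer})
        # Persist everything once the task is done; signature assumed.
        self.save_result(results, meta_info, rank=rank)
```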