# Quizbowl Agent Web Interface Walkthrough

This walkthrough shows how to create, test, and submit a quizbowl agent for both tossup and bonus questions using our web interface.

## Overview

Our web interface allows you to:
- Create and import pipeline workflows
- Configure tossup and bonus agents
- Test agents on sample questions
- Visualize agent performance
- Export your pipeline configurations to YAML files
- Submit your agents to our competition for full evaluation

## Creating a Tossup Agent

1. Navigate to the "Tossup Agents" tab at the top of the interface.

2. **Creating a Pipeline**:
   - Note the input variable `question_text` and the output variables `answer` and `confidence`, which are required for tossup agents.
   - The default setup includes a single agent step labeled "A: Tossup Agent".
   - You can add more steps using the "+ Add Step" button for multi-step pipelines.

3. **Configuring Your Agent**:
   - Select your preferred model from the dropdown (e.g., `OpenAI/gpt-4o-mini`).
   - Adjust the temperature slider: higher values produce more varied output, lower values make it more deterministic.
   - Click on the "System Prompt" tab to customize your agent's instructions.
   - Your system prompt is crucial: it tells the LLM how to interpret questions and format answers.
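
To see why a lower temperature makes output more deterministic, here is a minimal sketch of temperature-scaled softmax, the general mechanism behind the `temperature` setting in most LLM APIs. The scores are made up for illustration, and nothing here is specific to this interface or to any particular model.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities, sharpened or flattened by temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Made-up scores for three candidate answers (illustrative only).
logits = [2.0, 1.0, 0.5]
low = softmax_with_temperature(logits, 0.2)   # sharply peaked on the top candidate
high = softmax_with_temperature(logits, 2.0)  # probability spread across candidates
```

At the low temperature almost all probability mass sits on the top-scoring candidate, so repeated runs tend to agree; at the high temperature the alternatives are sampled more often.
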

4. **Managing Input and Output Variables**:

   ![Input Variable Management](../imgs/walkthrough/input-vars.png)

   - Click the "Inputs" tab to view and modify input variables. Each input has three fields:
     - "Variable Used": how the input is referenced in the pipeline
     - "Input Name": what the model sees
     - "Description": explains the input's purpose
   - Use the "+" button to add a new input variable.
   - Use the "×" button to remove an input variable.

   ![Output Variable Management](../imgs/walkthrough/output-vars.png)

   - Click the "Outputs" tab to manage output variables. Each output has three fields:
     - "Output Field": the name of the output
     - "Type": a dropdown (str, float, etc.) that defines the data type
     - "Description": explains the output's purpose
   - Use the arrow buttons to change the output order.
   - Use the "+" button to add a new output.
   - Use the "×" button to remove an output.

   - When adding a new output, consider the following types:
     - `str`: For text outputs like answers
     - `float`: For numerical outputs like confidence scores (0.0-1.0)
     - `bool`: For true/false outputs
     - `list`: For array outputs like multiple answer candidates
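
The output types above can be pictured as simple coercions applied to whatever the model returns. The following is a hedged sketch of that idea only: `coerce_output` and `validate_tossup_output` are hypothetical helpers written for this guide, not functions provided by the interface, which handles typing on its own.

```python
def coerce_output(value, type_name):
    """Coerce a raw model output to one of the four types listed above."""
    if type_name == "str":
        return str(value).strip()
    if type_name == "float":
        return float(value)
    if type_name == "bool":
        if isinstance(value, bool):
            return value
        text = str(value).strip().lower()
        if text in ("true", "yes", "1"):
            return True
        if text in ("false", "no", "0"):
            return False
        raise ValueError(f"cannot read {value!r} as bool")
    if type_name == "list":
        if isinstance(value, (list, tuple)):
            return list(value)
        # Assumption: a plain string holds comma-separated candidates.
        return [item.strip() for item in str(value).split(",") if item.strip()]
    raise ValueError(f"unknown type: {type_name}")

def validate_tossup_output(raw):
    """Check the two outputs a tossup agent must produce."""
    answer = coerce_output(raw["answer"], "str")
    confidence = coerce_output(raw["confidence"], "float")
    if not 0.0 <= confidence <= 1.0:
        raise ValueError(f"confidence {confidence} is outside 0.0-1.0")
    return {"answer": answer, "confidence": confidence}

# Hypothetical raw output from a model.
checked = validate_tossup_output({"answer": " Paris ", "confidence": "0.9"})
```
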

5. **Buzzer Settings**:

   ![Buzzer Settings](../imgs/walkthrough/buzzer-settings.png)

   - Scroll down to the "Buzzer settings" section.
   - Set your confidence threshold (e.g., 0.85). Your agent will only buzz when its confidence exceeds this value.
   - Choose a method (AND/OR) if combining multiple conditions.
   - Adjust probability settings if needed.
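
The buzzer logic described above can be sketched as a small decision function. This is an illustration under stated assumptions, not the interface's actual implementation: the function name, the `probability` and `prob_threshold` parameters, and the way the second condition is evaluated are all hypothetical.

```python
def should_buzz(confidence, threshold=0.85, probability=None,
                prob_threshold=None, method="AND"):
    """Decide whether to buzz from one or two threshold conditions."""
    # The agent buzzes only when confidence strictly exceeds the threshold.
    conditions = [confidence > threshold]
    # Hypothetical second condition, used only when both values are supplied.
    if probability is not None and prob_threshold is not None:
        conditions.append(probability > prob_threshold)
    method = method.upper()
    if method == "AND":
        return all(conditions)
    if method == "OR":
        return any(conditions)
    raise ValueError(f"unknown method: {method}")
```

With a single condition, AND and OR behave identically; the choice only matters once a second condition is configured.
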

6. **Testing Your Agent**:
   - Enter a Question ID or use the provided sample question.
   - Check "Early Stop" if you want to stop processing once the agent buzzes.
   - Click "Run on Tossup Question" to test your agent.
   - Review the answer and click "Buzz Confidence" to see confidence metrics.

7. **Understanding Tossup Results Visualization**:

   ![Tossup Results Visualization](../imgs/tossup-viz.png)

   After running a tossup question, you'll see the results displayed in several ways:
   - **Highlighted Question Text**:
     - Highlighted terms throughout the question mark the positions where the agent was evaluated.
     - Click any highlighted word to see an answer popup with the confidence at that point.
     - The buzz point (e.g., "Eckstein's") appears in green if the agent was correct and in red if it was not.
   - **Answer Popup**:
     - Displays the final answer, confidence score, and correctness indicator
     - Appears when hovering over the buzz point or a highlighted term

   - **Buzz Confidence Graph**:
     - X-axis: token position; Y-axis: confidence (0.0-1.0)
     - Blue line shows confidence progression
     - Amber vertical line marks the buzz point
     - Dashed horizontal line shows the confidence threshold

   This visualization helps evaluate:
   - Which clues were most informative
   - How well the confidence threshold is calibrated
   - Whether the agent should be more aggressive or more conservative
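
The relationship between the confidence curve, the threshold line, and the buzz point can be sketched in a few lines. The trace below is invented data, and `find_buzz` is a hypothetical helper; it assumes the agent buzzes at the first evaluated position where confidence exceeds the threshold, which mirrors the "Early Stop" behavior described in the testing step.

```python
def find_buzz(trace, threshold):
    """Return (token_position, answer) for the first point above the threshold.

    trace: list of (token_position, answer, confidence) tuples in question order.
    Returns None when the agent never buzzes.
    """
    for position, answer, confidence in trace:
        if confidence > threshold:
            return position, answer
    return None

# Invented confidence trace for one tossup question.
trace = [
    (10, "Rome", 0.20),
    (25, "Athens", 0.55),
    (40, "Athens", 0.90),
    (60, "Athens", 0.97),
]

conservative = find_buzz(trace, 0.85)  # buzzes later, after more clues
aggressive = find_buzz(trace, 0.50)    # buzzes earlier, on less evidence
never = find_buzz(trace, 0.99)         # threshold too high to ever buzz
```
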

8. **Exporting Your Pipeline**:
   - Once you're satisfied with your agent, click "Export Pipeline" to save your configuration.
   - Click the "Pipeline Preview" dropdown to see the YAML configuration.
   - You can download this configuration for future use or submission.

## Creating a Bonus Agent

1. Navigate to the "Bonus Round Agents" tab at the top of the interface.

2. **Creating a Pipeline**:
   - Note the input variables `leadin` and `part` and the output variables `answer`, `confidence`, and `explanation`, which are required for bonus agents.
   - The default setup includes a single agent step labeled "A: Bonus Agent".

3. **Configuring Your Agent**:
   - Select your preferred model from the dropdown.
   - Adjust the temperature slider.
   - Customize the system prompt to provide instructions for handling bonus questions.
   - Include clear formatting expectations for answer, confidence, and explanation.
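
As one illustration of "clear formatting expectations", the sketch below pairs an example system prompt with a parser for the three bonus outputs. Both the prompt wording and the `parse_bonus_response` helper are hypothetical; the interface may structure and parse outputs differently, so treat this as a way to reason about the format rather than as required code.

```python
import json

# Hypothetical prompt text; adapt the wording to your own agent.
BONUS_SYSTEM_PROMPT = (
    "You are answering one part of a quizbowl bonus question. "
    "You will receive a leadin and a part. "
    "Respond with a JSON object containing exactly three keys: "
    '"answer" (a short string), "confidence" (a number from 0.0 to 1.0), '
    'and "explanation" (one or two sentences justifying the answer).'
)

REQUIRED_KEYS = {"answer", "confidence", "explanation"}

def parse_bonus_response(text):
    """Parse a JSON reply and check that all three bonus outputs are present."""
    data = json.loads(text)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    missing = REQUIRED_KEYS - set(data)
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    confidence = float(data["confidence"])
    if not 0.0 <= confidence <= 1.0:
        raise ValueError(f"confidence {confidence} is outside 0.0-1.0")
    return {
        "answer": str(data["answer"]).strip(),
        "confidence": confidence,
        "explanation": str(data["explanation"]).strip(),
    }

# Invented reply, as a model following the prompt might produce it.
reply = '{"answer": "Marie Curie", "confidence": 0.8, "explanation": "Two Nobel Prizes."}'
parsed = parse_bonus_response(reply)
```
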

4. **Testing Your Agent**:
   - Enter a bonus question with leadin and part text.
   - Click the appropriate run button to test your agent.
   - Review the agent's answer, confidence, and explanation.

5. **Exporting Your Pipeline**:
   - Click "Export Pipeline" to save your configuration.

## Advanced Features

### Multi-step Pipelines

1. **Adding Steps**:
   - Click "+ Add Step" to add additional processing steps.
   - Each step can use different models or system prompts.
   - Use steps for different tasks like analysis, answer generation, and confidence calculation.

2. **Variable Mapping**:
   - Connect outputs from earlier steps to inputs for later steps.
   - The final output variables must map to your defined outputs (`answer`, `confidence`, etc.).
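
Variable mapping in a multi-step pipeline can be sketched with plain Python functions standing in for LLM steps. Everything below is an assumption made for illustration: the `run_pipeline` helper, the `A.answer` naming style for step outputs, and the two toy steps are not taken from the interface or its YAML format.

```python
def run_pipeline(steps, inputs, output_map):
    """Run steps in order, wiring each step's inputs from earlier variables.

    steps: list of (step_id, func, input_map), where input_map maps the
           function's parameter names to variable names.
    output_map: maps final output names to the variables that supply them.
    """
    variables = dict(inputs)
    for step_id, func, input_map in steps:
        kwargs = {param: variables[source] for param, source in input_map.items()}
        for name, value in func(**kwargs).items():
            variables[f"{step_id}.{name}"] = value  # e.g. "A.answer"
    return {out: variables[source] for out, source in output_map.items()}

def answer_step(question):
    # Stand-in for an LLM call that proposes an answer.
    found = "chlorophyll" in question.lower()
    return {"answer": "Photosynthesis" if found else "unknown"}

def confidence_step(question, candidate):
    # Stand-in for a second LLM call that scores the candidate.
    return {"confidence": 0.9 if candidate != "unknown" else 0.1}

steps = [
    ("A", answer_step, {"question": "question_text"}),
    ("B", confidence_step, {"question": "question_text", "candidate": "A.answer"}),
]
result = run_pipeline(
    steps,
    {"question_text": "This process uses chlorophyll to capture light."},
    {"answer": "A.answer", "confidence": "B.confidence"},
)
```

Step B reads the output of step A, and the final mapping pulls `answer` from one step and `confidence` from the other, which is the pattern behind splitting answer generation and confidence calculation.
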

### Importing Existing Pipelines

1. Click the "Select Pipeline to Import..." dropdown.
2. Choose an existing pipeline to load its configuration.
3. Click "Import Pipeline" to load it into the interface.
4. Modify as needed for your specific use case.

For a detailed example of importing and modifying a sophisticated multi-step pipeline, see our [Advanced Pipeline Examples](./advanced-pipeline-examples.md) guide, which walks through enhancing the two-step justified confidence model.

## Submitting Your Agent

1. **Evaluate Your Agent**:
   - Before submission, click "Evaluate" to run a thorough assessment.
   - This helps identify potential issues before formal submission.

2. **Model Submission**:
   - Fill in the "Model Name" and "Description" fields with appropriate information.
   - Click "Sign in with Hugging Face" to authenticate.
   - Click "Submit" to submit your agent for official evaluation.

## Best Practices

1. **System Prompts**:
   - Be specific about output formats in your system prompts.
   - For tossups, instruct the model to provide confidence scores.
   - For bonuses, instruct the model to explain its reasoning.

2. **Confidence Calibration**:
   - Fine-tune your buzzer threshold based on testing.
   - Too high: the agent might miss answerable questions.
   - Too low: the agent might buzz incorrectly.
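
The trade-off between a threshold that is too high and one that is too low can be checked with a simple sweep over test results. The sketch below is a simplification: it assumes one final confidence value per test question rather than a full per-token trace, and both the data and the `sweep_thresholds` helper are invented for illustration.

```python
def sweep_thresholds(results, thresholds):
    """Count buzzes, correct buzzes, and wrong buzzes at each threshold.

    results: list of (confidence, is_correct) pairs, one per test question.
    """
    rows = []
    for threshold in thresholds:
        buzzed = [ok for confidence, ok in results if confidence > threshold]
        correct = sum(buzzed)
        rows.append({
            "threshold": threshold,
            "buzzes": len(buzzed),
            "correct": correct,
            "wrong": len(buzzed) - correct,
        })
    return rows

# Invented test results: (agent confidence, whether the answer was right).
results = [(0.95, True), (0.90, True), (0.80, False), (0.70, True), (0.60, False)]
rows = sweep_thresholds(results, [0.5, 0.75, 0.92])
```

In this toy data, the lowest threshold buzzes on every question and gets two wrong, while the highest avoids all wrong buzzes but skips two questions the agent would have answered correctly.
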

3. **Testing Thoroughly**:
   - Test your agent on various question types and difficulties.
   - Check performance on both early and late clues for tossups.

Good luck with your quizbowl agent submission!