# GPQA Generator: Fine-tuned Gemma 2 2B for the GPQA (Graduate-Level Google-Proof Q&A Benchmark) dataset

## Model Details

- **Model Type:** Language Model
- **Base Model:** unsloth/gemma-2-2b-bnb-4bit
- **Fine-tuned by:** [Your Organization/Name]
- **License:** [Specify the license]

This model is a fine-tuned version of the Gemma 2 2B base model, tailored to generating questions in the style of the GPQA (Graduate-Level Google-Proof Q&A Benchmark) dataset. For each item it produces a graduate-level, context-rich multiple-choice question together with one correct answer, three incorrect answers, and an explanation.

## Intended Use

This model is designed for educational content creators, assessment developers, and researchers who need to generate complex, Google-proof multiple-choice questions across various academic disciplines.

### Primary Use Cases:

- Generating challenging assessment questions for advanced students
- Creating content for educational platforms and applications
- Assisting in the development of standardized tests
- Supporting research in question generation and educational assessment

## How to Use

### Setting Up

1. Clone the repository containing the model and scripts.
2. Install the required dependencies (`httpx`, `transformers`, etc.).

### Running the Model

1. Start the vLLM server:
   ```bash
   ./run_vllm_2b.sh
   ```

2. Generate questions using the `generate.py` script:

   For a single category:
   ```bash
   python generate.py --category "Your Category" --depth 4
   ```

   To use predefined categories:
   ```bash
   python generate.py --use-array --depth 4
   ```
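
To check that the server is reachable before running `generate.py`, a request can be sent to it directly. The following is a minimal sketch, not the repository's own client code. It assumes the server exposes vLLM's OpenAI-compatible `/v1/completions` route on `localhost:8000` (vLLM's default port). The model name and prompt template are hypothetical placeholders, so replace them with whatever `run_vllm_2b.sh` and the fine-tuning setup actually use. It relies on the standard library rather than `httpx` to stay dependency-free.

```python
import json
import urllib.request  # stdlib; the repository's own scripts may use httpx instead

# Assumed values: adjust to match whatever run_vllm_2b.sh actually configures.
VLLM_URL = "http://localhost:8000/v1/completions"  # OpenAI-compatible route
MODEL_NAME = "gpqa-generator"  # hypothetical served model name


def build_request(category: str, max_tokens: int = 512) -> dict:
    """Build a completion payload asking for one question in `category`."""
    # Hypothetical prompt; the fine-tuned model's real template may differ.
    prompt = (
        f"Category: {category}\n"
        "Generate a graduate-level multiple-choice question as JSON:\n"
    )
    return {
        "model": MODEL_NAME,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": 0.7,
    }


def generate_question(category: str) -> str:
    """POST the payload to the server and return the raw completion text."""
    body = json.dumps(build_request(category)).encode("utf-8")
    req = urllib.request.Request(
        VLLM_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req, timeout=60) as resp:
        return json.load(resp)["choices"][0]["text"]
```

With the server running, `print(generate_question("Mercurial"))` should print the raw completion text, which can then be parsed as JSON.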

### Configuration

- Modify the `CATEGORIES_TO_PROCESS` list in the script to add or change predefined categories.
- Adjust the `max_depth` parameter to control how many levels of subcategories are explored.
- The script processes categories with multiple threads. Adjust `num_threads` in `process_categories()` to change the number of workers.
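
The interaction between these settings can be sketched as follows. This is an illustration only, not the code in `generate.py`: the function bodies, the `explore` helper, the `get_subcategories` callback, and the example category names are all assumptions. It shows one plausible arrangement, where each top-level category is handled by a worker thread and recursion stops once `max_depth` is reached.

```python
from concurrent.futures import ThreadPoolExecutor

# Stand-in for the list of the same name in generate.py; entries are examples only.
CATEGORIES_TO_PROCESS = ["Mercurial", "Organic Chemistry", "Quantum Mechanics"]


def explore(category, max_depth, get_subcategories, depth=0):
    """Return (category, depth) pairs for `category` and its subcategories.

    Recursion stops once `depth` reaches `max_depth` (hypothetical helper).
    """
    visited = [(category, depth)]
    if depth < max_depth:
        for sub in get_subcategories(category):
            visited.extend(explore(sub, max_depth, get_subcategories, depth + 1))
    return visited


def process_categories(categories, max_depth, get_subcategories, num_threads=4):
    """Explore each top-level category in a pool of `num_threads` workers."""
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        # pool.map preserves input order, so results line up with `categories`.
        results = pool.map(
            lambda c: explore(c, max_depth, get_subcategories), categories
        )
        return dict(zip(categories, results))
```

In this sketch `get_subcategories` would be whatever function asks the model for subtopics of a category. Because the work is dominated by waiting on HTTP responses from the server, threads are a reasonable fit despite Python's global interpreter lock.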

## Sample Output

Here's an example of the generated output:

```json
{
  "question": "A developer is working on a large project that uses Mercurial version control. They need to merge a branch containing bug fixes from another team. What is the recommended approach to avoid merging conflicts?",
  "answer": "Create a new branch from the source directory.",
  "incorrect_answer_1": "Merge the branches directly.",
  "incorrect_answer_2": "Skip the merge process altogether.",
  "incorrect_answer_3": "Use a third-party tool like Git.",
  "explanation": "Merging branches in Mercurial requires careful consideration to avoid conflicts. Here's a breakdown of the reasoning: \n1. Branch Creation: Creating a new branch allows the developer to isolate the changes from the other team without affecting the base branch.\n2. Conflict Detection: Comparing the histories of the branches helps identify potential conflicts that may arise during the merge.\n3. Conflict Resolution: Manual conflict resolution is essential to ensure the merge is successful. Mercurial provides tools like \"diff\" and \"merge\" commands for this purpose.\n4. Committing Changes: Once the merge is complete, the developer should commit the changes to their new branch.",
  "subcategories": ["Version Control", "Mercurial", "Merge Conflicts"],
  "category": "Mercurial",
  "depth": 0
}
```
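
Records in this shape can be checked and turned into a presentable multiple-choice item with a few lines of code. The sketch below is a hypothetical post-processing step, not part of the repository. The key names come from the sample above, while `validate_record` and `to_multiple_choice` are illustrative helpers. It verifies that the required fields are present and that the four options are distinct, then shuffles them so the correct answer is not always listed first.

```python
import random

# Keys taken from the sample record above.
REQUIRED_KEYS = {
    "question", "answer", "incorrect_answer_1",
    "incorrect_answer_2", "incorrect_answer_3", "explanation",
}


def _options(record: dict) -> list:
    """Collect the correct answer followed by the three incorrect ones."""
    return [record["answer"]] + [record[f"incorrect_answer_{i}"] for i in (1, 2, 3)]


def validate_record(record: dict) -> list:
    """Return a list of problems found in one generated record (empty if none)."""
    problems = [f"missing key: {k}" for k in sorted(REQUIRED_KEYS - set(record))]
    if problems:
        return problems
    options = _options(record)
    if any(not isinstance(o, str) or not o.strip() for o in options):
        problems.append("empty or non-string answer option")
    elif len({o.strip().lower() for o in options}) != 4:
        problems.append("answer options are not distinct")
    return problems


def to_multiple_choice(record: dict, seed=None) -> dict:
    """Shuffle the four options and report the letter of the correct one."""
    options = _options(record)
    random.Random(seed).shuffle(options)
    letters = "ABCD"
    return {
        "question": record["question"],
        "options": dict(zip(letters, options)),
        "correct": letters[options.index(record["answer"])],
    }
```

A structural check like this only catches malformed output; it says nothing about whether the marked answer is actually correct, which still needs expert review.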

## Limitations

- The model generates questions based on its training data, which may not always reflect the most current information in rapidly evolving fields.
- Although the questions are designed to be "Google-proof," how well they resist lookup varies with the topic and with how the relevant information is presented online.
- Subject matter experts should review generated questions for quality and accuracy before they are used in formal assessments.

## Ethical Considerations

- Users should be aware of potential biases in the generated content and review questions for fairness and inclusivity.
- The model should not be used to generate misleading or factually incorrect information.
- Respect copyright and intellectual property rights when using generated content.

## Citation

If you use this model in your research or applications, please cite it as follows:

```
[Citation information to be added]
```

## Contact

For questions, feedback, or support, please contact [Your Contact Information].