# GPQA Generator: Fine-tuned Gemma 2 2B for the GPQA (Graduate-Level Google-Proof Q&A Benchmark) Dataset
## Model Details
- **Model Type:** Language Model
- **Base Model:** unsloth/gemma-2-2b-bnb-4bit
- **Fine-tuned by:** [Your Organization/Name]
- **License:** [Specify the license]
This model is a fine-tuned version of the Gemma 2 2B base model (`unsloth/gemma-2-2b-bnb-4bit`), tailored to the GPQA (Graduate-Level Google-Proof Q&A Benchmark) dataset. It generates graduate-level, context-rich multiple-choice questions, each with one correct answer, three incorrect answers, and an explanation.
## Intended Use
This model is designed for educational content creators, assessment developers, and researchers who need to generate complex, Google-proof multiple-choice questions across various academic disciplines.
### Primary Use Cases:
- Generating challenging assessment questions for advanced students
- Creating content for educational platforms and applications
- Assisting in the development of standardized tests
- Supporting research in question generation and educational assessment
## How to Use
### Setting Up
1. Clone the repository containing the model and scripts.
2. Ensure the required dependencies are installed (`httpx`, `transformers`, etc.).
### Running the Model
1. Start the vLLM server:
```bash
./run_vllm_2b.sh
```
2. Generate questions using the `generate.py` script.
For a single category:
```bash
python generate.py --category "Your Category" --depth 4
```
To use the predefined categories:
```bash
python generate.py --use-array --depth 4
```
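Once the server is running, it can also be queried directly instead of through `generate.py`. The following is a minimal sketch under several assumptions that are not confirmed by this repository: that `run_vllm_2b.sh` starts vLLM's OpenAI-compatible server on its default address (`http://localhost:8000`), that the model is served under the hypothetical name `gpqa-generator`, and that a plain-text prompt like the one below is acceptable. The actual prompt template lives in `generate.py` and may differ.

```python
# Assumptions (not taken from the repository): server address, served model
# name, and prompt wording. Adjust them to match run_vllm_2b.sh and generate.py.
VLLM_URL = "http://localhost:8000/v1/completions"
MODEL_NAME = "gpqa-generator"  # hypothetical served model name


def build_request(category: str, max_tokens: int = 1024) -> dict:
    """Build an OpenAI-style completion payload for one category."""
    # Placeholder prompt; the real template is defined in generate.py.
    prompt = f"Generate a graduate-level multiple-choice question about: {category}\n"
    return {
        "model": MODEL_NAME,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": 0.7,
    }


def generate_question(category: str) -> str:
    """Send the request to the vLLM server and return the raw completion text."""
    import httpx  # imported lazily so build_request works without httpx installed

    response = httpx.post(VLLM_URL, json=build_request(category), timeout=120.0)
    response.raise_for_status()
    return response.json()["choices"][0]["text"]
```

Calling `generate_question("Mercurial")` returns the raw completion text, which still needs to be parsed into the JSON structure the model is trained to emit.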
### Configuration
- Modify the `CATEGORIES_TO_PROCESS` list in the script to add or change predefined categories.
- Adjust the `max_depth` parameter to control the depth of subcategory exploration.
- The script uses multi-threading for efficient processing. Adjust `num_threads` in `process_categories()` if needed.
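The configuration options above can be pictured with the sketch below. It is an illustration only, not the code in `generate.py`: the names `CATEGORIES_TO_PROCESS`, `max_depth`, `num_threads`, and `process_categories()` come from this README, but the function signature, the category values, and the `fake_generate` placeholder are assumptions.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List

# Hypothetical stand-in for the list defined in generate.py.
CATEGORIES_TO_PROCESS = ["Mercurial", "Organic Chemistry", "Quantum Mechanics"]


def process_categories(
    categories: List[str],
    generate: Callable[[str, int], Dict],
    max_depth: int = 4,
    num_threads: int = 4,
) -> List[Dict]:
    """Run generate(category, max_depth) for each category on a thread pool.

    Results are returned in the same order as `categories`.
    """
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        return list(pool.map(lambda c: generate(c, max_depth), categories))


def fake_generate(category: str, max_depth: int) -> Dict:
    """Placeholder generator; a real one would call the vLLM server."""
    return {"category": category, "depth": 0, "max_depth": max_depth}


results = process_categories(
    CATEGORIES_TO_PROCESS, fake_generate, max_depth=2, num_threads=2
)
```

Threads suit this workload because each task mostly waits on HTTP requests to the server, so raising `num_threads` helps only until the server itself becomes the bottleneck.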
## Sample Output
Here's an example of the generated output:
```json
{
  "question": "A developer is working on a large project that uses Mercurial version control. They need to merge a branch containing bug fixes from another team. What is the recommended approach to avoid merging conflicts?",
  "answer": "Create a new branch from the source directory.",
  "incorrect_answer_1": "Merge the branches directly.",
  "incorrect_answer_2": "Skip the merge process altogether.",
  "incorrect_answer_3": "Use a third-party tool like Git.",
  "explanation": "Merging branches in Mercurial requires careful consideration to avoid conflicts. Here's a breakdown of the reasoning: \n1. Branch Creation: Creating a new branch allows the developer to isolate the changes from the other team without affecting the base branch.\n2. Conflict Detection: Comparing the histories of the branches helps identify potential conflicts that may arise during the merge.\n3. Conflict Resolution: Manual conflict resolution is essential to ensure the merge is successful. Mercurial provides tools like \"diff\" and \"merge\" commands for this purpose.\n4. Committing Changes: Once the merge is complete, the developer should commit the changes to their new branch.",
  "subcategories": ["Version Control", "Mercurial", "Merge Conflicts"],
  "category": "Mercurial",
  "depth": 0
}
```
## Limitations
- The model generates questions based on its training data, which may not always reflect the most current information in rapidly evolving fields.
- Although the questions are designed to be "Google-proof," their effectiveness may vary depending on the topic and on how the relevant information is presented online.
- The quality and accuracy of generated questions should be reviewed by subject matter experts before use in formal assessments.
## Ethical Considerations
- Users should be aware of potential biases in the generated content and review questions for fairness and inclusivity.
- The model should not be used to generate misleading or factually incorrect information.
- Respect copyright and intellectual property rights when using generated content.
## Citation
If you use this model in your research or applications, please cite it as follows:
```
[Citation information to be added]
```
## Contact
For questions, feedback, or support, please contact [Your Contact Information].