GPQA Generator: Fine-tuned Gemma 2B for the GPQA (Graduate-Level Google-Proof Q&A) benchmark dataset

Model Details

  • Model Type: Language Model
  • Base Model: unsloth/gemma-2-2b-bnb-4bit
  • Fine-tuned by: [Your Organization/Name]
  • License: [Specify the license]

This model is a fine-tuned version of the Gemma 2B base model, tailored to the GPQA (Graduate-Level Google-Proof Q&A) benchmark dataset. It produces graduate-level, context-rich multiple-choice questions, each with one correct answer, three incorrect answers, and an explanation.

Intended Use

This model is designed for educational content creators, assessment developers, and researchers who need to generate complex, Google-proof multiple-choice questions across various academic disciplines.

Primary Use Cases:

  • Generating challenging assessment questions for advanced students
  • Creating content for educational platforms and applications
  • Assisting in the development of standardized tests
  • Supporting research in question generation and educational assessment

How to Use

Setting Up

  1. Clone the repository containing the model and scripts.
  2. Ensure the required dependencies are installed (e.g., pip install httpx transformers).

Running the Model

  1. Start the vLLM server:

    ./run_vllm_2b.sh
    
  2. Generate questions using the generate.py script (a sketch of querying the server directly follows these steps):

    For a single category:

    python generate.py --category "Your Category" --depth 4
    

    To use predefined categories:

    python generate.py --use-array --depth 4
    
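
The generate.py script presumably drives this server over HTTP (httpx is among the dependencies). To sanity-check the server directly, something like the following should work, assuming run_vllm_2b.sh starts vLLM's OpenAI-compatible server on the default port 8000; the model name and prompt here are illustrative placeholders:

    import httpx

    # Ask the served model for a draft question; the prompt format is illustrative.
    response = httpx.post(
        "http://localhost:8000/v1/completions",
        json={
            "model": "gpqa-generator-2b",  # placeholder; use the name the server was launched with
            "prompt": "Generate a graduate-level multiple-choice question about Mercurial merge conflicts.",
            "max_tokens": 512,
            "temperature": 0.7,
        },
        timeout=60.0,
    )
    response.raise_for_status()
    print(response.json()["choices"][0]["text"])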

Configuration

  • Modify the CATEGORIES_TO_PROCESS list in the script to add or change predefined categories.
  • Adjust the max_depth parameter to control how deep subcategory exploration goes.
  • The script uses multi-threading for efficient processing; adjust num_threads in process_categories() if needed. A sketch of how these pieces fit together follows below.
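
The script's internals are not reproduced here, but as a rough sketch of how the knobs above might fit together (CATEGORIES_TO_PROCESS, process_categories, max_depth, and num_threads are the names used in the script; generate_for_category is a hypothetical placeholder):

    from concurrent.futures import ThreadPoolExecutor

    # Predefined categories picked up by --use-array (edit to taste).
    CATEGORIES_TO_PROCESS = ["Mercurial", "Linear Algebra", "Organic Chemistry"]

    def generate_for_category(category, max_depth):
        # Placeholder: the real script would prompt the model and recurse
        # into subcategories up to max_depth.
        print(f"Generating questions for {category} (max_depth={max_depth})")

    def process_categories(categories, max_depth=4, num_threads=4):
        # Fan categories out across a bounded pool of worker threads.
        with ThreadPoolExecutor(max_workers=num_threads) as executor:
            for category in categories:
                executor.submit(generate_for_category, category, max_depth)

    process_categories(CATEGORIES_TO_PROCESS)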

Sample Output

Here's an example of the generated output:

{
  "question": "A developer is working on a large project that uses Mercurial version control. They need to merge a branch containing bug fixes from another team. What is the recommended approach to avoid merging conflicts?",
  "answer": "Create a new branch from the source directory.",
  "incorrect_answer_1": "Merge the branches directly.",
  "incorrect_answer_2": "Skip the merge process altogether.",
  "incorrect_answer_3": "Use a third-party tool like Git.",
  "explanation": "Merging branches in Mercurial requires careful consideration to avoid conflicts. Here's a breakdown of the reasoning: \n1. Branch Creation: Creating a new branch allows the developer to isolate the changes from the other team without affecting the base branch.\n2. Conflict Detection: Comparing the histories of the branches helps identify potential conflicts that may arise during the merge.\n3. Conflict Resolution: Manual conflict resolution is essential to ensure the merge is successful. Mercurial provides tools like \"diff\" and \"merge\" commands for this purpose.\n4. Committing Changes: Once the merge is complete, the developer should commit the changes to their new branch.",
  "subcategories": ["Version Control", "Mercurial", "Merge Conflicts"],
  "category": "Mercurial",
  "depth": 0
}
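
Because generated questions should be reviewed before use (see Limitations below), a quick structural check can catch malformed records early. A minimal sketch, assuming the script writes one JSON object per line (the file name is hypothetical):

    import json

    # Fields every record is expected to carry, per the example above.
    REQUIRED_FIELDS = {
        "question", "answer",
        "incorrect_answer_1", "incorrect_answer_2", "incorrect_answer_3",
        "explanation", "subcategories", "category", "depth",
    }

    with open("questions.jsonl") as f:  # hypothetical output file
        for line_no, line in enumerate(f, start=1):
            record = json.loads(line)
            missing = REQUIRED_FIELDS - record.keys()
            if missing:
                print(f"line {line_no}: missing fields {sorted(missing)}")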

Limitations

  • The model generates questions based on its training data, which may not always reflect the most current information in rapidly evolving fields.
  • While the questions are designed to be "Google-proof," their effectiveness may vary depending on the specific topic and how information about it is presented online.
  • The quality and accuracy of generated questions should be reviewed by subject matter experts before use in formal assessments.

Ethical Considerations

  • Users should be aware of potential biases in the generated content and review questions for fairness and inclusivity.
  • The model should not be used to generate misleading or factually incorrect information.
  • Respect copyright and intellectual property rights when using generated content.

Citation

If you use this model in your research or applications, please cite it as follows:

[Citation information to be added]

Contact

For questions, feedback, or support, please contact [Your Contact Information].
