Model Card for ECE-PRYMMAL-0.5B-FT-EnhancedMUSREnsembleV3

Model Details

Model Description

This model is a highly optimized version of Qwen-0.5B, designed specifically to excel at multi-source reasoning (MUSR). It is the third release of our enhanced ensemble architecture and achieves exceptional performance on the MUSR benchmark.

  • Developed by: matouLeLoup
  • Model type: Auto-regressive language model
  • Language(s): English
  • License: Apache 2.0
  • Finetuned from model: Qwen/Qwen2-0.5B

Training and Evaluation

Training Data

  • Base model: Qwen-0.5B
  • Fine-tuning dataset: allenai/qasc
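
For reference, a minimal sketch (not taken from the original training code) of loading the fine-tuning dataset with the Hugging Face `datasets` library; field names follow the allenai/qasc dataset card:

```python
# Illustrative only: load the QASC data used for fine-tuning.
from datasets import load_dataset

qasc = load_dataset("allenai/qasc")       # train / validation / test splits
example = qasc["train"][0]
# Each example provides two supporting facts, a question, answer choices, and the gold letter.
print(example["fact1"], example["fact2"], example["question"], example["answerKey"])
```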

Evaluation Results

Tested on 500 samples from the QASC validation set:

  • Accuracy: 100%
  • Confidence: 1.1167 (±0.0171)
  • Source Usage: 99.72%
  • Response Length: 170.5 words (±22.8)
  • Reasoning Steps: 1.36 average

Confidence Distribution:

  • >1.1: 95.8%
  • 1.0-1.1: 4.2%
  • <1.0: 0%

Uses

Direct Use

This model is optimized for:

  • Multi-source question answering
  • Logical reasoning
  • Document analysis and synthesis
  • Decision-support systems
  • Educational applications

How to Get Started

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

device = "cuda" if torch.cuda.is_available() else "cpu"

model = AutoModelForCausalLM.from_pretrained("matouLeLoup/ECE-PRYMMAL-0.5B-FT-EnhancedMUSREnsembleV3").to(device)
tokenizer = AutoTokenizer.from_pretrained("matouLeLoup/ECE-PRYMMAL-0.5B-FT-EnhancedMUSREnsembleV3")

# Optimal prompt format; fact1, fact2, question, and choices hold your own inputs
prompt = f"""Context:
Fact 1: {fact1}
Fact 2: {fact2}

Question: {question}

Choices:
{choices}

Instructions:
1. Analyze both facts carefully
2. Connect the information
3. Choose the letter (A-H) that best answers the question
4. Explain your reasoning

Reasoned Answer:"""

# Generation
inputs = tokenizer(prompt, return_tensors="pt").to(device)
outputs = model.generate(
    **inputs,
    max_new_tokens=150,
    num_beams=5,
    temperature=0.6,          # note: ignored unless do_sample=True
    no_repeat_ngram_size=3
)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
```
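
Since the model is instructed to answer with a letter from A to H, a simple post-processing step can extract the predicted choice from the generated text (a sketch, not part of the original card):

```python
import re

# Take only the text generated after the prompt (assumes the decoded output
# starts with the prompt text) and grab the first standalone A-H letter.
completion = response[len(prompt):]
match = re.search(r"\b([A-H])\b", completion)
predicted_letter = match.group(1) if match else None
print(predicted_letter)
```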

# Training Details

Training Hyperparameters

  • Learning rate: 2e-5
  • Batch size: 32
  • Weight decay: 0.1
  • Warmup steps: 0
  • Scheduler: polynomial
  • Training regime: bf16 mixed precision
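
A minimal sketch of how these hyperparameters map onto Hugging Face `TrainingArguments`; the output directory and epoch count are illustrative assumptions, not values reported for the original run:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="qwen2-0.5b-qasc",        # assumed path, not from the original run
    learning_rate=2e-5,
    per_device_train_batch_size=32,
    weight_decay=0.1,
    warmup_steps=0,
    lr_scheduler_type="polynomial",
    bf16=True,                           # bf16 mixed precision
    num_train_epochs=3,                  # assumed; the card does not report the epoch count
)
```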

# Evaluation Procedure

  • Tested on 500 random samples from the QASC validation set (see the reproduction sketch below)
  • Evaluated for accuracy, confidence, and source usage
  • Detailed analysis of reasoning steps and response quality
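
A hedged sketch of how such an evaluation could be reproduced for the accuracy metric (confidence and source-usage scoring are not shown); it reuses the `model` and `tokenizer` loaded in the getting-started example:

```python
import re
from datasets import load_dataset

# Sample 500 validation questions, prompt the model, and score the extracted letter.
val = load_dataset("allenai/qasc", split="validation")
samples = val.shuffle(seed=42).select(range(500))

correct = 0
for ex in samples:
    choices = "\n".join(f"{l}. {t}" for l, t in zip(ex["choices"]["label"], ex["choices"]["text"]))
    prompt = (
        f"Context:\nFact 1: {ex['fact1']}\nFact 2: {ex['fact2']}\n\n"
        f"Question: {ex['question']}\n\nChoices:\n{choices}\n\n"
        "Instructions:\n1. Analyze both facts carefully\n2. Connect the information\n"
        "3. Choose the letter (A-H) that best answers the question\n4. Explain your reasoning\n\n"
        "Reasoned Answer:"
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=150, num_beams=5, no_repeat_ngram_size=3)
    completion = tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
    match = re.search(r"\b([A-H])\b", completion)
    correct += int(bool(match) and match.group(1) == ex["answerKey"])

print(f"accuracy: {correct / len(samples):.1%}")
```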

# Limitations and Bias

  • Optimized specifically for the MUSR format
  • Requires precisely structured prompts
  • Designed for multiple-choice questions with reasoning

# Technical Specifications

  • Base model: Qwen-0.5B
  • Model size: 494M parameters (BF16 safetensors)
  • Enhanced with optimized generation parameters
  • Uses a letter-based answer format (A-H)

# Generation Config

```python
generation_config = {
    "max_new_tokens": 150,
    "num_beams": 5,
    "temperature": 0.6,          # ignored while do_sample is False
    "do_sample": False,
    "length_penalty": 1.0,
    "no_repeat_ngram_size": 3
}
```
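
The dictionary can be unpacked directly into `generate`; because `do_sample` is `False`, decoding uses deterministic beam search and the temperature setting has no effect:

```python
outputs = model.generate(**inputs, **generation_config)
```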

# Citation

```bibtex
@misc{PRYMMAL-EnhancedMUSREnsembleV3,
  author = {matouLeLoup},
  title = {ECE-PRYMMAL-0.5B-FT-EnhancedMUSREnsembleV3},
  year = {2024},
  publisher = {Hugging Face},
  journal = {Hugging Face Hub},
  howpublished = {\url{https://huggingface.co/matouLeLoup/ECE-PRYMMAL-0.5B-FT-EnhancedMUSREnsembleV3}}
}
```