Model Card for ECE-PRYMMAL-0.5B-FT-EnhancedMUSREnsembleV3
Model Details
Model Description
This model is a heavily optimized version of Qwen-0.5B, designed specifically to excel at multi-source reasoning (MUSR). It is the third release of our enhanced ensemble architecture and achieves exceptional performance on the MUSR benchmark.
- Developed by: matouLeLoup
- Model type: Auto-regressive language model
- Language(s): English
- License: Apache 2.0
- Finetuned from model: Qwen/Qwen2-0.5B
Training and Evaluation
Training Data
- Base model: Qwen-0.5B
- Fine-tuning dataset: allenai/qasc
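
The card does not describe how allenai/qasc was preprocessed for fine-tuning. The snippet below is only a rough sketch, assuming the prompt template from the "How to Get Started" section and using the dataset's published fields (`fact1`, `fact2`, `question`, `choices`, `answerKey`); the actual training format may differ.

```python
from datasets import load_dataset

# Hypothetical preprocessing sketch -- the released fine-tuning pipeline is not documented.
qasc = load_dataset("allenai/qasc", split="train")

def build_example(ex):
    # Format the eight answer options as "(A) ... (B) ..." using the QASC labels
    choices = " ".join(
        f"({label}) {text}"
        for label, text in zip(ex["choices"]["label"], ex["choices"]["text"])
    )
    prompt = (
        f"Context:\nFact 1: {ex['fact1']}\nFact 2: {ex['fact2']}\n"
        f"Question: {ex['question']}\nChoices:\n{choices}\n"
        "Instructions:\n1. Analyze both facts carefully\n2. Connect the information\n"
        "3. Choose the letter (A-H) that best answers the question\n4. Explain your reasoning\n"
        "Reasoned Answer:"
    )
    return {"prompt": prompt, "target": ex["answerKey"]}

train_examples = qasc.map(build_example)
```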
Evaluation Results
Tested on 500 samples from the QASC validation set:
- Accuracy: 100%
- Confidence: 1.1167 (±0.0171)
- Source Usage: 99.72%
- Response Length: 170.5 words (±22.8)
- Reasoning Steps: 1.36 average
Confidence Distribution:
- >1.1: 95.8%
- 1.0-1.1: 4.2%
- <1.0: 0%
Uses
Direct Use
This model is optimized for:
- Multi-source question answering
- Logical reasoning
- Document analysis and synthesis
- Decision-support systems
- Educational applications
How to Get Started
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"

model = AutoModelForCausalLM.from_pretrained(
    "matouLeLoup/ECE-PRYMMAL-0.5B-FT-EnhancedMUSREnsembleV3"
).to(device)
tokenizer = AutoTokenizer.from_pretrained("matouLeLoup/ECE-PRYMMAL-0.5B-FT-EnhancedMUSREnsembleV3")

# Example inputs -- replace with your own facts, question, and A-H answer choices
fact1 = "Climate is generally described in terms of temperature and moisture."
fact2 = "Fire behavior is driven by local weather conditions such as winds, temperature and moisture."
question = "What is generally described in terms of temperature and moisture?"
choices = "(A) winter (B) climate (C) fire (D) seasons (E) storms (F) weather (G) deserts (H) population"

# Optimal prompt format
prompt = f"""Context:
Fact 1: {fact1}
Fact 2: {fact2}
Question: {question}
Choices:
{choices}
Instructions:
1. Analyze both facts carefully
2. Connect the information
3. Choose the letter (A-H) that best answers the question
4. Explain your reasoning
Reasoned Answer:"""

# Generation (temperature only affects sampling, which is disabled by default)
inputs = tokenizer(prompt, return_tensors="pt").to(device)
outputs = model.generate(
    **inputs,
    max_new_tokens=150,
    num_beams=5,
    temperature=0.6,
    no_repeat_ngram_size=3
)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
```
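
The decoded output still contains the prompt. A small post-processing sketch (not part of the released code) to isolate the model's continuation and pull out the chosen letter:

```python
import re

# Strip the prompt from the decoded output; keep only the model's continuation
continuation = response[len(prompt):].strip()

# Look for the first standalone answer letter A-H; None if no letter is found
match = re.search(r"\b([A-H])\b", continuation)
predicted_letter = match.group(1) if match else None
print(predicted_letter, continuation)
```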
# Training details
Training Procedure
Training Hyperparameters
- Learning rate: 2e-5
- Batch size: 32
- Weight decay: 0.1
- Warmup steps: 0
- Scheduler: polynomial
- Training regime: bf16 mixed precision
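
The training script itself is not published. As a rough illustration only, the listed hyperparameters map onto Hugging Face `TrainingArguments` roughly as follows; the output path and epoch count are assumptions, not values from the card.

```python
from transformers import TrainingArguments

# Illustrative mapping of the listed hyperparameters; not the author's actual training script.
training_args = TrainingArguments(
    output_dir="prymmal-0.5b-ft-musr",  # hypothetical output path
    learning_rate=2e-5,
    per_device_train_batch_size=32,
    weight_decay=0.1,
    warmup_steps=0,
    lr_scheduler_type="polynomial",
    bf16=True,                          # bf16 mixed precision
    num_train_epochs=3,                 # assumption: not stated in the card
)
```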
# Evaluation Procedure
- Tested on 500 random samples from the QASC validation set
- Evaluated for accuracy, confidence, and source usage
- Detailed analysis of reasoning steps and response quality
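
The evaluation harness is not included in the card. Below is a minimal accuracy-only sketch that reuses `model`, `tokenizer`, and `device` from the "How to Get Started" example, under the assumption that answers are scored by comparing the extracted letter to QASC's `answerKey`; the confidence and source-usage metrics are not reproduced here.

```python
import random, re
from datasets import load_dataset

# Sample 500 random validation examples, as described above
val = load_dataset("allenai/qasc", split="validation")
indices = random.sample(range(len(val)), 500)

correct = 0
for i in indices:
    ex = val[i]
    choices = " ".join(f"({l}) {t}" for l, t in zip(ex["choices"]["label"], ex["choices"]["text"]))
    prompt = (f"Context:\nFact 1: {ex['fact1']}\nFact 2: {ex['fact2']}\n"
              f"Question: {ex['question']}\nChoices:\n{choices}\n"
              "Instructions:\n1. Analyze both facts carefully\n2. Connect the information\n"
              "3. Choose the letter (A-H) that best answers the question\n4. Explain your reasoning\n"
              "Reasoned Answer:")
    inputs = tokenizer(prompt, return_tensors="pt").to(device)
    outputs = model.generate(**inputs, max_new_tokens=150, num_beams=5, no_repeat_ngram_size=3)
    text = tokenizer.decode(outputs[0], skip_special_tokens=True)[len(prompt):]
    match = re.search(r"\b([A-H])\b", text)
    correct += int(match is not None and match.group(1) == ex["answerKey"])

print(f"Accuracy on 500 sampled validation questions: {correct / 500:.2%}")
```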
# Limitations and Bias
- Optimized specifically for the MUSR format
- Requires precisely structured prompts
- Designed for multiple-choice questions with reasoning
# Technical Specifications
- Base model: Qwen-0.5B
- Enhanced with optimized generation parameters
- Uses letter-based answer format (A-H)
```python
# Generation config
generation_config = {
    "max_new_tokens": 150,
    "num_beams": 5,
    "temperature": 0.6,
    "do_sample": False,
    "length_penalty": 1.0,
    "no_repeat_ngram_size": 3
}
```
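
If preferred, the same settings can be wrapped in a `transformers.GenerationConfig` and passed to `generate`; this short usage sketch reuses the `model`, `tokenizer`, and `inputs` from the "How to Get Started" example.

```python
from transformers import GenerationConfig

# Build a reusable generation config from the dictionary above
gen_config = GenerationConfig(**generation_config)
outputs = model.generate(**inputs, generation_config=gen_config)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```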
# Citation

```bibtex
@misc{PRYMMAL-EnhancedMUSREnsembleV3,
  author       = {matouLeLoup},
  title        = {ECE-PRYMMAL-0.5B-FT-EnhancedMUSREnsembleV3},
  year         = {2024},
  publisher    = {Hugging Face},
  journal      = {Hugging Face Hub},
  howpublished = {\url{https://huggingface.co/matouLeLoup/ECE-PRYMMAL-0.5B-FT-EnhancedMUSREnsembleV3}}
}
```