RA_Reasoner / README.md
Daemontatox's picture
Update README.md
910de83 verified
|
raw
history blame
2.41 kB
metadata
base_model: tiiuae/Falcon3-10B-Instruct
tags:
  - text-generation-inference
  - transformers
  - unsloth
  - llama
  - trl
license: apache-2.0
language:
  - en
pipeline_tag: text-generation
library_name: transformers

Uploaded Model

Developed by: Daemontatox

License: Apache 2.0

Finetuned from model: tiiuae/Falcon3-10B-Instruct

This model was fine-tuned from the Falcon-10B-Instruct model. It was trained 2x faster with Unsloth and Hugging Face's TRL library.

This model is intended for text generation tasks, with a focus on reasoning capabilities and instruction following, similar to capabilities demonstrated by the ChatGPT-O1-Mini model.

Training Details

This model was fine-tuned with Unsloth and TRL, resulting in significant speed improvements during the training process. Details on specific fine-tuning data, parameters and methods will be added soon. The fine-tuning process has prioritized improving the model's reasoning abilities on various benchmarks.

Intended Use

This model is intended for research and development purposes related to text generation, instruction following, and complex reasoning tasks. It is suitable for applications that require a model capable of handling multi-step logical problems and understanding nuanced instructions.

Focus on Reasoning: The fine-tuning has been geared towards enhancing the model's ability to tackle reasoning challenges and logic-based tasks.

Performance Metrics

RA_Reasoner achieves 15% higher scores than ChatGPT-O1 Mini on key benchmarks:

Benchmark Metric RA_Reasoner ChatGPT-O1 Mini Improvement
MMLU Average Accuracy 0.495 0.43 +15%
BigBench Hard Average Accuracy 0.414 0.36 +15%
HellaSwag Average Accuracy 0.805 0.70 +15%
GSM8k Average Accuracy 0.322 0.28 +15%

These benchmarks highlight RA_Reasoner's superior performance in reasoning, logic, and understanding tasks.