Llama3.1-deep-o1

This is a merge of several DeepSeek R1 distilled and O1-style long chain-of-thought (CoT) large language models (LLMs). It is designed for generating long, coherent solutions and excels at problem-solving tasks among models with 8 billion parameters.

Model Overview

Key Features:

Generates detailed, manual-like explanations for complex questions.
Suitable for creating solution outlines, analyzing problems, and writing essays.

Limitations:

Does not follow standard CoT formats like <thought> tags.
Prone to calculation errors and careless mistakes in reasoning.
Struggles with multiturn conversations and user alignment.
For example, it may ask questions mid-response but continue answering regardless, in line with its CoT origins.

Merge Details

The model was created using the following YAML configuration:

models:
  - model: deepseek-ai/DeepSeek-R1-Distill-Llama-8B
    parameters:
      weight: 1.5
  - model: Skywork/Skywork-o1-Open-Llama-3.1-8B
    parameters:
      weight: 1.5
  - model: NousResearch/DeepHermes-3-Llama-3-8B-Preview
  - model: O1-OPEN/OpenO1-LLama-8B-v0.1
  - model: SimpleBerry/LLaMA-O1-Supervised-1129
  - model: terrycraddock/Reflection-Llama-3.1-8B
merge_method: linear
parameters:
  weight: 1
dtype: bfloat16

Usage Recommendations

To generate hints for scientific problem solving
As a foundation model for finetuning and merging

Examples to Try:

Write the equations for glycolysis and pyruvate oxidation.
Calculate net ATP formation from glucose metabolism (excluding electron transport chain).
Integrate x^2 e^x dx.
Prove that the complete bipartite graph K_{3,3} isn't planar.
Derive a formula for the critical angle between two media with refractive indices n_1 and n_2.
Compare steam vs. diesel engines including their capabilities and historical significance.

Notes on Performance:

While the model provides coherent and expert-like responses, users should verify its outputs for accuracy - especially in calculations or logical reasoning tasks.

Warning

This model is experimental and may require careful validation when used for critical applications.
It is not optimized for conversational tasks but performs well in single-turn question answering.

agentlans
/

Llama3.1-deep-o1