---
library_name: transformers
license: apache-2.0
base_model:
- MaziyarPanahi/calme-3.2-instruct-78b
- Sakalti/ultiima-72B
---

# **ECE-TRIOMPHANT-2.1-YL-72B-SLERP-V1**

This model has been produced by:
- **ROBERGE Marial**, engineering student at French Engineering School ECE
- **ESCRIVA Mathis**, engineering student at French Engineering School ECE
- **LALAIN Youri**, engineering student at French Engineering School ECE
- **RAGE Lilian**, engineering student at French Engineering School ECE
- **HUVELLE Baptiste**, engineering student at French Engineering School ECE

Under the supervision of:
- **Andre-Louis Rochet**, Lecturer at ECE & Co-Founder of TW3 Partners
- **Paul Lemaistre**, CTO of TW3 Partners

With the contribution of:
- **ECE engineering school** as sponsor and financial contributor
- **François STEPHAN** as director of ECE
- **Gérard REUS** as acting director of the iLab
- **Matthieu JOLLARD**, ECE Alumnus
- **Louis GARCIA**, ECE Alumnus

### Supervisory structure
The iLab (Intelligence Lab) is a structure created by ECE and dedicated to artificial intelligence.

### About ECE
ECE, a multi-program, multi-campus, and multi-sector engineering school specializing in digital engineering, trains engineers and technology experts for the 21st century who are capable of meeting the challenges of the dual digital and sustainable-development revolutions.


**ECE-TRIOMPHANT-2.1-YL-72B-SLERP-V1** is a merged language model built from **Sakalti/ultiima-72B** and **MaziyarPanahi/calme-3.2-instruct-78b**. Using the **SLERP (Spherical Linear Interpolation)** method, it combines the strengths of both architectures to deliver strong performance on complex natural language processing (NLP) tasks.
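
For illustration, here is a minimal NumPy sketch of spherical linear interpolation between two weight tensors (an illustrative re-implementation, not mergekit's actual code; the `slerp` helper, the flattening step, and the LERP fallback are assumptions):

```python
import numpy as np

def slerp(t: float, w0: np.ndarray, w1: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Interpolate a fraction t of the way from w0 to w1 along the sphere."""
    v0 = w0.ravel() / (np.linalg.norm(w0) + eps)   # unit direction of w0
    v1 = w1.ravel() / (np.linalg.norm(w1) + eps)   # unit direction of w1
    dot = np.clip(np.dot(v0, v1), -1.0, 1.0)
    omega = np.arccos(dot)                         # angle between the two tensors
    if omega < eps:                                # nearly parallel: plain LERP is fine
        return (1.0 - t) * w0 + t * w1
    s0 = np.sin((1.0 - t) * omega) / np.sin(omega)
    s1 = np.sin(t * omega) / np.sin(omega)
    return (s0 * w0.ravel() + s1 * w1.ravel()).reshape(w0.shape)
```

Unlike plain linear interpolation, SLERP follows the arc between the two weight directions rather than the chord, which tends to preserve the scale of the merged parameters when the source tensors diverge.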

## **Features**
- **Merge method:** SLERP (Spherical Linear Interpolation).
- **Source models:**
  - [Sakalti/ultiima-72B](https://huggingface.co/Sakalti/ultiima-72B)
  - [MaziyarPanahi/calme-3.2-instruct-78b](https://huggingface.co/MaziyarPanahi/calme-3.2-instruct-78b)
- **Strengths:**
  - Improved performance on multi-domain and reasoning tasks.
  - Extended processing capacity thanks to the merging of critical layers.
  - **bfloat16** precision for fast, efficient computation (see the loading sketch after this list).
- **Target applications:**
  - Mathematical reasoning.
  - Contextual understanding.
  - Instruction-following tasks.
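
A minimal loading sketch with the `transformers` library. The repository id below is a placeholder (the model's exact Hugging Face path is an assumption; adjust it to the published repo):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder repo id; replace with this model's actual Hugging Face path.
model_id = "ECE-TRIOMPHANT-2.1-YL-72B-SLERP-V1"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # the merge is stored in bfloat16
    device_map="auto",           # shard the 72B weights across available GPUs
)

prompt = "Explain spherical linear interpolation in one sentence."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```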

## **Configuration**
```yaml
slices:
  - sources:
      - model: MaziyarPanahi/calme-3-selfmerge-qwen2-78b
        layer_range: [0, 80]  # Limited to 80 layers
      - model: Qwen/Qwen2.5-72B
        layer_range: [0, 80]  # Matches the 78B model
merge_method: slerp
base_model: MaziyarPanahi/calme-3-selfmerge-qwen2-78b
parameters:
  t:
    - filter: self_attn
      value: [0, 0.25, 0.5, 0.75, 1]
    - filter: mlp
      value: [1, 0.75, 0.5, 0.25, 0]
    - value: 0.5
dtype: bfloat16
```
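
In this mergekit-style configuration, `t` controls how far each parameter is interpolated toward the second model (t = 0 keeps the base model, t = 1 takes the other), and a list of values is spread across the layer range as a gradient: self-attention weights shift progressively toward the second model with depth, MLP weights do the opposite, and all remaining parameters use t = 0.5. The sketch below shows one plausible reading of that mapping (a linear spread via `np.interp`; the exact interpolation mergekit applies is an assumption):

```python
import numpy as np

# Gradient anchors for the self_attn filter in the config above.
anchors = [0.0, 0.25, 0.5, 0.75, 1.0]
num_layers = 80

# Spread the anchor values evenly across the 80 merged layers.
layer_pos = np.linspace(0.0, 1.0, num_layers)
anchor_pos = np.linspace(0.0, 1.0, len(anchors))
per_layer_t = np.interp(layer_pos, anchor_pos, anchors)

print(per_layer_t[:3])   # early layers: t near 0, mostly the base model
print(per_layer_t[-3:])  # late layers: t near 1, mostly the second model
```

Assuming mergekit is installed, a configuration like this is typically applied with its `mergekit-yaml` command, pointing it at the YAML file and an output directory.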