So I was really happy with V3.3, but I got some more advice on the tokenizer, as the model tended to run a little hot. This is an experiment: I thought I might be able to get better creativity at higher temps. Early testing is not very promising in terms of fixing the hot running, but as an expanded vocabulary it seems a success!

merge

This is a merge of pre-trained language models created using mergekit.

Merge Details

Merge Method

This model was merged with the Linear DELLA merge method, using meta-llama/Llama-3.3-70B-Instruct as the base.
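For readers unfamiliar with the method: della_linear prunes each model's task vector (its delta from the base) by randomly dropping parameters, giving higher-magnitude deltas lower drop probabilities (epsilon controls the spread of those probabilities around 1 - density), rescales the survivors, and combines them as a weighted linear sum scaled by lambda. The Python sketch below is a simplified illustration of that idea, not mergekit's actual implementation; details such as how magnitude ranks map to drop probabilities differ in the real code.

import torch

def della_linear(base, deltas, weights, density=0.7, epsilon=0.2, lam=1.1):
    # Merge task vectors (finetuned - base) with magnitude-aware dropout.
    p = 1.0 - density                              # mean drop probability
    merged = torch.zeros_like(base)
    for delta, w in zip(deltas, weights):
        # Rank parameters by magnitude; larger deltas get lower drop
        # probabilities, spread across [p - epsilon/2, p + epsilon/2].
        ranks = delta.abs().flatten().argsort().argsort().float()
        ranks = ranks / max(ranks.numel() - 1, 1)  # 0 = smallest .. 1 = largest
        drop_p = (p + epsilon / 2) - epsilon * ranks
        keep = (torch.rand_like(drop_p) > drop_p).float()
        # Rescale survivors so the expected delta is preserved, then weight.
        merged += w * (delta.flatten() * keep / (1.0 - drop_p)).view_as(delta)
    return base + lam * merged                     # lambda scales the merged delta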

Models Merged

The following models were included in the merge:

- Sao10K/L3.1-70B-Hanami-x1
- Sao10K/70B-L3.3-Cirrus-x1
- SicariusSicariiStuff/Negative_LLAMA_70B
- TheDrummer/Anubis-70B-v1
- EVA-UNIT-01/EVA-LLaMA-3.33-70B-v0.1

Configuration

The following YAML configuration was used to produce this model:

models:
  - model: Sao10K/L3.1-70B-Hanami-x1
    parameters:
      weight: 0.20
      density: 0.7
  - model: Sao10K/70B-L3.3-Cirrus-x1
    parameters:
      weight: 0.20
      density: 0.7
  - model: SicariusSicariiStuff/Negative_LLAMA_70B
    parameters:
      weight: 0.20
      density: 0.7
  - model: TheDrummer/Anubis-70B-v1
    parameters:
      weight: 0.20
      density: 0.7
  - model: EVA-UNIT-01/EVA-LLaMA-3.33-70B-v0.1
    parameters:
      weight: 0.20
      density: 0.7
merge_method: della_linear
base_model: meta-llama/Llama-3.3-70B-Instruct
parameters:
  epsilon: 0.2
  lambda: 1.1
dtype: float32
out_dtype: bfloat16
tokenizer:
  source: Sao10K/L3.1-70B-Hanami-x1
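This configuration can be applied with mergekit's mergekit-yaml command (mergekit-yaml config.yml ./output-dir). Below is a minimal inference sketch using transformers and the TareksLab/Progenitor-V3.4-LLaMa-70B repo id; since the card notes the merge still runs a little hot, it starts from a modest temperature.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "TareksLab/Progenitor-V3.4-LLaMa-70B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [{"role": "user", "content": "Write a short scene set on a night train."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Start conservative on temperature; raise it gradually if you want more heat.
out = model.generate(inputs, max_new_tokens=256, do_sample=True,
                     temperature=0.9, top_p=0.95)
print(tokenizer.decode(out[0, inputs.shape[-1]:], skip_special_tokens=True))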