Preview of the dataset this model was trained on: https://huggingface.co/datasets/manishiitg/aditi-syn-v2

The synthetic dataset (https://huggingface.co/datasets/manishiitg/aditi-syn-v2) and the full data creation pipeline (https://github.com/manishiitg/aditi_dataset) are open source, so both the data and the way it was generated can be inspected and reused for further research. The dataset contains Hinglish (mixed Hindi and English) data, along with Hindi-language tasks covering tool use, retrieval-augmented generation (RAG), mathematics, and reasoning.

LMJudge Eval

https://github.com/manishiitg/IndicLMJudge

LLM Judge Language: hi

Model Language Score No. of Questions
mistralai/Mixtral-8x7B-Instruct-v0.1 hi 8.7148 554
Qwen/Qwen1.5-72B-Chat-AWQ hi 8.3695 554
manishiitg/open-aditi-v6-llama3 hi 8.2659 551
Qwen/Qwen1.5-14B-Chat hi 8.2404 554
google/gemma-7b-it hi 7.9152 554
manishiitg/open-aditi-v6-gemma hi 7.8634 549
Qwen/Qwen1.5-7B-Chat hi 7.8587 554
manishiitg/open-aditi-hi-v3 hi 7.7644 554
manishiitg/open-aditi-hi-v4 hi 7.6150 554
manishiitg/open-aditi-hi-v2 hi 7.2518 554
teknium/OpenHermes-2.5-Mistral-7B hi 7.2489 554
ai4bharat/Airavata hi 6.9468 554
01-ai/Yi-34B-Chat hi 6.5801 554
manishiitg/open-aditi-hi-v1 hi 4.7022 554
sarvamai/OpenHathi-7B-Hi-v0.1-Base hi 4.2834 598
Qwen/Qwen1.5-4B-Chat hi 4.1101 554
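
Most models in the Hindi table were scored on 554 questions, but a few rows report a different count, so their averages are not computed over exactly the same question set. The sketch below is one way to flag such rows; it uses a handful of rows copied from the table, and the helper name `flag_question_count_mismatch` is hypothetical, not part of IndicLMJudge.

```python
from collections import Counter

# (model, score, number of questions), copied from the Hindi table above.
HI_ROWS = [
    ("mistralai/Mixtral-8x7B-Instruct-v0.1", 8.7148, 554),
    ("manishiitg/open-aditi-v6-llama3", 8.2659, 551),
    ("google/gemma-7b-it", 7.9152, 554),
    ("manishiitg/open-aditi-v6-gemma", 7.8634, 549),
    ("ai4bharat/Airavata", 6.9468, 554),
    ("sarvamai/OpenHathi-7B-Hi-v0.1-Base", 4.2834, 598),
]


def flag_question_count_mismatch(rows):
    """Return the most common question count and the rows that deviate from it."""
    usual, _ = Counter(n for _, _, n in rows).most_common(1)[0]
    outliers = [(model, n) for model, _, n in rows if n != usual]
    return usual, outliers


usual, outliers = flag_question_count_mismatch(HI_ROWS)
print(f"usual question count: {usual}")
for model, n in outliers:
    print(f"{model}: scored on {n} questions")
```

Small differences such as 549 or 551 versus 554 probably shift the average only slightly, but they are worth keeping in mind when two scores are close.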

LLM Judge Language: en

Model Language Score No. of Questions
Qwen/Qwen1.5-14B-Chat en 9.1947 356
Qwen/Qwen1.5-72B-Chat-AWQ en 9.1618 356
Qwen/Qwen1.5-7B-Chat en 9.1570 356
01-ai/Yi-34B-Chat en 9.1368 356
mistralai/Mixtral-8x7B-Instruct-v0.1 en 9.1306 356
manishiitg/open-aditi-v6-gemma en 9.1003 356
teknium/OpenHermes-2.5-Mistral-7B en 9.0230 356
manishiitg/open-aditi-v6-llama3 en 9.0197 356
manishiitg/open-aditi-hi-v3 en 8.9615 356
manishiitg/open-aditi-hi-v4 en 8.9188 356
google/gemma-7b-it en 8.8191 356
Qwen/Qwen1.5-4B-Chat en 8.7500 356
google/gemma-2b-it en 8.4671 356
manishiitg/open-aditi-hi-v2 en 8.4584 356
ai4bharat/Airavata en 7.3834 356
manishiitg/open-aditi-hi-v1 en 6.6559 356
sarvamai/OpenHathi-7B-Hi-v0.1-Base en 5.9567 312

DHARMA TINY EVAL

Language Hi

Model ARC-Easy bigbench truthful_qa BoolQ winogrande agieval ARC-Challenge MMLU openbookqa
open-aditi-hi-v2 0.6245 0.4959 0.3866 0.7192 0.5353 0.2945 0.4828 0.3457 0.5279
open-aditi-hi-v3 0.6803 0.4553 0.2788 0.7385 0.5390 0.2178 0.4914 0.3346 0.5688
open-aditi-hi-v4 0.6989 0.4526 0.2714 0.7231 0.5167 0.2331 0.5302 0.3123 0.5316
open-aditi-v6-gemma 0.7212 0.4146 0.3234 0.6923 0.4870 0.2638 0.4957 0.3680 0.4349
open-aditi-v6-llama3 0.5688 0.4119 0.2268 0.6500 0.4498 0.2331 0.4310 0.3420 0.3792
open-aditi-hi-v1 0.4572 0.3767 0.2230 0.6346 0.4647 0.1840 0.3405 0.3271 0.3532
OpenHermes-2.5-Mistral-7B 0.3309 0.4201 0.3197 0.6077 0.4981 0.2331 0.3276 0.3086 0.3086
OpenHathi-7B-Hi-v0.1-Base 0.2862 0.3333 0.5130 0.6077 0.4907 0.2301 0.3017 0.2677 0.1933
Airavata 0.2751 0.1274 0.2268 0.0615 0.3866 0.1104 0.2845 0.1450 0.3383
gemma-7b-it 0.1227 0.0786 0.0743 0.1808 0.1561 0.0491 0.1078 0.0818 0.0855

Language En

Model ARC-Easy bigbench truthful_qa BoolQ winogrande agieval ARC-Challenge MMLU openbookqa
OpenHermes-2.5-Mistral-7B 0.8922 0.5745 0.3197 0.8346 0.6989 0.4908 0.7802 0.5911 0.7621
open-aditi-hi-v2 0.8625 0.5149 0.3532 0.8192 0.6877 0.4571 0.7500 0.5613 0.7732
open-aditi-hi-v4 0.8959 0.5041 0.2862 0.8423 0.6914 0.4571 0.7716 0.5651 0.7138
open-aditi-hi-v3 0.8773 0.4986 0.3048 0.8385 0.6766 0.4663 0.7371 0.5613 0.7249
Qwen1.5-7B-Chat 0.8922 0.5122 0.2007 0.8000 0.6654 0.4294 0.7759 0.5799 0.7621
open-aditi-v6-gemma 0.8699 0.4959 0.2602 0.7385 0.5465 0.4540 0.7371 0.5167 0.6654
open-aditi-v6-llama3 0.8810 0.4634 0.1822 0.7577 0.5353 0.4110 0.7457 0.5688 0.6506
open-aditi-hi-v1 0.8104 0.3902 0.2491 0.6962 0.5539 0.3681 0.6379 0.5056 0.5911
Airavata 0.7026 0.4282 0.3123 0.7192 0.5651 0.3313 0.5172 0.3792 0.5093
OpenHathi-7B-Hi-v0.1-Base 0.4684 0.3062 0.4758 0.6346 0.5167 0.2577 0.3017 0.2788 0.2714
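
The two tables above list nine per-task scores per model, which makes a quick Hindi versus English comparison hard to read at a glance. As a rough aid, the sketch below averages the nine scores for two models that appear in both tables. The unweighted mean is an assumption made here for illustration; it is not a metric reported by the evaluation itself, and the tasks may not deserve equal weight.

```python
from statistics import mean

# Column order: ARC-Easy, bigbench, truthful_qa, BoolQ, winogrande,
# agieval, ARC-Challenge, MMLU, openbookqa (values copied from the tables above).
SCORES = {
    "open-aditi-v6-gemma": {
        "hi": [0.7212, 0.4146, 0.3234, 0.6923, 0.4870, 0.2638, 0.4957, 0.3680, 0.4349],
        "en": [0.8699, 0.4959, 0.2602, 0.7385, 0.5465, 0.4540, 0.7371, 0.5167, 0.6654],
    },
    "OpenHermes-2.5-Mistral-7B": {
        "hi": [0.3309, 0.4201, 0.3197, 0.6077, 0.4981, 0.2331, 0.3276, 0.3086, 0.3086],
        "en": [0.8922, 0.5745, 0.3197, 0.8346, 0.6989, 0.4908, 0.7802, 0.5911, 0.7621],
    },
}

# Unweighted mean over the nine tasks, per model and language.
means = {
    model: {lang: mean(values) for lang, values in by_lang.items()}
    for model, by_lang in SCORES.items()
}

for model, by_lang in means.items():
    gap = by_lang["en"] - by_lang["hi"]
    print(f"{model}: hi={by_lang['hi']:.4f} en={by_lang['en']:.4f} gap={gap:.4f}")
```

On these two rows, the fine-tuned model averages higher in Hindi while OpenHermes averages higher in English, which matches the per-task pattern in the tables.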

Per-task results (metric: score for every task): BoolQ, ARC-Easy, openbookqa, winogrande, ARC-Challenge, truthful_qa, bigbench, MMLU, agieval.

Built with Axolotl

See axolotl config

axolotl version: 0.4.0

base_model: google/gemma-7B
model_type: AutoModelForCausalLM
tokenizer_type: AutoTokenizer
tokenizer_config: philschmid/gemma-tokenizer-chatml
tokenizer_use_fast: true

load_in_8bit: false
load_in_4bit: true
strict: false

datasets:
  - path: manishiitg/aditi-syn-train-small-v3
    type: completion
  

# 25 has only synthetic data, and has judge removed data
hub_model_id: manishiitg/open-aditi-chat-hi-1.25-gemma
hf_use_auth_token: true

wandb_project: open-aditi-chat-hi-1.25-gemma

dataset_prepared_path: manishiitg
push_dataset_to_hub: manishiitg
val_set_size: .1
output_dir: /sky-notebook/manishiitg/open-aditi-chat-hi-1.25-gemma

adapter: qlora
lora_model_dir:
save_safetensors: true

sequence_len: 2048
sample_packing: true
pad_to_sequence_len: true
eval_sample_packing: false

lora_r: 32
lora_alpha: 16
lora_dropout: 0.05
lora_target_linear: true

wandb_entity:
wandb_watch:
wandb_run_id:
wandb_log_model:

gradient_accumulation_steps: 8
micro_batch_size: 4
num_epochs: 1
optimizer: adamw_bnb_8bit
lr_scheduler: cosine
learning_rate: 0.0002

adam_beta2: 0.95
adam_epsilon: 0.00001
max_grad_norm: 1.0

train_on_inputs: false
group_by_length: false
bf16: true
fp16: false
tf32: false


gradient_checkpointing: true
early_stopping_patience:
resume_from_checkpoint:
auto_resume_from_checkpoints: true ## manage check point resume from here
local_rank:
logging_steps: 1
xformers_attention:
flash_attention: true

warmup_steps: 10
evals_per_epoch: 2
eval_table_size:
eval_table_max_new_tokens: 128
save_steps: 20 ## increase based on your dataset
save_strategy: steps
debug:
deepspeed:
weight_decay: 0.0
fsdp:
fsdp_config:
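
The config above sets `lora_r: 32` and `lora_alpha: 16`. In the standard LoRA formulation the low-rank update is scaled by `lora_alpha / lora_r`, so these values imply a scaling factor of 0.5. The sketch below works through that arithmetic and the adapter size for a single linear layer. The layer shape (3072 x 3072) is a hypothetical example chosen for illustration, not a value read from this config.

```python
def lora_scaling(lora_alpha: float, lora_r: int) -> float:
    """Factor applied to the low-rank update B @ A (standard LoRA: alpha / r)."""
    return lora_alpha / lora_r


def lora_params(d_in: int, d_out: int, r: int) -> int:
    """Trainable parameters of one adapter: A is (r x d_in), B is (d_out x r)."""
    return r * (d_in + d_out)


LORA_R, LORA_ALPHA = 32, 16   # from the config above
D_IN = D_OUT = 3072           # hypothetical layer shape, for illustration only

scaling = lora_scaling(LORA_ALPHA, LORA_R)
adapter = lora_params(D_IN, D_OUT, LORA_R)
dense = D_IN * D_OUT

print(f"scaling factor: {scaling}")
print(f"adapter params: {adapter:,} vs dense layer params: {dense:,}")
print(f"trainable fraction for this layer: {adapter / dense:.2%}")
```

Because `lora_target_linear: true` attaches adapters to all linear layers, the same calculation would apply per layer with that layer's actual shape.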

open-aditi-chat-hi-1.25-gemma

This model is a fine-tuned version of google/gemma-7B, trained on the manishiitg/aditi-syn-train-small-v3 dataset listed in the axolotl config. It achieves the following results on the evaluation set:

  • Loss: 2.0992

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0002
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 8
  • gradient_accumulation_steps: 8
  • total_train_batch_size: 256
  • total_eval_batch_size: 32
  • optimizer: Adam with betas=(0.9,0.95) and epsilon=1e-05
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_steps: 10
  • num_epochs: 1
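
The total train batch size of 256 follows from the other values in the list: 4 sequences per device, 8 gradient accumulation steps, and 8 devices. The sketch below reproduces that arithmetic and outlines a cosine schedule with linear warmup of the kind the settings describe. The total step count of 242 is an assumption inferred from the training results (step 121 at epoch 0.5), and `cosine_lr` is a simplified illustration rather than the exact scheduler code used by the trainer.

```python
import math


def effective_batch_size(micro_batch_size: int, grad_accum_steps: int, num_devices: int) -> int:
    """Sequences contributing to one optimizer step."""
    return micro_batch_size * grad_accum_steps * num_devices


def cosine_lr(step: int, base_lr: float, warmup_steps: int, total_steps: int) -> float:
    """Linear warmup from 0 to base_lr, then cosine decay to 0."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


BASE_LR = 0.0002
WARMUP_STEPS = 10
TOTAL_STEPS = 242      # assumption: step 121 is reported at epoch 0.5
SEQUENCE_LEN = 2048    # from the axolotl config

batch = effective_batch_size(micro_batch_size=4, grad_accum_steps=8, num_devices=8)
print(f"effective batch size: {batch} sequences")
print(f"max tokens per optimizer step: {batch * SEQUENCE_LEN:,}")
for step in (0, 5, 10, 121, 242):
    print(f"step {step:3d}: lr = {cosine_lr(step, BASE_LR, WARMUP_STEPS, TOTAL_STEPS):.6f}")
```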

Training results

Training Loss Epoch Step Validation Loss
2.8213 0.0 1 8.4429
0.9759 0.5 121 2.0992
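
A loss value can be easier to interpret as a perplexity. Assuming the reported validation loss is a mean per-token cross-entropy in nats (the usual convention for causal language model training, but not stated in this card), perplexity is simply the exponential of the loss. The sketch below applies that to the two validation rows and estimates the steps per epoch from the step/epoch pair in the second row.

```python
import math


def perplexity(loss_nats: float) -> float:
    """Perplexity for a mean per-token cross-entropy given in nats."""
    return math.exp(loss_nats)


# (training loss, epoch, step, validation loss), copied from the table above.
RESULTS = [
    (2.8213, 0.0, 1, 8.4429),
    (0.9759, 0.5, 121, 2.0992),
]

for train_loss, epoch, step, val_loss in RESULTS:
    print(f"step {step:3d}: val loss {val_loss:.4f} -> perplexity {perplexity(val_loss):.2f}")

# Rough estimate only: step 121 is reported at epoch 0.5.
_, epoch, step, _ = RESULTS[-1]
steps_per_epoch = round(step / epoch)
print(f"estimated steps per epoch: {steps_per_epoch}")
```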

Framework versions

  • PEFT 0.9.0
  • Transformers 4.40.0.dev0
  • Pytorch 2.1.2+cu121
  • Datasets 2.18.0
  • Tokenizers 0.15.0

Model size: 8.54B params (Safetensors, tensor type BF16)