Preview of the dataset the model was trained on: https://huggingface.co/datasets/manishiitg/aditi-syn-v2

The synthetic dataset (https://huggingface.co/datasets/manishiitg/aditi-syn-v2) and the full data creation pipeline (https://github.com/manishiitg/aditi_dataset) have been open-sourced for transparency and to support further research. The dataset contains Hinglish (mixed Hindi and English) data along with tasks covering tool use, retrieval-augmented generation (RAG), mathematics, and reasoning, all in Hindi.

LMJudge Eval

https://github.com/manishiitg/IndicLMJudge

LLM Judge Language: hi

Model	Language	Score	No. of Questions
mistralai/Mixtral-8x7B-Instruct-v0.1	hi	8.7148	554
Qwen/Qwen1.5-72B-Chat-AWQ	hi	8.3695	554
manishiitg/open-aditi-v6-llama3	hi	8.2659	551
Qwen/Qwen1.5-14B-Chat	hi	8.2404	554
google/gemma-7b-it	hi	7.9152	554
manishiitg/open-aditi-v6-gemma	hi	7.8634	549
Qwen/Qwen1.5-7B-Chat	hi	7.8587	554
manishiitg/open-aditi-hi-v3	hi	7.7644	554
manishiitg/open-aditi-hi-v4	hi	7.6150	554
manishiitg/open-aditi-hi-v2	hi	7.2518	554
teknium/OpenHermes-2.5-Mistral-7B	hi	7.2489	554
ai4bharat/Airavata	hi	6.9468	554
01-ai/Yi-34B-Chat	hi	6.5801	554
manishiitg/open-aditi-hi-v1	hi	4.7022	554
sarvamai/OpenHathi-7B-Hi-v0.1-Base	hi	4.2834	598
Qwen/Qwen1.5-4B-Chat	hi	4.1101	554

LLM Judge Language: en

Model	Language	Score	No. of Questions
Qwen/Qwen1.5-14B-Chat	en	9.1947	356
Qwen/Qwen1.5-72B-Chat-AWQ	en	9.1618	356
Qwen/Qwen1.5-7B-Chat	en	9.1570	356
01-ai/Yi-34B-Chat	en	9.1368	356
mistralai/Mixtral-8x7B-Instruct-v0.1	en	9.1306	356
manishiitg/open-aditi-v6-gemma	en	9.1003	356
teknium/OpenHermes-2.5-Mistral-7B	en	9.0230	356
manishiitg/open-aditi-v6-llama3	en	9.0197	356
manishiitg/open-aditi-hi-v3	en	8.9615	356
manishiitg/open-aditi-hi-v4	en	8.9188	356
google/gemma-7b-it	en	8.8191	356
Qwen/Qwen1.5-4B-Chat	en	8.7500	356
google/gemma-2b-it	en	8.4671	356
manishiitg/open-aditi-hi-v2	en	8.4584	356
ai4bharat/Airavata	en	7.3834	356
manishiitg/open-aditi-hi-v1	en	6.6559	356
sarvamai/OpenHathi-7B-Hi-v0.1-Base	en	5.9567	312
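The Hindi and English judge scores above are averaged over different numbers of questions (for example 549 versus 356 for open-aditi-v6-gemma), so a plain mean of the two scores would overweight the smaller set. The following is a minimal sketch of one way to combine them, weighting each score by its question count. The combined score is not a metric reported by IndicLMJudge; the function name `weighted_scores` and the choice of rows are illustrative assumptions.

```python
from collections import defaultdict

# Rows copied from the two judge tables: (model, language, score, questions).
ROWS = [
    ("manishiitg/open-aditi-v6-gemma", "hi", 7.8634, 549),
    ("manishiitg/open-aditi-v6-gemma", "en", 9.1003, 356),
    ("manishiitg/open-aditi-v6-llama3", "hi", 8.2659, 551),
    ("manishiitg/open-aditi-v6-llama3", "en", 9.0197, 356),
    ("google/gemma-7b-it", "hi", 7.9152, 554),
    ("google/gemma-7b-it", "en", 8.8191, 356),
]


def weighted_scores(rows):
    """Question-weighted mean judge score per model (hypothetical helper)."""
    totals = defaultdict(lambda: [0.0, 0])  # model -> [sum(score * n), sum(n)]
    for model, _lang, score, n_questions in rows:
        totals[model][0] += score * n_questions
        totals[model][1] += n_questions
    return {model: s / n for model, (s, n) in totals.items()}


combined = weighted_scores(ROWS)
# Print models from highest to lowest combined score.
for model, score in sorted(combined.items(), key=lambda kv: -kv[1]):
    print(f"{model}\t{score:.4f}")
```

Weighting by question count treats every judged question equally; a per-language mean would instead treat the two languages equally, which may be preferable when the Hindi score is the main interest.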

DHARMA TINY EVAL

Language Hi

Model	ARC-Easy	bigbench	truthful_qa	BoolQ	winogrande	agieval	ARC-Challenge	MMLU	openbookqa
open-aditi-hi-v2	0.6245	0.4959	0.3866	0.7192	0.5353	0.2945	0.4828	0.3457	0.5279
open-aditi-hi-v3	0.6803	0.4553	0.2788	0.7385	0.5390	0.2178	0.4914	0.3346	0.5688
open-aditi-hi-v4	0.6989	0.4526	0.2714	0.7231	0.5167	0.2331	0.5302	0.3123	0.5316
open-aditi-v6-gemma	0.7212	0.4146	0.3234	0.6923	0.4870	0.2638	0.4957	0.3680	0.4349
open-aditi-v6-llama3	0.5688	0.4119	0.2268	0.6500	0.4498	0.2331	0.4310	0.3420	0.3792
open-aditi-hi-v1	0.4572	0.3767	0.2230	0.6346	0.4647	0.1840	0.3405	0.3271	0.3532
OpenHermes-2.5-Mistral-7B	0.3309	0.4201	0.3197	0.6077	0.4981	0.2331	0.3276	0.3086	0.3086
OpenHathi-7B-Hi-v0.1-Base	0.2862	0.3333	0.5130	0.6077	0.4907	0.2301	0.3017	0.2677	0.1933
Airavata	0.2751	0.1274	0.2268	0.0615	0.3866	0.1104	0.2845	0.1450	0.3383
gemma-7b-it	0.1227	0.0786	0.0743	0.1808	0.1561	0.0491	0.1078	0.0818	0.0855

Language En

Model	ARC-Easy	bigbench	truthful_qa	BoolQ	winogrande	agieval	ARC-Challenge	MMLU	openbookqa
OpenHermes-2.5-Mistral-7B	0.8922	0.5745	0.3197	0.8346	0.6989	0.4908	0.7802	0.5911	0.7621
open-aditi-hi-v2	0.8625	0.5149	0.3532	0.8192	0.6877	0.4571	0.7500	0.5613	0.7732
open-aditi-hi-v4	0.8959	0.5041	0.2862	0.8423	0.6914	0.4571	0.7716	0.5651	0.7138
open-aditi-hi-v3	0.8773	0.4986	0.3048	0.8385	0.6766	0.4663	0.7371	0.5613	0.7249
Qwen1.5-7B-Chat	0.8922	0.5122	0.2007	0.8000	0.6654	0.4294	0.7759	0.5799	0.7621
open-aditi-v6-gemma	0.8699	0.4959	0.2602	0.7385	0.5465	0.4540	0.7371	0.5167	0.6654
open-aditi-v6-llama3	0.8810	0.4634	0.1822	0.7577	0.5353	0.4110	0.7457	0.5688	0.6506
open-aditi-hi-v1	0.8104	0.3902	0.2491	0.6962	0.5539	0.3681	0.6379	0.5056	0.5911
Airavata	0.7026	0.4282	0.3123	0.7192	0.5651	0.3313	0.5172	0.3792	0.5093
OpenHathi-7B-Hi-v0.1-Base	0.4684	0.3062	0.4758	0.6346	0.5167	0.2577	0.3017	0.2788	0.2714
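The Dharma tables report nine per-task scores per model with no aggregate. As a rough sketch, an unweighted (macro) average over the tasks gives a single number for comparison. This average is not part of the original evaluation, and the helper name `macro_average` is an assumption made for illustration; only two rows of the Hindi table are copied here.

```python
TASKS = [
    "ARC-Easy", "bigbench", "truthful_qa", "BoolQ", "winogrande",
    "agieval", "ARC-Challenge", "MMLU", "openbookqa",
]

# Per-task scores copied from the "Language Hi" table, in TASKS order.
HI_SCORES = {
    "open-aditi-v6-gemma": [0.7212, 0.4146, 0.3234, 0.6923, 0.4870,
                            0.2638, 0.4957, 0.3680, 0.4349],
    "gemma-7b-it": [0.1227, 0.0786, 0.0743, 0.1808, 0.1561,
                    0.0491, 0.1078, 0.0818, 0.0855],
}


def macro_average(scores):
    """Unweighted mean over tasks (hypothetical helper)."""
    return sum(scores) / len(scores)


hi_means = {model: macro_average(s) for model, s in HI_SCORES.items()}
for model, mean in hi_means.items():
    print(f"{model}\t{mean:.4f}")
```

A macro average gives every task the same weight regardless of how many questions it contains, so it should be read as a coarse summary only.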

Per-task charts (metric: score) were shown here for BoolQ, ARC-Easy, openbookqa, winogrande, ARC-Challenge, truthful_qa, bigbench, MMLU, and agieval.

See axolotl config

axolotl version: 0.4.0

base_model: google/gemma-7B
model_type: AutoModelForCausalLM
tokenizer_type: AutoTokenizer
tokenizer_config: philschmid/gemma-tokenizer-chatml
tokenizer_use_fast: true

load_in_8bit: false
load_in_4bit: true
strict: false

datasets:
  - path: manishiitg/aditi-syn-train-small-v3
    type: completion
  

# 25 has only synthetic data, and has judge-removed data
hub_model_id: manishiitg/open-aditi-chat-hi-1.25-gemma
hf_use_auth_token: true

wandb_project: open-aditi-chat-hi-1.25-gemma

dataset_prepared_path: manishiitg
push_dataset_to_hub: manishiitg
val_set_size: .1
output_dir: /sky-notebook/manishiitg/open-aditi-chat-hi-1.25-gemma

adapter: qlora
lora_model_dir:
save_safetensors: true

sequence_len: 2048
sample_packing: true
pad_to_sequence_len: true
eval_sample_packing: false

lora_r: 32
lora_alpha: 16
lora_dropout: 0.05
lora_target_linear: true

wandb_entity:
wandb_watch:
wandb_run_id:
wandb_log_model:

gradient_accumulation_steps: 8
micro_batch_size: 4
num_epochs: 1
optimizer: adamw_bnb_8bit
lr_scheduler: cosine
learning_rate: 0.0002

adam_beta2: 0.95
adam_epsilon: 0.00001
max_grad_norm: 1.0

train_on_inputs: false
group_by_length: false
bf16: true
fp16: false
tf32: false


gradient_checkpointing: true
early_stopping_patience:
resume_from_checkpoint:
auto_resume_from_checkpoints: true ## manage checkpoint resume from here
local_rank:
logging_steps: 1
xformers_attention:
flash_attention: true

warmup_steps: 10
evals_per_epoch: 2
eval_table_size:
eval_table_max_new_tokens: 128
save_steps: 20 ## increase based on your dataset
save_strategy: steps
debug:
deepspeed:
weight_decay: 0.0
fsdp:
fsdp_config:
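The config above sets `lora_r: 32` and `lora_alpha: 16`. In standard LoRA the low-rank update is scaled by `lora_alpha / lora_r`, so these values give a scaling of 0.5. The sketch below works through that arithmetic and the number of trainable parameters one adapter adds to a single linear layer. The 3072 x 3072 layer size is a hypothetical example rather than a value taken from the config, and both helper names are illustrative.

```python
def lora_scaling(lora_alpha: float, lora_r: int) -> float:
    """Factor applied to the low-rank update B @ A in standard LoRA."""
    return lora_alpha / lora_r


def lora_extra_params(d_in: int, d_out: int, lora_r: int) -> int:
    """Trainable parameters added to one linear layer.

    A has shape (r, d_in) and B has shape (d_out, r).
    """
    return lora_r * (d_in + d_out)


# Values from the axolotl config.
scaling = lora_scaling(lora_alpha=16, lora_r=32)

# Hypothetical layer size, chosen only to illustrate the count.
extra = lora_extra_params(d_in=3072, d_out=3072, lora_r=32)

print(f"LoRA scaling: {scaling}")
print(f"Extra trainable parameters for one 3072x3072 layer: {extra}")
```

With `lora_target_linear: true` an adapter of this kind is attached to every linear layer, so the total trainable parameter count depends on the shapes of all of those layers.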

open-aditi-chat-hi-1.25-gemma

This model is a fine-tuned version of google/gemma-7B on manishiitg/aditi-syn-train-small-v3, the dataset listed in the axolotl config. It achieves the following results on the evaluation set:

Loss: 2.0992

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.0002
train_batch_size: 4
eval_batch_size: 4
seed: 42
distributed_type: multi-GPU
num_devices: 8
gradient_accumulation_steps: 8
total_train_batch_size: 256
total_eval_batch_size: 32
optimizer: Adam with betas=(0.9,0.95) and epsilon=1e-05
lr_scheduler_type: cosine
lr_scheduler_warmup_steps: 10
num_epochs: 1
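The totals in the list above follow from the per-device values: the training total is the per-device batch size times the number of devices times the gradient accumulation steps, and the evaluation total omits accumulation. A minimal check of that arithmetic, with a helper name chosen for illustration:

```python
def total_batch_size(per_device: int, num_devices: int, grad_accum_steps: int = 1) -> int:
    """Effective number of sequences per optimizer step (hypothetical helper)."""
    return per_device * num_devices * grad_accum_steps


# train_batch_size=4, num_devices=8, gradient_accumulation_steps=8
train_total = total_batch_size(per_device=4, num_devices=8, grad_accum_steps=8)

# eval_batch_size=4, num_devices=8, no accumulation during evaluation
eval_total = total_batch_size(per_device=4, num_devices=8)

print(train_total, eval_total)
```

Both values match the reported `total_train_batch_size: 256` and `total_eval_batch_size: 32`.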

Training results

Training Loss	Epoch	Step	Validation Loss
2.8213	0.0	1	8.4429
0.9759	0.5	121	2.0992
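For context on the rows above, the sketch below shows the usual shape of a cosine learning-rate schedule with linear warmup, using the reported `learning_rate` of 0.0002 and 10 warmup steps. The total of 242 steps is an assumption inferred from step 121 being logged at epoch 0.5 of a single epoch. The epoch value is rounded, so the true step count may differ slightly, and the exact schedule used by the training framework may differ in detail.

```python
import math


def cosine_lr(step: int, total_steps: int, base_lr: float = 2e-4, warmup_steps: int = 10) -> float:
    """Linear warmup to base_lr, then cosine decay to zero (illustrative sketch)."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


TOTAL_STEPS = 242  # assumption: 2 x 121, since step 121 is reported at epoch 0.5

for step in (0, 10, 121, TOTAL_STEPS):
    print(f"step {step:3d}: lr = {cosine_lr(step, TOTAL_STEPS):.6f}")
```

Under these assumptions the learning rate peaks at step 10 and is a little above half of its peak when the step-121 evaluation is logged.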

Framework versions

PEFT 0.9.0
Transformers 4.40.0.dev0
Pytorch 2.1.2+cu121
Datasets 2.18.0
Tokenizers 0.15.0
