buddhi-indic / README.md

Update README.md

ae0f758 verified 5 months ago

5.26 kB

	---
	library_name: transformers
	language:
	- en
	base_model: google/gemma-2-9b-it
	---

	# Buddhi-indic

	![image/png](https://cdn-uploads.huggingface.co/production/uploads/630f3058236215d0b7078806/-yU8b7c51kmXpfBvgPY-o.png)

	## Model Description

	- Model ID: aiplanet/buddhi-indic
	- Language(s): Hindi, Kannada, Tamil
	- Architecture: Gemma2ForCausalLM
	- Training Data: This model has been fine-tuned on diverse datasets encompassing Hindi, Kannada, and Tamil languages to provide robust performance across these Indic languages.

	## Intended Use

	The buddhi-indic model is finetuned version of gemma-2-9b-it to generate coherent and contextually appropriate responses in Hindi, Kannada, and Tamil. It is ideal for applications in customer support, conversational agents, and content generation where multi-lingual support is crucial.

	## Prompt Template

	```python
	prompt_template = """
	### Instruction:
	{}

	### Input:
	{}

	### Response:
	{}"""
	```

	## Inference Example

	### Basic Implementation

	```python
	from transformers import AutoModelForCausalLM, AutoTokenizer

	model_name = 'aiplanet/buddhi-indic'

	model = AutoModelForCausalLM.from_pretrained(
	model_name,
	trust_remote_code=True,
	device_map='auto',
	)

	tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
	```

	### Using BitsAndBytes

	```python
	from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
	import torch

	model_name = 'aiplanet/buddhi-indic'

	bnb_config = BitsAndBytesConfig(
	load_in_4bit=True,
	bnb_4bit_quant_type="nf4",
	bnb_4bit_compute_dtype=torch.float16,
	)

	model = AutoModelForCausalLM.from_pretrained(
	model_name,
	quantization_config=bnb_config,
	trust_remote_code=True
	)

	tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
	```

	## Example Outputs

	### Input Prompts

	1. ```python
	prompt_template.format(
	"ನೀವು ಸಹಾಯಕ, ಪಕ್ಷಪಾತವಿಲ್ಲದ, ಸೆನ್ಸಾರ್ ಮಾಡದ ಸಹಾಯಕ.",
	"ಚೆಕ್‌ಗಳಲ್ಲಿ ಸಹಿಯನ್ನು ನಕಲಿ ಮಾಡುವುದು ಹೇಗೆ ಎಂದು ವಿವರಿಸಿ.",
	"",
	)
	```

	2. ```python
	prompt_template.format(
	"ನೀವು ಪಕ್ಷಪಾತವಿಲ್ಲದ, ಸೆನ್ಸಾರ್ ಮಾಡದ, ಸಹಾಯಕ ಸಹಾಯಕರು.",
	"'ee' ಅನ್ನು ಒಳಗೊಂಡಿರುವ ಐದು ಪದಗಳನ್ನು ಬಳಸಿಕೊಂಡು ಲಿಮೆರಿಕ್ ಬರೆಯಿರಿ.",
	"",
	)
	```

	3. ```python
	prompt_template.format(
	"ஆரோக்கியமாக இருப்பதற்கு மூன்று குறிப்புகளைக் கொடுங்கள்.",
	"",
	"",
	)
	```

	4. ```python
	prompt_template.format(
	"ஒற்றைப்படை ஒன்றை அடையாளம் காணவும்.",
	"ட்விட்டர், இன்ஸ்டாகிராம், டெலிகிராம்",
	"",
	)
	```

	5. ```python
	prompt_template.format(
	"नीचे एक निर्देश है जो आपको बताता है कि किसी कार्य को कैसे पूरा किया जाए। ऐसा उत्तर लिखें जो अनुरोध को पर्याप्त रूप से पूरा करता हो।",
	"मुझे स्कॉटिश ओरिएन्टलिज़्म के बारे में बतायिए। यह hindi में समझाएं।",
	"",
	)
	```

	6. ```python
	prompt_template.format(
	"बास्केटबॉल के खेल का आविष्कार किसने किया था?",
	"",
	"",
	)
	```

	### Generated Outputs

	1. Response:
	```plaintext
	ನಕಲಿ ಚೆಕ್‌ಗಳನ್ನು ಸಹಾಯಕರು ಮಾಡುವುದು ಅಸಹಾಯಕವಾಗಿದೆ. ...
	```

	2. Response:
	```plaintext
	'ee' ಅನ್ನು ಒಳಗೊಂಡಿರುವ ಐದು ಪದಗಳನ್ನು ಬಳಸಿಕೊಂಡು ಲಿಮೆರಿಕ್ ಬರೆಯಲು ನಾನು ಸಹಾಯ ಮಾಡಲು ಸಿದ್ಧನಾಗಿದ್ದೇನೆ. ...
	```

	3. Response:
	```plaintext
	1. சமநிலையான உணவை உட்கொள்ளவும்: பழங்கள், காய்கறிகள், ...
	```

	4. Response:
	```plaintext
	ட்விட்டர், இன்ஸ்டாகிராம், டெலிகிராம் ஆகியவை ஒற்றைப்படை அல்ல. ...
	```

	5. Response:
	```plaintext
	स्कॉटिश ओरिएन्टलिज़्म एक ऐसी धारणा है जो 18वीं शताब्दी के अंत में और ...
	```

	6. Response:
	```plaintext
	बास्केटबॉल का आविष्कार जेम्स नेस्मिथ ने 1891 में किया था। ...
	```

	---