# Malaysian Llama2 Sentiment Analysis Model (GGUF Version)

## Overview

This repository contains a GGUF (GPT-Generated Unified Format) version of the [kaiimran/malaysian-llama2-7b-32k-instructions-lora-sentiment-analysis-v2](https://huggingface.co/kaiimran/malaysian-llama2-7b-32k-instructions-lora-sentiment-analysis-v2) model, specifically adapted for sentiment analysis of Malay text from social media. The GGUF version allows for efficient inference on a wide range of platforms and devices.

## Model Details

- **Original Model**: [kaiimran/malaysian-llama2-7b-32k-instructions-lora-sentiment-analysis-v2](https://huggingface.co/kaiimran/malaysian-llama2-7b-32k-instructions-lora-sentiment-analysis-v2)
- **Base Model**: [mesolitica/malaysian-llama2-7b-32k-instructions-v2](https://huggingface.co/mesolitica/malaysian-llama2-7b-32k-instructions-v2)
  - A full-parameter fine-tune of Llama2-7B with a 32k context length on a Malaysian instructions dataset.
  - The base model uses the exact Llama2 chat template.
- **Fine-tuning Dataset**: [kaiimran/malaysia-tweets-sentiment](https://huggingface.co/datasets/kaiimran/malaysia-tweets-sentiment)
- **Fine-tuning Process**: Based on the tutorial available [here](https://colab.research.google.com/drive/1AZghoNBQaMDgWJpi4RbffGM1h6raLUj9?usp=sharing)

## Usage

### Prompt

The prompt (in Malay) asks the model to classify the text between the dash marks as a whole and to answer with a single word, "positif" or "negatif". Replace the placeholder line under `### Teks:` with the text, tweet, or sentence you want to analyze:

```
Lakukan analisis sentimen bagi teks di dalam tanda sempang berikut.
———
### Teks:
tulis teks, tweet, atau ayat yang anda ingin analisa di ruangan ini.
———
Kenal pasti sama ada teks ini secara keseluruhannya mengandungi sentimen positif atau negatif. Jawab dengan hanya satu perkataan: "positif" atau "negatif".
Sentimen:
```

Example:

```
Lakukan analisis sentimen bagi teks di dalam tanda sempang berikut.
———
### Teks:
alhamdulillah terima kasih sis support saya 🥹
———
Kenal pasti sama ada teks ini secara keseluruhannya mengandungi sentimen positif atau negatif. Jawab dengan hanya satu perkataan: "positif" atau "negatif".
Sentimen:
```

### 1. Using with llama.cpp

1. Clone the llama.cpp repository and build it:

   ```
   git clone https://github.com/ggerganov/llama.cpp.git
   cd llama.cpp
   make
   ```

2. Download the GGUF model file from this repository.

3. Run inference using the following command:

   ```
   ./main -m path/to/your/model.gguf -n 256 --repeat_penalty 1.0 --color -i -r "User:" -f prompts/chat-with-bob.txt
   ```

   **Replace `path/to/your/model.gguf` with the actual path to the downloaded GGUF file.** If you prefer calling the model from Python rather than the interactive CLI, see the llama-cpp-python sketch below.

### 2. Using with UI-based Systems

This GGUF model can be used with various UI-based systems for an easier, more user-friendly experience:

1. **GPT4All**:
   - Download GPT4All from [https://gpt4all.io/](https://gpt4all.io/)
   - In the application, go to "Model Explorer"
   - Click on "Add your own GGUF model"
   - Select the downloaded GGUF file
   - Start chatting with the model

2. **Jan.AI**:
   - Download Jan.AI from [https://jan.ai/](https://jan.ai/)
   - In the application, go to the Models section
   - Click on "Add Model" and select "Import local model"
   - Choose the downloaded GGUF file
   - Once imported, you can start using the model in conversations

3. **Ollama**:
   - Install Ollama from [https://ollama.ai/](https://ollama.ai/)
   - Create a custom model file (e.g., `malaysian-sentiment.Ollama`) with the following content:

     ```
     FROM /path/to/your/model.gguf
     ```

   - **Replace `/path/to/your/model.gguf` with the actual path to the downloaded GGUF file.**
   - Run the command: `ollama create malaysian-sentiment -f malaysian-sentiment.Ollama`
   - Start chatting with: `ollama run malaysian-sentiment`. You can also query the served model from Python, as shown in the Ollama API sketch below.
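Once the model has been created in Ollama, you can query it programmatically over Ollama's local REST API instead of the interactive `ollama run` prompt. This is a minimal sketch, assuming Ollama is serving on its default port (11434), the model was created under the name `malaysian-sentiment` as in the steps above, and the `requests` package is installed:

```python
import requests  # pip install requests

# Sentiment prompt from the "Prompt" section above
prompt = """Lakukan analisis sentimen bagi teks di dalam tanda sempang berikut.
———
### Teks:
alhamdulillah terima kasih sis support saya
———
Kenal pasti sama ada teks ini secara keseluruhannya mengandungi sentimen positif atau negatif. Jawab dengan hanya satu perkataan: "positif" atau "negatif".
Sentimen: """

# Ollama's generate endpoint; assumes `ollama serve` is running locally
resp = requests.post(
    "http://localhost:11434/api/generate",
    json={"model": "malaysian-sentiment", "prompt": prompt, "stream": False},
    timeout=120,
)
resp.raise_for_status()
print(resp.json()["response"].strip())  # expected: "positif" or "negatif"
```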
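Alternatively, the GGUF file can be loaded directly in Python with the llama-cpp-python bindings, without building the llama.cpp CLI from section 1. A minimal sketch; the model path is a placeholder for wherever you saved the downloaded file, and the sampling settings (greedy decoding, a handful of new tokens) are illustrative choices, not values shipped with this repository:

```python
from llama_cpp import Llama  # pip install llama-cpp-python

# Load the downloaded GGUF file (placeholder path -- adjust to your setup)
llm = Llama(model_path="path/to/your/model.gguf", n_ctx=4096, verbose=False)

# Fill the prompt template from the "Prompt" section above
template = """Lakukan analisis sentimen bagi teks di dalam tanda sempang berikut.
———
### Teks:
{}
———
Kenal pasti sama ada teks ini secara keseluruhannya mengandungi sentimen positif atau negatif. Jawab dengan hanya satu perkataan: "positif" atau "negatif".
Sentimen: """

prompt = template.format("alhamdulillah terima kasih sis support saya")

# Greedy decoding; the answer should be a single word ("positif" or "negatif")
out = llm(prompt, max_tokens=8, temperature=0.0)
print(out["choices"][0]["text"].strip())
```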
### 3. Using directly with Python (unsloth library)

For those who prefer using Python, you can use the following code to load and run inference with the model. Note that this loads the original LoRA fine-tuned model from Hugging Face rather than the GGUF file, so it requires a CUDA GPU:

```python
from unsloth import FastLanguageModel

# Model configuration
max_seq_length = 4096  # Sequence length for this session (the 32k base model supports much longer contexts)
dtype = None  # Auto-detection (Float16 for Tesla T4, V100; Bfloat16 for Ampere+)
load_in_4bit = True  # Use 4-bit quantization to reduce memory usage

# Load the model
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="kaiimran/malaysian-llama2-7b-32k-instructions-lora-sentiment-analysis-v2",
    max_seq_length=max_seq_length,
    dtype=dtype,
    load_in_4bit=load_in_4bit,
)

# Enable faster inference
FastLanguageModel.for_inference(model)

# Prepare the prompt template
alpaca_prompt = """Lakukan analisis sentimen bagi teks di dalam tanda sempang berikut.
———
### Teks:
{}
———
Kenal pasti sama ada teks ini secara keseluruhannya mengandungi sentimen positif atau negatif. Jawab dengan hanya satu perkataan: "positif" atau "negatif".
Sentimen: {}"""

# Example tweet for analysis
tweet = """
alhamdulillah terima kasih sis support saya ☺️ semoga sis dimurahkan rezeki dipanjangkan usia dan dipermudahkan segala urusan https://t.co/nSfNPGpiW8
"""

# Tokenize input
inputs = tokenizer(
    [alpaca_prompt.format(tweet, "")],
    return_tensors="pt"
).to("cuda")

# Generate output
outputs = model.generate(**inputs, max_new_tokens=10, use_cache=True)

# Print result
print(tokenizer.batch_decode(outputs)[0])
```

## Notes

- This model is specifically trained for sentiment analysis of Malay text from social media.
- The base model extends Llama2's context length to 32k tokens using RoPE scaling; the Python example above sets `max_seq_length` to 4096, which is more than enough for tweets.
- In the unsloth example, 4-bit quantization is used by default to reduce memory usage, but this can be adjusted.
- The GGUF format allows for efficient inference on various platforms and devices.

## Contributing

Feel free to open issues or submit pull requests if you have suggestions for improvements or encounter any problems.

## Acknowledgements

- Thanks to the creators of the base model and the Malaysian tweets sentiment dataset.
- This project was inspired by and follows the methodology outlined in [this tutorial](https://colab.research.google.com/drive/1AZghoNBQaMDgWJpi4RbffGM1h6raLUj9?usp=sharing).
- Thanks also to the developers of llama.cpp, GPT4All, Jan.AI, and Ollama for providing user-friendly ways for non-coders to run GGUF models.