# Malaysian Llama2 Sentiment Analysis Model (GGUF Version)

## Overview

This repository contains a GGUF (GPT-Generated Unified Format) version of the [kaiimran/malaysian-llama2-7b-32k-instructions-lora-sentiment-analysis-v2](https://huggingface.co/kaiimran/malaysian-llama2-7b-32k-instructions-lora-sentiment-analysis-v2) model, specifically adapted for sentiment analysis of Malay text from social media. The GGUF version allows for efficient inference on a wide range of platforms and devices.

## Model Details

- **Original Model**: [kaiimran/malaysian-llama2-7b-32k-instructions-lora-sentiment-analysis-v2](https://huggingface.co/kaiimran/malaysian-llama2-7b-32k-instructions-lora-sentiment-analysis-v2)
- **Base Model**: [mesolitica/malaysian-llama2-7b-32k-instructions-v2](https://huggingface.co/mesolitica/malaysian-llama2-7b-32k-instructions-v2)
  - A full-parameter fine-tune of Llama2-7B with a 32k context length on a Malaysian instructions dataset.
  - The base model uses the exact Llama2 chat template.
- **Fine-tuning Dataset**: [kaiimran/malaysia-tweets-sentiment](https://huggingface.co/datasets/kaiimran/malaysia-tweets-sentiment)
- **Fine-tuning Process**: Based on the tutorial available [here](https://colab.research.google.com/drive/1AZghoNBQaMDgWJpi4RbffGM1h6raLUj9?usp=sharing)

## Usage

### Prompt

The prompt (in Malay) asks the model to classify the text between the dash marks as a whole and to answer with a single word, "positif" or "negatif". Replace the placeholder line under `### Teks:` with the text, tweet, or sentence you want to analyze:

```
Lakukan analisis sentimen bagi teks di dalam tanda sempang berikut.
———
### Teks:
tulis teks, tweet, atau ayat yang anda ingin analisa di ruangan ini.
———
Kenal pasti sama ada teks ini secara keseluruhannya mengandungi sentimen positif atau negatif. Jawab dengan hanya satu perkataan: "positif" atau "negatif".
Sentimen:
```

Example:

```
Lakukan analisis sentimen bagi teks di dalam tanda sempang berikut.
———
### Teks:
alhamdulillah terima kasih sis support saya 🥹
———
Kenal pasti sama ada teks ini secara keseluruhannya mengandungi sentimen positif atau negatif. Jawab dengan hanya satu perkataan: "positif" atau "negatif".
Sentimen:
```

### 1. Using with llama.cpp

1. Clone the llama.cpp repository and build it:

   ```
   git clone https://github.com/ggerganov/llama.cpp.git
   cd llama.cpp
   make
   ```

2. Download the GGUF model file from this repository.

3. Run inference using the following command:

   ```
   ./main -m path/to/your/model.gguf -n 256 --repeat_penalty 1.0 --color -i -r "User:" -f prompts/chat-with-bob.txt
   ```

   **Replace `path/to/your/model.gguf` with the actual path to the downloaded GGUF file.** If you prefer calling the model from Python rather than the interactive CLI, see the llama-cpp-python sketch below.

### 2. Using with UI-based Systems

This GGUF model can be used with various UI-based systems for an easier, more user-friendly experience:

1. **GPT4All**:
   - Download GPT4All from [https://gpt4all.io/](https://gpt4all.io/)
   - In the application, go to "Model Explorer"
   - Click on "Add your own GGUF model"
   - Select the downloaded GGUF file
   - Start chatting with the model

2. **Jan.AI**:
   - Download Jan.AI from [https://jan.ai/](https://jan.ai/)
   - In the application, go to the Models section
   - Click on "Add Model" and select "Import local model"
   - Choose the downloaded GGUF file
   - Once imported, you can start using the model in conversations

3. **Ollama**:
   - Install Ollama from [https://ollama.ai/](https://ollama.ai/)
   - Create a custom model file (e.g., `malaysian-sentiment.Ollama`) with the following content:

     ```
     FROM /path/to/your/model.gguf
     ```

   - **Replace `/path/to/your/model.gguf` with the actual path to the downloaded GGUF file.**
   - Run the command: `ollama create malaysian-sentiment -f malaysian-sentiment.Ollama`
   - Start chatting with: `ollama run malaysian-sentiment`. You can also query the served model from Python, as shown in the Ollama API sketch below.
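Once the model has been created in Ollama, you can query it programmatically over Ollama's local REST API instead of the interactive `ollama run` prompt. This is a minimal sketch, assuming Ollama is serving on its default port (11434), the model was created under the name `malaysian-sentiment` as in the steps above, and the `requests` package is installed:

```python
import requests  # pip install requests

# Sentiment prompt from the "Prompt" section above
prompt = """Lakukan analisis sentimen bagi teks di dalam tanda sempang berikut.
———
### Teks:
alhamdulillah terima kasih sis support saya
———
Kenal pasti sama ada teks ini secara keseluruhannya mengandungi sentimen positif atau negatif. Jawab dengan hanya satu perkataan: "positif" atau "negatif".
Sentimen: """

# Ollama's generate endpoint; assumes `ollama serve` is running locally
resp = requests.post(
    "http://localhost:11434/api/generate",
    json={"model": "malaysian-sentiment", "prompt": prompt, "stream": False},
    timeout=120,
)
resp.raise_for_status()
print(resp.json()["response"].strip())  # expected: "positif" or "negatif"
```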
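Alternatively, the GGUF file can be loaded directly in Python with the llama-cpp-python bindings, without building the llama.cpp CLI from section 1. A minimal sketch; the model path is a placeholder for wherever you saved the downloaded file, and the sampling settings (greedy decoding, a handful of new tokens) are illustrative choices, not values shipped with this repository:

```python
from llama_cpp import Llama  # pip install llama-cpp-python

# Load the downloaded GGUF file (placeholder path -- adjust to your setup)
llm = Llama(model_path="path/to/your/model.gguf", n_ctx=4096, verbose=False)

# Fill the prompt template from the "Prompt" section above
template = """Lakukan analisis sentimen bagi teks di dalam tanda sempang berikut.
———
### Teks:
{}
———
Kenal pasti sama ada teks ini secara keseluruhannya mengandungi sentimen positif atau negatif. Jawab dengan hanya satu perkataan: "positif" atau "negatif".
Sentimen: """

prompt = template.format("alhamdulillah terima kasih sis support saya")

# Greedy decoding; the answer should be a single word ("positif" or "negatif")
out = llm(prompt, max_tokens=8, temperature=0.0)
print(out["choices"][0]["text"].strip())
```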
### 3. Using directly with Python (unsloth library)

For those who prefer using Python, you can use the following code to load and run inference with the model. Note that this loads the original LoRA fine-tuned model from Hugging Face rather than the GGUF file, so it requires a CUDA GPU:

```python
from unsloth import FastLanguageModel

# Model configuration
max_seq_length = 4096  # Sequence length for this session (the 32k base model supports much longer contexts)
dtype = None  # Auto-detection (Float16 for Tesla T4, V100; Bfloat16 for Ampere+)
load_in_4bit = True  # Use 4-bit quantization to reduce memory usage

# Load the model
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="kaiimran/malaysian-llama2-7b-32k-instructions-lora-sentiment-analysis-v2",
    max_seq_length=max_seq_length,
    dtype=dtype,
    load_in_4bit=load_in_4bit,
)

# Enable faster inference
FastLanguageModel.for_inference(model)

# Prepare the prompt template
alpaca_prompt = """Lakukan analisis sentimen bagi teks di dalam tanda sempang berikut.
———
### Teks:
{}
———
Kenal pasti sama ada teks ini secara keseluruhannya mengandungi sentimen positif atau negatif. Jawab dengan hanya satu perkataan: "positif" atau "negatif".
Sentimen: {}"""

# Example tweet for analysis
tweet = """
alhamdulillah terima kasih sis support saya ☺️ semoga sis dimurahkan rezeki dipanjangkan usia dan dipermudahkan segala urusan https://t.co/nSfNPGpiW8
"""

# Tokenize input
inputs = tokenizer(
    [alpaca_prompt.format(tweet, "")],
    return_tensors="pt"
).to("cuda")

# Generate output
outputs = model.generate(**inputs, max_new_tokens=10, use_cache=True)

# Print result
print(tokenizer.batch_decode(outputs)[0])
```

## Notes

- This model is specifically trained for sentiment analysis of Malay text from social media.
- The base model extends Llama2's context length to 32k tokens using RoPE scaling; the Python example above sets `max_seq_length` to 4096, which is more than enough for tweets.
- In the unsloth example, 4-bit quantization is used by default to reduce memory usage, but this can be adjusted.
- The GGUF format allows for efficient inference on various platforms and devices.

## Contributing

Feel free to open issues or submit pull requests if you have suggestions for improvements or encounter any problems.

## Acknowledgements

- Thanks to the creators of the base model and the Malaysian tweets sentiment dataset.
- This project was inspired by and follows the methodology outlined in [this tutorial](https://colab.research.google.com/drive/1AZghoNBQaMDgWJpi4RbffGM1h6raLUj9?usp=sharing).
- Thanks also to the developers of llama.cpp, GPT4All, Jan.AI, and Ollama for providing user-friendly ways for non-coders to run GGUF models.