|
--- |
|
base_model: unsloth/llama-3-8b-bnb-4bit |
|
language: |
|
- en |
|
license: apache-2.0 |
|
tags: |
|
- text-generation-inference |
|
- transformers |
|
- unsloth |
|
- llama |
|
- trl |
|
- sft |
|
--- |
|
|
|
--- |
|
{} |
|
--- |
|
|
|
# iLLAMA: LLM for App Issue Detection and Prioritization Obtained by Fine-Tuning LLAMA 3 |
|
|
|
This repository contains a fine-tuned version of LLAMA 3 using the Unsloth framework and the vitormesaque/irisk dataset. The model is designed for detecting issues in text data. |
|
|
|
## Model Details |
|
|
|
- **Developed by:** [Vitor Mesaque](https://huggingface.co/vitormesaque) |
|
- **Model type:** App Issue Detection Model |
|
- **Language:** English |
|
- **License:** apache-2.0 |
|
- **Finetuned from model :** unsloth/llama-3-8b-bnb-4bit |
|
- **Datasets:** [vitormesaque/irisk](https://huggingface.co/datasets/vitormesaque/irisk) |
|
|
|
|
|
The vitormesaque/irisk dataset was obtained through the knowledge base of the MApp-IDEA research project. |
|
|
|
## Model Usage |
|
|
|
|
|
### How to Get Started with the Model |
|
|
|
Use the code below to get started with the model: |
|
|
|
```python |
|
|
|
# Load model directly |
|
from transformers import AutoTokenizer, AutoModelForCausalLM |
|
|
|
tokenizer = AutoTokenizer.from_pretrained("vitormesaque/i-llama") |
|
model = AutoModelForCausalLM.from_pretrained("vitormesaque/i-llama") |
|
|
|
``` |
|
## Usage |
|
|
|
```python |
|
|
|
|
|
FastLanguageModel.for_inference(model) # Enable native 2x faster inference |
|
inputs = tokenizer( |
|
[ |
|
irisk_prompt.format( |
|
"Extract issues from the user review in JSON format. For each issue, provide label, functionality, severity (1-5), likelihood (1-5), category (Bug, User Experience, Performance, Security, Compatibility, Functionality, UI, Connectivity, Localization, Accessibility, Data Handling, Privacy, Notifications, Account Management, Payment, Content Quality, Support, Updates, Syncing, Customization), and the sentence.", # instruction |
|
"I used to love this app, but now it's become frustrating as hell. We can't see lyrics, we can't CHOOSE WHAT SONG WE WANT TO LISTEN TO, we can't skip a song more than a few times, there are ads after every two songs, and all in all it's a horrible overrated app. If I could give this 0 stars, I would.", # input |
|
"", # output - leave this blank for generation! |
|
) |
|
], return_tensors = "pt").to("cuda") |
|
|
|
|
|
from transformers import TextStreamer |
|
text_streamer = TextStreamer(tokenizer) |
|
_ = model.generate(**inputs, streamer = text_streamer, max_new_tokens = 512) |
|
|
|
|
|
``` |
|
### Evaluation |
|
|
|
The model was evaluated using a separate portion of the vitormesaque/irisk dataset. |
|
|
|
## Bias, Risks, and Limitations |
|
|
|
While the model is effective in detecting issues, it may exhibit biases present in the training data. Users should be aware of these potential biases and consider them when interpreting results. |
|
|
|
### Recommendations |
|
|
|
Users should conduct additional evaluations in the specific context of use to ensure reliability and fairness. |
|
|
|
## Citation |
|
|
|
If you use this model in your research, please cite it as follows: |
|
|
|
**BibTeX:** |
|
|
|
```bibtex |
|
@misc{vitormesaque2024llama3, |
|
author = {Vitor Mesaque Alves de Lima}, |
|
title = {iLLAMA: LLM for App Issue Detection and Prioritization Obtained by Fine-Tuning LLAMA 3}, |
|
year = {2024}, |
|
url = {https://huggingface.co/vitormesaque} |
|
} |
|
``` |
|
|
|
**APA:** |
|
|
|
Mesaque, V. (2024). LLAMA 3 fine-tuned with Unsloth and vitormesaque/irisk dataset. Retrieved from https://huggingface.co/vitormesaque |
|
|
|
## License |
|
|
|
This model is licensed under the MIT License. |
|
|
|
## Contact |
|
|
|
For questions or comments, please contact [Vitor Mesaque](https://huggingface.co/vitormesaque). |
|
|
|
|
|
|