---
license: llama2
datasets:
- seeweb/Seeweb-it-292-forLLM
language:
- it
---

# Model Card for seeweb/SeewebLLM-it

The model is a fine-tuned version of [Llama-2-7b-chat-hf](https://huggingface.co/meta-llama/Llama-2-7b-chat-hf), specialized for the Italian language.

- **Backbone model:** [Llama 2](https://github.com/facebookresearch/llama/tree/main)
- **Language(s):** Italian
- **Fine-tuned from model:** [Llama-2-7b-chat-hf](https://huggingface.co/meta-llama/Llama-2-7b-chat-hf)
- **Contributors:** [Lorenzo Rocchi](https://huggingface.co/itsrocchi) @ [Seeweb](https://www.seeweb.it/)

## Bias, Risks, and Limitations

The model's output is not guaranteed to be correct; generated sentences may contain errors.

### Training script

The following repository contains the scripts and instructions used for fine-tuning and testing:

**[https://github.com/itsrocchi/finetuning-llama2-ita.git](https://github.com/itsrocchi/finetuning-llama2-ita.git)**

### Inference and comparison with Llama 2

Here is a short Python snippet to perform inference:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer

tokenizer = AutoTokenizer.from_pretrained("itsrocchi/SeewebLLM-it-ver2")
model = AutoModelForCausalLM.from_pretrained(
    "itsrocchi/SeewebLLM-it-ver2",
    device_map="auto",
    torch_dtype=torch.float16,
    load_in_8bit=True,  # requires the bitsandbytes package
    rope_scaling={"type": "dynamic", "factor": 2.0},  # some transformers versions reject an int factor
)

# Optionally, replace the model ID passed to the model and tokenizer
# with the absolute path of a local directory containing the model.

# Edit the text between "### User:" and "### Assistant:" to customize the prompt.
prompt = "### User:\nDescrivi cos' è l'intelligenza artificiale\n\n### Assistant:\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
# Stream tokens to stdout as they are generated, omitting the prompt.
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)

# max_new_tokens should be a finite integer; 512 is an arbitrary cap, raise it for longer answers.
output = model.generate(**inputs, streamer=streamer, use_cache=True, max_new_tokens=512)
output_text = tokenizer.decode(output[0], skip_special_tokens=True)
```
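
The snippet above hard-codes the prompt, and `output_text` contains the prompt followed by the answer. The sketch below shows one possible way to build the same `### User:` / `### Assistant:` template from an arbitrary question and to keep only the answer part of the decoded text. The helper names `build_prompt` and `extract_answer` are hypothetical and are not part of the model or of `transformers`.

```python
def build_prompt(user_message: str) -> str:
    """Wrap a user message in the "### User:" / "### Assistant:" template."""
    return f"### User:\n{user_message.strip()}\n\n### Assistant:\n"


def extract_answer(generated_text: str) -> str:
    """Return the text after the last "### Assistant:" marker.

    If the marker is missing, the whole text is returned (stripped).
    """
    marker = "### Assistant:"
    _, _, answer = generated_text.rpartition(marker)
    return answer.strip()


# Example: build a prompt that can be passed to the tokenizer.
prompt = build_prompt("Descrivi cos'è l'intelligenza artificiale")
print(repr(prompt))
```

With these helpers, `extract_answer(output_text)` would return only the model's reply, assuming the decoded text still contains the `### Assistant:` marker.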

#### Output of the fine-tuned model:

```
L'intelligenza artificiale è un campo di studio che si occupa di creare sistemi che possono imparare, raffinare e prendere decisioni come gli esseri umani. Questi sistemi possono essere utilizzati in molte applicazioni, come la robotica, la visione artificiale, il machine learning e l'elaborazione del linguaggio naturale.
```

#### Output of Llama 2:

```
Certainly, I'd be happy to help you understand what artificial intelligence (AI) is!
Artificial intelligence (AI) refers to the development of computer systems that can perform tasks that typically require human intelligence, such as learning, problem-solving, decision-making, and perception. AI systems use algorithms and machine learning techniques to analyze data, make predictions, and take actions based on that data.
There are several types of AI, including:

Narrow or weak AI: This type of AI is designed to perform a specific task, such as playing chess or recognizing faces. Narrow AI is the most common form of AI and is used in many applications, such as virtual assistants, language translation, and image recognition.
General or strong AI: This type of AI is designed to perform any intellectual task that a human can. General AI has the potential to revolutionize many industries, including healthcare, finance, and education.
Superintelligence: This type of AI is significantly more intelligent than the best human minds. Superintelligence could potentially solve complex problems that are currently unsolvable, but it also raises concerns about safety and control.
Some examples of AI include:
Self-driving cars: These vehicles use a combination of sensors and machine learning algorithms to navigate roads and avoid obstacles.
Virtual assistants: These AI-powered tools, such as Amazon's Alexa and Google Assistant, can perform tasks such as scheduling appointments, setting reminders, and controlling smart home devices.
Language translation: AI systems can translate text and speech in real-time, allowing for more effective communication between people who speak different languages.
It's important to note that AI is not a single entity, but rather a rapidly evolving field that involves many different techniques and technologies. As AI continues to advance, it has the potential to transform many aspects of our lives, from the way we work and interact with each other to the way we make decisions and solve problems.
```

#### Conclusion

Llama 2 understands the question and gives a more detailed and, overall, better answer than the fine-tuned model, but it replies in English. The fine-tuned model, by contrast, answers in well-formed Italian, which is what this fine-tuning process was meant to achieve.

### Training Data and Details

The dataset used is [seeweb/Seeweb-it-292-forLLM](https://huggingface.co/datasets/seeweb/Seeweb-it-292-forLLM), which contains approximately 300 Italian prompt-answer conversations.

Training was performed on an NVIDIA RTX A6000 GPU on [Seeweb's Cloud Server GPU](https://www.seeweb.it/en/products/cloud-server-gpu).
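
As a rough illustration of how prompt-answer pairs like those in the dataset could be turned into training text, the sketch below formats each pair with the same `### User:` / `### Assistant:` template used at inference time. This is an assumption, not the documented training procedure: the column names are passed in by the caller because the dataset schema is not described here, `split="train"` is assumed, and `format_example` and `load_training_texts` are hypothetical helpers. The actual scripts are in the training repository linked above.

```python
def format_example(question: str, answer: str) -> str:
    """Join one prompt-answer pair into a single training string (assumed template)."""
    return f"### User:\n{question.strip()}\n\n### Assistant:\n{answer.strip()}"


def load_training_texts(question_column: str, answer_column: str) -> list:
    """Download the dataset and format every row.

    The column names are supplied by the caller; check the dataset viewer
    for the real schema before using this.
    """
    # Imported lazily so format_example works without `datasets` installed.
    from datasets import load_dataset

    dataset = load_dataset("seeweb/Seeweb-it-292-forLLM", split="train")  # split name is an assumption
    return [format_example(row[question_column], row[answer_column]) for row in dataset]


# Example with a hand-written pair (not taken from the dataset).
print(format_example("Qual è la capitale d'Italia?", "La capitale d'Italia è Roma."))
```

If the dataset already stores each conversation as a single pre-formatted text field, the formatting step is unnecessary and the rows can be used directly.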

### What next?

The model must be improved: a much larger dataset is needed so that the model can learn many more ways to answer.