---
base_model: mistralai/Mistral-7B-v0.1
---

# Model Card for TurkishWikipedia-LLM-7b-base

**Library name:** peft

**Base model:** mistralai/Mistral-7B-v0.1

**Model Description:**

This model was fine-tuned on Turkish Wikipedia text using the PEFT library with a LoRA configuration.

Training has completed 40% of the first epoch, with a loss value of 1.30.

**Developed by:** [More Information Needed]

**Funded by [optional]:** [More Information Needed]

**Shared by [optional]:** [More Information Needed]

**Model type:** Fine-tuned language model

**Language(s) (NLP):** Turkish

**License:** [More Information Needed]

**Finetuned from model:** mistralai/Mistral-7B-v0.1

**Model Sources:**

- **Repository:** [More Information Needed]
- **Paper [optional]:** [More Information Needed]
- **Demo [optional]:** [To be implemented]

## Uses

**Direct Use**

This model can be used for various NLP tasks, including:

- Text generation
- Machine translation
- Question answering
- Text summarization

**Downstream Use**

[More Information Needed]

## Bias, Risks, and Limitations

- **Bias:** The model may inherit biases from its training data, Turkish Wikipedia, including cultural biases and biases in how information is presented on Wikipedia.
- **Risks:** The model may generate text that is offensive, misleading, or factually incorrect. Be aware of these risks and use the model responsibly.
- **Limitations:** The model may not perform well on all tasks, and its output may lack creativity or originality.

## Recommendations

- Users (both direct and downstream) should be aware of the risks, biases, and limitations of the model.
- Evaluate the model's outputs carefully before using them in any application.

## How to Get Started with the Model

The following code snippet demonstrates how to load the fine-tuned model and generate text:

```python
from transformers import AutoModelForCausalLM, LlamaTokenizer, pipeline

# Load the model and tokenizer
folder = "cenkersisman/TurkishWikipedia-LLM-7b-base"
device = "cuda"
model = AutoModelForCausalLM.from_pretrained(folder).to(device)
tokenizer = LlamaTokenizer.from_pretrained(folder)

# Create a pipeline for text generation
# (the model is already on the GPU, so no separate device_map is needed)
pipe = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    max_new_tokens=128,
    return_full_text=True,
    repetition_penalty=1.1,
)

# Generate text
def generate_output(user_query):
    outputs = pipe(user_query, do_sample=True, temperature=0.1, top_k=10, top_p=0.9)
    return outputs[0]["generated_text"]

# Example usage (Turkish prompt: "brazil is the world's largest ... in terms of population")
user_query = "brezilya'nın nüfus olarak dünyanın en büyük"
output = generate_output(user_query)
print(output)
```

This code loads the fine-tuned model from cenkersisman/TurkishWikipedia-LLM-7b-base, creates a text-generation pipeline, and generates text from the provided user query.
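
If GPU memory is limited, the model can also be loaded in 4-bit precision. The snippet below is an optional sketch using the bitsandbytes integration in transformers; it is not part of the card's original instructions and assumes the bitsandbytes package is installed.

```python
# Optional sketch: load the model in 4-bit precision to reduce GPU memory use.
import torch
from transformers import AutoModelForCausalLM, LlamaTokenizer, BitsAndBytesConfig

folder = "cenkersisman/TurkishWikipedia-LLM-7b-base"
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
)
model = AutoModelForCausalLM.from_pretrained(
    folder,
    quantization_config=bnb_config,
    device_map="auto",
)
tokenizer = LlamaTokenizer.from_pretrained(folder)
```

The quantized model can then be passed to the same text-generation pipeline shown above.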

## Training Details

**Training Data**

- 9 million sentences from Turkish Wikipedia.
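
The card does not state how the Turkish Wikipedia corpus was assembled. For illustration only, one common way to obtain Turkish Wikipedia text with the Hugging Face datasets library is sketched below; the dataset repository and snapshot date are assumptions, not a statement of what was actually used.

```python
# Illustrative sketch only: the dataset repo and snapshot are assumptions,
# not necessarily the corpus used to train this model.
from datasets import load_dataset

wiki_tr = load_dataset("wikimedia/wikipedia", "20231101.tr", split="train")
print(wiki_tr[0]["title"])
print(wiki_tr[0]["text"][:200])  # first 200 characters of the article body
```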

**Training Procedure**

- **Preprocessing:** The data was preprocessed by tokenizing the text and adding special tokens.

- **Training Hyperparameters**

  - Training regime: fine-tuning with a LoRA configuration (an illustrative setup is sketched after this list)
  - Speeds, Sizes, Times: [More Information Needed]
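
As a rough illustration of the training setup described above, the sketch below attaches a LoRA adapter to the base model with the PEFT library and tokenizes a Turkish sentence with special tokens. The hyperparameter values (r, lora_alpha, dropout, target modules) are placeholders; the card does not report the values actually used.

```python
# Illustrative sketch only: the LoRA hyperparameters below are assumptions,
# not the values actually used to train this model.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

base_id = "mistralai/Mistral-7B-v0.1"
tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(base_id)

# Preprocessing: tokenize a Turkish sentence with special tokens added.
sample = tokenizer(
    "Türkiye, Güneydoğu Avrupa ve Batı Asya'da yer alan bir ülkedir.",
    add_special_tokens=True,
    return_tensors="pt",
)

# Attach a LoRA adapter to the base model (placeholder hyperparameters).
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the adapter weights are trainable
```

Because only the adapter weights are updated, fine-tuning a 7B base model in this way is feasible on relatively modest hardware.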

**Evaluation**

- Testing Data, Factors & Metrics: [More Information Needed]
- **Results:** [More Information Needed]

## Summary

- This model is a fine-tuned version of mistralai/Mistral-7B-v0.1 trained on Turkish Wikipedia text.
- The model can be used for various NLP tasks, including text generation.
- It is important to be aware of the risks, biases, and limitations of the model before using it.

## Environmental Impact

- The environmental impact of training this model can be estimated using the Machine Learning Impact calculator presented in Lacoste et al. (2019).
- Hardware Type: [More Information Needed]
- Hours used: [More Information Needed]
- Cloud Provider: [More Information Needed]
- Compute Region: [More Information Needed]
- Carbon Emitted: [More Information Needed]

## Technical Specifications

- **Model Architecture and Objective:**
  - The model architecture is based on mistralai/Mistral-7B-v0.1.
  - The objective of the fine-tuning process was to improve the model's