---
library_name: peft
base_model: mistralai/Mistral-7B-v0.1
---
# Model Card for cenkersisman/TurkishWikipedia-LLM-7b-base
**Library name:** peft
**Base model:** mistralai/Mistral-7B-v0.1
**Model Description:**
This model was fine-tuned on Turkish Wikipedia text using the PEFT library with a LoRA configuration.
**Developed by:** [More Information Needed]
**Funded by [optional]:** [More Information Needed]
**Shared by [optional]:** [More Information Needed]
**Model type:** Fine-tuned language model
**Language(s) (NLP):** Turkish
**License:** [More Information Needed]
**Finetuned from model:** mistralai/Mistral-7B-v0.1
**Model Sources:**
- **Repository:** [More Information Needed]
- **Paper [optional]:** [More Information Needed]
- **Demo [optional]:** [To be implemented]
## Uses
**Direct Use**
This model can be used for various NLP tasks, including:
- Text generation
- Machine translation
- Question answering
- Text summarization
**Downstream Use**
[More Information Needed]
## Bias, Risks, and Limitations
- **Bias:** The model may inherit biases from its training data, Turkish Wikipedia. These can include cultural bias as well as bias in which topics are covered and how they are presented.
- **Risks:** The model may generate text that is offensive, misleading, or factually incorrect. It is important to be aware of these risks and to use the model responsibly.
- **Limitations:** The model may not perform well on all tasks, and it may not be able to generate text that is creative or original.
## Recommendations
- Users (both direct and downstream) should be aware of the risks, biases, and limitations of the model.
- It is important to evaluate the outputs of the model carefully before using them in any application.
## How to Get Started with the Model
The following code snippet demonstrates how to load the fine-tuned model and generate text:
```python
from transformers import AutoModelForCausalLM, LlamaTokenizer, pipeline

# Load the fine-tuned model and its tokenizer
folder = "cenkersisman/TurkishWikipedia-LLM-7b-base"
device = "cuda"
model = AutoModelForCausalLM.from_pretrained(folder).to(device)
tokenizer = LlamaTokenizer.from_pretrained(folder)

# Create a text-generation pipeline; it runs on the device the model is already on
pipe = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    max_new_tokens=128,
    return_full_text=True,
    repetition_penalty=1.1,
)

# Generate text with low-temperature sampling
def generate_output(user_query):
    outputs = pipe(user_query, do_sample=True, temperature=0.1, top_k=10, top_p=0.9)
    return outputs[0]["generated_text"]

# Example usage (Turkish prompt: "brazil, the world's largest ... by population")
user_query = "brezilya'nın nüfus olarak dünyanın en büyük"
output = generate_output(user_query)
print(output)
```
This code loads the fine-tuned model from "cenkersisman/TurkishWikipedia-LLM-7b-base", builds a text-generation pipeline, and generates a completion for the given query.
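Because the card declares `library_name: peft`, the repository may also be loadable as a PEFT adapter on top of the base model. The following is a minimal sketch, assuming the repository ships LoRA adapter weights with an `adapter_config.json` (not confirmed by the card):

```python
from peft import AutoPeftModelForCausalLM
from transformers import LlamaTokenizer

# Assumption: the repo stores adapter weights. AutoPeftModelForCausalLM reads
# adapter_config.json, fetches the base model (mistralai/Mistral-7B-v0.1),
# and attaches the adapter to it.
model = AutoPeftModelForCausalLM.from_pretrained(
    "cenkersisman/TurkishWikipedia-LLM-7b-base"
).to("cuda")
tokenizer = LlamaTokenizer.from_pretrained("cenkersisman/TurkishWikipedia-LLM-7b-base")
```

If the repository instead contains fully merged weights, the `AutoModelForCausalLM` path shown above is the one to use.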
## Training Details
**Training Data**
- 9 million sentences from Turkish Wikipedia.
**Training Procedure**
- **Preprocessing:** The data was preprocessed by tokenizing the text and adding special tokens.
- **Training Hyperparameters**
- Training regime: fine-tuning with a LoRA configuration (see the sketch after this list)
- Speeds, Sizes, Times: [More Information Needed]
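The exact hyperparameters were not published. For orientation only, here is a minimal PEFT/LoRA fine-tuning sketch over Mistral-7B; every value (rank, alpha, target modules, batch size) is an assumption, not the configuration actually used:

```python
from datasets import Dataset
from peft import LoraConfig, get_peft_model
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

base = "mistralai/Mistral-7B-v0.1"
tokenizer = AutoTokenizer.from_pretrained(base)
tokenizer.pad_token = tokenizer.eos_token  # Mistral has no pad token by default
model = AutoModelForCausalLM.from_pretrained(base)

# Hypothetical LoRA configuration -- the card does not disclose the real one.
lora = LoraConfig(
    r=16,                                 # assumed rank
    lora_alpha=32,                        # assumed scaling
    target_modules=["q_proj", "v_proj"],  # assumed attention projections
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora)

# Preprocessing as described in the card: tokenize and add special tokens.
sentences = ["örnek cümle"]  # placeholder for the ~9M Wikipedia sentences
def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)
dataset = Dataset.from_dict({"text": sentences}).map(tokenize, batched=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="out", per_device_train_batch_size=4),
    train_dataset=dataset,
    # mlm=False makes the collator copy input_ids into labels (causal LM loss)
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
# trainer.train()
```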
**Evaluation**
- Testing Data, Factors & Metrics: [More Information Needed]
- **Results:** [More Information Needed]
## Summary
- This model is a fine-tuned version of mistralai/Mistral-7B-v0.1 trained on Turkish Wikipedia text.
- The model can be used for various NLP tasks, including text generation.
- It is important to be aware of the risks, biases, and limitations of the model before using it.
## Environmental Impact
- The environmental impact of training this model can be estimated using the Machine Learning Impact calculator presented in Lacoste et al. (2019).
- Hardware Type: [More Information Needed]
- Hours used: [More Information Needed]
- Cloud Provider: [More Information Needed]
- Compute Region: [More Information Needed]
- Carbon Emitted: [More Information Needed]
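As a worked illustration of that calculator's arithmetic (none of these values were reported for this model; all are hypothetical placeholders):

```python
# Carbon estimate per Lacoste et al. (2019): energy used x grid carbon intensity.
gpu_power_kw = 0.3      # e.g. one A100 at ~300 W (assumed)
hours = 100.0           # training wall-clock hours (unknown)
pue = 1.1               # data-center power usage effectiveness (assumed)
carbon_intensity = 0.4  # kg CO2eq per kWh for the region (assumed)

kg_co2eq = gpu_power_kw * hours * pue * carbon_intensity
print(f"Estimated emissions: {kg_co2eq:.1f} kg CO2eq")
```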
## Technical Specifications
- **Model Architecture and Objective:**
- The model architecture is based on mistralai/Mistral-7B-v0.1.
- The objective of the fine-tuning process was to improve the model's Turkish language generation capabilities.