---
base_model: mistralai/Mistral-7B-v0.1
---

# Model Card for TurkishWikipedia-LLM-7b-base

**Library name:** peft

**Base model:** mistralai/Mistral-7B-v0.1

**Model Description:**

This model was fine-tuned on Turkish Wikipedia text using the PEFT library with a LoRA configuration.

Training has completed 40% of the first epoch, with a loss value of 1.30.

**Developed by:** [More Information Needed]

**Funded by [optional]:** [More Information Needed]

**Shared by [optional]:** [More Information Needed]

**Model type:** Fine-tuned language model

**Language(s) (NLP):** Turkish

**License:** [More Information Needed]

**Finetuned from model:** mistralai/Mistral-7B-v0.1

**Model Sources:**

- **Repository:** [More Information Needed]
- **Paper [optional]:** [More Information Needed]
- **Demo [optional]:** [To be implemented]

## Uses

**Direct Use**

This model can be used for various NLP tasks, including:

- Text generation
- Machine translation
- Question answering
- Text summarization

**Downstream Use**

[More Information Needed]

## Bias, Risks, and Limitations

- **Bias:** The model may inherit biases from its training data, Turkish Wikipedia, including cultural biases and biases in how information is presented on Wikipedia.
- **Risks:** The model may generate text that is offensive, misleading, or factually incorrect. Be aware of these risks and use the model responsibly.
- **Limitations:** The model may not perform well on all tasks, and its output may lack creativity or originality.

## Recommendations

- Users (both direct and downstream) should be aware of the risks, biases, and limitations of the model.
- Evaluate the model's outputs carefully before using them in any application.

## How to Get Started with the Model

The following code snippet demonstrates how to load the fine-tuned model and generate text:

```python
from transformers import AutoModelForCausalLM, LlamaTokenizer, pipeline

# Load the model and tokenizer
folder = "cenkersisman/TurkishWikipedia-LLM-7b-base"
device = "cuda"
model = AutoModelForCausalLM.from_pretrained(folder).to(device)
tokenizer = LlamaTokenizer.from_pretrained(folder)

# Create a pipeline for text generation
# (the model is already on the GPU, so no separate device_map is needed)
pipe = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    max_new_tokens=128,
    return_full_text=True,
    repetition_penalty=1.1,
)

# Generate text
def generate_output(user_query):
    outputs = pipe(user_query, do_sample=True, temperature=0.1, top_k=10, top_p=0.9)
    return outputs[0]["generated_text"]

# Example usage (Turkish prompt: "brazil is the world's largest ... in terms of population")
user_query = "brezilya'nın nüfus olarak dünyanın en büyük"
output = generate_output(user_query)
print(output)
```

This code loads the fine-tuned model from cenkersisman/TurkishWikipedia-LLM-7b-base, creates a text-generation pipeline, and generates text from the provided user query.
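
If GPU memory is limited, the model can also be loaded in 4-bit precision. The snippet below is an optional sketch using the bitsandbytes integration in transformers; it is not part of the card's original instructions and assumes the bitsandbytes package is installed.

```python
# Optional sketch: load the model in 4-bit precision to reduce GPU memory use.
import torch
from transformers import AutoModelForCausalLM, LlamaTokenizer, BitsAndBytesConfig

folder = "cenkersisman/TurkishWikipedia-LLM-7b-base"
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
)
model = AutoModelForCausalLM.from_pretrained(
    folder,
    quantization_config=bnb_config,
    device_map="auto",
)
tokenizer = LlamaTokenizer.from_pretrained(folder)
```

The quantized model can then be passed to the same text-generation pipeline shown above.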

## Training Details

**Training Data**

- 9 million sentences from Turkish Wikipedia.
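
The card does not state how the Turkish Wikipedia corpus was assembled. For illustration only, one common way to obtain Turkish Wikipedia text with the Hugging Face datasets library is sketched below; the dataset repository and snapshot date are assumptions, not a statement of what was actually used.

```python
# Illustrative sketch only: the dataset repo and snapshot are assumptions,
# not necessarily the corpus used to train this model.
from datasets import load_dataset

wiki_tr = load_dataset("wikimedia/wikipedia", "20231101.tr", split="train")
print(wiki_tr[0]["title"])
print(wiki_tr[0]["text"][:200])  # first 200 characters of the article body
```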

**Training Procedure**

- **Preprocessing:** The data was preprocessed by tokenizing the text and adding special tokens.

- **Training Hyperparameters**

  - Training regime: fine-tuning with a LoRA configuration (an illustrative setup is sketched after this list)
  - Speeds, Sizes, Times: [More Information Needed]
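
As a rough illustration of the training setup described above, the sketch below attaches a LoRA adapter to the base model with the PEFT library and tokenizes a Turkish sentence with special tokens. The hyperparameter values (r, lora_alpha, dropout, target modules) are placeholders; the card does not report the values actually used.

```python
# Illustrative sketch only: the LoRA hyperparameters below are assumptions,
# not the values actually used to train this model.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

base_id = "mistralai/Mistral-7B-v0.1"
tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(base_id)

# Preprocessing: tokenize a Turkish sentence with special tokens added.
sample = tokenizer(
    "Türkiye, Güneydoğu Avrupa ve Batı Asya'da yer alan bir ülkedir.",
    add_special_tokens=True,
    return_tensors="pt",
)

# Attach a LoRA adapter to the base model (placeholder hyperparameters).
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the adapter weights are trainable
```

Because only the adapter weights are updated, fine-tuning a 7B base model in this way is feasible on relatively modest hardware.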

**Evaluation**

- Testing Data, Factors & Metrics: [More Information Needed]
- **Results:** [More Information Needed]

## Summary

- This model is a fine-tuned version of mistralai/Mistral-7B-v0.1 trained on Turkish Wikipedia text.
- The model can be used for various NLP tasks, including text generation.
- It is important to be aware of the risks, biases, and limitations of the model before using it.

## Environmental Impact

- The environmental impact of training this model can be estimated using the Machine Learning Impact calculator presented in Lacoste et al. (2019).
- Hardware Type: [More Information Needed]
- Hours used: [More Information Needed]
- Cloud Provider: [More Information Needed]
- Compute Region: [More Information Needed]
- Carbon Emitted: [More Information Needed]

## Technical Specifications

- **Model Architecture and Objective:**
  - The model architecture is based on mistralai/Mistral-7B-v0.1.
  - The objective of the fine-tuning process was to improve the model's