Evgeniy Hristoforu

ehristoforu

AI & ML interests

Diffusers, LLMs, and other ML topics.

Organizations

video-p2p-library; Open-Source AI Meetup; lora concepts library; Tune a video concepts library; Stable Diffusion Dreambooth Concepts Library; Blog-explorers; OpenSky; DreamDrop AI; ZeroGPU Explorers; Project Fluently; test-org; Project Dreamly; ReLP; TensorMistral; TensorLlama; MLX Community; Theme repo for Gradio; GuardAI (GAI); Public League AI; Social Post Explorers; MiniSearch; Project Fluently LM; Dev Mode Explorers; Different LLM Sizes; Different LLM; ChatGPT Community (Unoffical, Non-profit); Stable Diffusion Community (Unofficial); Stable Diffusion Community (Unofficial, Non-profit); Synergetic AI; Diffusers Testing; LoRA Factory; Hugging Face Discord Community; Enhanced By Ehristoforu (EBE); LM Layers (Research); Fluently Lab.; Puregen AI; Experiments with LLMs; Gramota.AI - Russian language; Need For LLM (NFL Project); Fluently Datasets

ehristoforu's activity

reacted to their post with 🤗 2 days ago
✒️ Ultraset - all-in-one dataset for SFT training in Alpaca format.
fluently-sets/ultraset

❓ Ultraset is a comprehensive dataset for training Large Language Models (LLMs) with SFT (supervised fine-tuning) on instruction data. It contains over 785 thousand entries in eight languages: English, Russian, French, Italian, Spanish, German, Chinese, and Korean.

🤯 Ultraset addresses the problem of choosing an appropriate dataset for LLM training. It combines the kinds of data needed to improve a model's skills in text writing and editing, mathematics, coding, biology, medicine, finance, and multilingualism.

🤗 For best results, use only the "instruction", "input", and "output" columns and train the model for 1-3 epochs. The dataset does not include DPO or Instruct data, which makes it suitable for training various types of LLMs.
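
A minimal sketch of that column advice in Python, assuming the Hugging Face `datasets` library and a `train` split (the post does not name a split). The prompt wording follows the widely used Stanford Alpaca template, which is an assumption here since the post only says "Alpaca format". The helper names `keep_sft_columns`, `format_alpaca`, and `load_ultraset` are hypothetical.

```python
from typing import Dict

# The three columns the post recommends using.
COLUMNS = ("instruction", "input", "output")

# Assumption: the standard Stanford Alpaca prompt wording.
PROMPT_WITH_INPUT = (
    "Below is an instruction that describes a task, paired with an input "
    "that provides further context. Write a response that appropriately "
    "completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Input:\n{input}\n\n"
    "### Response:\n"
)
PROMPT_NO_INPUT = (
    "Below is an instruction that describes a task. Write a response that "
    "appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Response:\n"
)


def keep_sft_columns(example: Dict[str, object]) -> Dict[str, str]:
    """Drop everything except the three recommended columns."""
    return {name: str(example.get(name) or "") for name in COLUMNS}


def format_alpaca(example: Dict[str, object]) -> str:
    """Render one record as a single training string (prompt + answer)."""
    row = keep_sft_columns(example)
    # Records with an empty "input" use the shorter template.
    template = PROMPT_WITH_INPUT if row["input"].strip() else PROMPT_NO_INPUT
    return template.format(**row) + row["output"]


def load_ultraset(split: str = "train"):
    """Download the dataset and add a formatted "text" column.

    Assumption: a split named "train" exists; check the dataset card.
    """
    from datasets import load_dataset  # third-party: pip install datasets

    dataset = load_dataset("fluently-sets/ultraset", split=split)
    dataset = dataset.select_columns(list(COLUMNS))
    return dataset.map(lambda row: {"text": format_alpaca(row)})
```

Calling `load_ultraset()` would return a dataset with a `text` column that can be passed to an SFT trainer of your choice. The 1-3 epoch setting is then a parameter of that trainer, not of this sketch.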

❇️ Ultraset is an excellent tool to improve your language model's skills in diverse knowledge areas.