its5Q PRO

AI & ML interests

None yet

Organizations

Vikhr models, Social Post Explorers, AI Starter Pack

its5Q's activity

posted an update 11 days ago
Am I missing something, or is there still no way to filter by model size when searching for models? This feature has been requested since 2022, but I haven't seen any updates since. With so many different models coming out, I think a size filter would be a great extension of the search functionality, especially when looking for smaller models, which are much less prevalent.
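Until such a filter exists, the filtering has to happen on the client side. The snippet below is only a minimal sketch of that idea: `ModelRecord`, its field names, and the sample entries are hypothetical and do not reflect any actual Hub schema or API. It assumes you already have a parameter count for each model from some source.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass
class ModelRecord:
    # Hypothetical record; the field names are illustrative, not a Hub schema.
    model_id: str
    num_parameters: Optional[int]  # None when the size is unknown


def filter_by_size(models: Iterable[ModelRecord], max_parameters: int) -> List[ModelRecord]:
    """Keep models whose known parameter count is at most max_parameters."""
    return [
        m for m in models
        if m.num_parameters is not None and m.num_parameters <= max_parameters
    ]


# Made-up sample data, for illustration only.
SAMPLE = [
    ModelRecord("example/tiny", 125_000_000),
    ModelRecord("example/medium", 7_000_000_000),
    ModelRecord("example/unknown", None),
]

small = filter_by_size(SAMPLE, max_parameters=1_000_000_000)
print([m.model_id for m in small])
```

Models with an unknown size are dropped rather than guessed at, which is one possible design choice; a real search filter might instead show them separately.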
posted an update 5 months ago
Continuing my streak by releasing the Wikireading dataset: a large collection of scraped non-fiction books, predominantly in Russian.
its5Q/wikireading

Here are the highlights:
- ~7B tokens, or ~28B characters, making it a great candidate for use in pretraining
- Contains non-fiction works from many knowledge domains
- Includes both the original HTML and extracted text of book chapters
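As a quick sanity check on the figures above, the two approximate totals imply roughly 4 characters per token. The sketch below just does that arithmetic; the constant and function names are made up for illustration, and the ratio depends on the tokenizer, which the post does not name.

```python
# Approximate totals quoted in the post (~7B tokens, ~28B characters).
TOTAL_TOKENS = 7_000_000_000
TOTAL_CHARACTERS = 28_000_000_000

# Average characters per token implied by those two figures.
CHARS_PER_TOKEN = TOTAL_CHARACTERS / TOTAL_TOKENS


def estimate_tokens(num_characters: int, chars_per_token: float = CHARS_PER_TOKEN) -> int:
    """Rough token estimate for a text of the given length in characters."""
    return round(num_characters / chars_per_token)


print(CHARS_PER_TOKEN)
print(estimate_tokens(1_000_000))
```

This is only a back-of-the-envelope estimate; actual token counts for a given chapter will vary with the text and the tokenizer used.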
New activity in its5Q/wikireading 5 months ago

Update README.md

#1 opened 5 months ago by
its5Q
reacted to clem's post with 🔥 5 months ago
Just crossed 200,000 free public AI datasets shared by the community on Hugging Face! Text, image, video, audio, time-series & many more... Thanks everyone!

http://hf.co/datasets
posted an update 5 months ago
Made a dataset of scraped Teletype articles public.

Here's the overview:
- 3.3 million articles, predominantly in Russian and English
- Includes the original HTML, extracted text, and metadata
- All articles were run through language identification
- Includes all public articles up until April 2024

its5Q/teletype
New activity in Vikhrmodels/Vikhr-7B-instruct_0.3 9 months ago

max_position_embeddings

#2 opened 9 months ago by
radm