
Nerdy Face
Enterprise
company
AI & ML interests
None defined yet.
Recent Activity
nerdyface's activity

thomwolf authored a paper 17 days ago
Post
2178
🙋🏻‍♂️ Hey there folks,
Goedel's Theorem Prover is now being demoed on Hugging Face: Tonic/Math
Give it a try!
Post
2273
Welcome to the Doge Face Open Source Community! 🚀
Over the next two years, our goal is to explore an indispensable foundation of embodied intelligence: small language models. 🔬
We aim to open-source code and documentation to give everyone more time to slack off while working or studying! 🤗
👉 Repository name on GitHub: https://github.com/SmallDoges/small-doge
👉 Organization name on Hugging Face: https://huggingface.co/SmallDoge
Post
2914
🙋🏻♂️ Hey there folks ,
Our team made a game during the @mistral-game-jam and we're trying to win the community award!
Try our game out and drop us a ❤️ like, which is basically a vote for us!
Mistral-AI-Game-Jam/TextToSurvive
Hope you like it!

suayptalha posted an update about 1 month ago
Post
1819
My last Falcon3-7B merge model, suayptalha/Falcon3-Jessi-v0.4-7B-Slerp, is currently ranked #1 on the open-llm-leaderboard/open_llm_leaderboard among all models with up to 14B parameters.
My Qwen2.5-7B merge model, suayptalha/HomerCreativeAnvita-Mix-Qw7B, is also ranked #7, placing two of my models in the top 10!
Post
1707
🤩 warmup -> stable -> decay learning rate scheduler:
😎 Use the stable-phase checkpoints to continue training the model on any new dataset without spikes in training!!!
SmallDoge/Doge-20M-checkpoint
SmallDoge/Doge-60M-checkpoint
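For anyone who has not met a warmup -> stable -> decay schedule before, here is a minimal sketch of the idea. It assumes linear warmup and linear decay; the post does not say which shapes the Doge checkpoints were trained with, and the function name `wsd_lr` and all of its parameters are hypothetical.

```python
def wsd_lr(step, peak_lr, warmup_steps, stable_steps, decay_steps, min_lr=0.0):
    """Learning rate at `step` for a warmup -> stable -> decay schedule.

    Assumption: linear warmup and linear decay. Other decay shapes
    (cosine, exponential, ...) fit the same three-phase layout.
    """
    if step < warmup_steps:
        # Warmup: ramp linearly up to the peak learning rate.
        return peak_lr * (step + 1) / warmup_steps
    if step < warmup_steps + stable_steps:
        # Stable: hold the peak learning rate constant.
        return peak_lr
    # Decay: move linearly from peak_lr down to min_lr, then stay there.
    decay_step = step - warmup_steps - stable_steps
    if decay_step >= decay_steps:
        return min_lr
    return peak_lr + (min_lr - peak_lr) * (decay_step / decay_steps)


if __name__ == "__main__":
    for s in (0, 9, 15, 30, 35, 40):
        print(s, wsd_lr(s, peak_lr=1.0, warmup_steps=10, stable_steps=20, decay_steps=10))
```

A checkpoint saved in the stable phase was trained at the constant peak rate, so continued training can plausibly resume at that same rate without a fresh warmup or a sudden jump in learning rate. That is likely why the stable-phase checkpoints are the ones recommended for new datasets.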
Post
2076
Pre-training on only a single RTX 4090 is really slow, even for small language models!!! (https://huggingface.co/collections/JingzeShi/doge-slm-677fd879f8c4fd0f43e05458)

florentgbelidji posted an update about 1 month ago
Post
1472
𝗣𝗹𝗮𝗻𝗻𝗶𝗻𝗴 𝗬𝗼𝘂𝗿 𝗡𝗲𝘅𝘁 𝗦𝗸𝗶 𝗔𝗱𝘃𝗲𝗻𝘁𝘂𝗿𝗲 𝗝𝘂𝘀𝘁 𝗚𝗼𝘁 𝗦𝗺𝗮𝗿𝘁𝗲𝗿: 𝗜𝗻𝘁𝗿𝗼𝗱𝘂𝗰𝗶𝗻𝗴 𝗔𝗹𝗽𝗶𝗻𝗲 𝗔𝗴𝗲𝗻𝘁!🏔️⛷️
With the big hype around AI agents these days, I couldn’t stop thinking about how AI agents could truly enhance real-world activities.
What sort of applications could we build with those AI agents: agentic RAG? self-correcting text-to-sql? Nah, boring…
Passionate about outdoors, I’ve always dreamed of a tool that could simplify planning mountain trips while accounting for all potential risks. That’s why I built 𝗔𝗹𝗽𝗶𝗻𝗲 𝗔𝗴𝗲𝗻𝘁, a smart assistant designed to help you plan safe and enjoyable itineraries in the French Alps and Pyrenees.
Built using Hugging Face's 𝘀𝗺𝗼𝗹𝗮𝗴𝗲𝗻𝘁𝘀 library, Alpine Agent combines the power of AI with trusted resources like 𝘚𝘬𝘪𝘵𝘰𝘶𝘳.𝘧𝘳 (https://skitour.fr/) and METEO FRANCE. Whether it’s suggesting a route with moderate difficulty or analyzing avalanche risks and weather conditions, this agent dynamically integrates data to deliver personalized recommendations.
In my latest blog post, I share how I developed this project—from defining tools and integrating APIs to selecting the best LLMs like 𝘘𝘸𝘦𝘯2.5-𝘊𝘰𝘥𝘦𝘳-32𝘉-𝘐𝘯𝘴𝘵𝘳𝘶𝘤𝘵, 𝘓𝘭𝘢𝘮𝘢-3.3-70𝘉-𝘐𝘯𝘴𝘵𝘳𝘶𝘤𝘵, or 𝘎𝘗𝘛-4.
⛷️ Curious how AI can enhance adventure planning? Try the app and share your thoughts: florentgbelidji/alpine-agent
👉 Want to build your own agents? Whether for cooking, sports training, or other passions, the possibilities are endless. Check out the blog post to learn more: https://huggingface.co/blog/florentgbelidji/alpine-agent
Many thanks to @m-ric for helping build this tool with smolagents!
Post
1866
🙋🏻♂️ Hey there folks ,
Facebook AI just released JASCO models that make music stems.
You can try it out here: Tonic/audiocraft
Hope you like it!

thomwolf authored a paper about 1 month ago
Post
2444
🙋🏻‍♂️ Hey there folks, Open LLM Europe just released the Lucie 7B-Instruct model, a bilingual instruct model trained on open data! You can check out my unofficial demo here while we wait for the official inference API from the group:
Tonic/Lucie-7B hope you like it 🚀

jeffboudier posted an update about 2 months ago
Post
643
NVIDIA just announced the Cosmos World Foundation Models, available on the Hub:
nvidia/cosmos-6751e884dc10e013a0a0d8e6
Cosmos is a family of pre-trained models purpose-built for generating physics-aware videos and world states to advance physical AI development.
The release includes Tokenizers nvidia/cosmos-tokenizer-672b93023add81b66a8ff8e6
Learn more in this great community article by @mingyuliutw and @PranjaliJoshi https://huggingface.co/blog/mingyuliutw/nvidia-cosmos

suayptalha posted an update about 2 months ago
Post
2137
🚀 Introducing 𝐅𝐢𝐫𝐬𝐭 𝐇𝐮𝐠𝐠𝐢𝐧𝐠 𝐅𝐚𝐜𝐞 𝐈𝐧𝐭𝐞𝐠𝐫𝐚𝐭𝐢𝐨𝐧 𝐨𝐟 𝐦𝐢𝐧𝐆𝐑𝐔 𝐌𝐨𝐝𝐞𝐥𝐬 from the paper 𝐖𝐞𝐫𝐞 𝐑𝐍𝐍𝐬 𝐀𝐥𝐥 𝐖𝐞 𝐍𝐞𝐞𝐝𝐞𝐝?
🖥 I have integrated 𝐧𝐞𝐱𝐭-𝐠𝐞𝐧𝐞𝐫𝐚𝐭𝐢𝐨𝐧 𝐑𝐍𝐍𝐬, specifically minGRU, which offer faster performance compared to Transformer architectures, into HuggingFace. This allows users to leverage the lighter and more efficient minGRU models with the "𝐭𝐫𝐚𝐧𝐬𝐟𝐨𝐫𝐦𝐞𝐫𝐬" 𝐥𝐢𝐛𝐫𝐚𝐫𝐲 for both usage and training.
💻 I integrated two main tasks: 𝐌𝐢𝐧𝐆𝐑𝐔𝐅𝐨𝐫𝐒𝐞𝐪𝐮𝐞𝐧𝐜𝐞𝐂𝐥𝐚𝐬𝐬𝐢𝐟𝐢𝐜𝐚𝐭𝐢𝐨𝐧 and 𝐌𝐢𝐧𝐆𝐑𝐔𝐅𝐨𝐫𝐂𝐚𝐮𝐬𝐚𝐥𝐋𝐌.
𝐌𝐢𝐧𝐆𝐑𝐔𝐅𝐨𝐫𝐒𝐞𝐪𝐮𝐞𝐧𝐜𝐞𝐂𝐥𝐚𝐬𝐬𝐢𝐟𝐢𝐜𝐚𝐭𝐢𝐨𝐧:
You can use this class for 𝐒𝐞𝐪𝐮𝐞𝐧𝐜𝐞 𝐂𝐥𝐚𝐬𝐬𝐢𝐟𝐢𝐜𝐚𝐭𝐢𝐨𝐧 tasks. I also trained a Sentiment Analysis model on the stanfordnlp/imdb dataset.
𝐌𝐢𝐧𝐆𝐑𝐔𝐅𝐨𝐫𝐂𝐚𝐮𝐬𝐚𝐥𝐋𝐌:
You can use this class for 𝐂𝐚𝐮𝐬𝐚𝐥 𝐋𝐚𝐧𝐠𝐮𝐚𝐠𝐞 𝐌𝐨𝐝𝐞𝐥 tasks, in the style of GPT or Llama. I also trained an example model on the roneneldan/TinyStories dataset. You can fine-tune and use it!
🔗 𝐋𝐢𝐧𝐤𝐬:
Models: suayptalha/mingru-676fe8d90760d01b7955d7ab
GitHub: https://github.com/suayptalha/minGRU-hf
LinkedIn Post: https://www.linkedin.com/posts/suayp-talha-kocabay_mingru-a-suayptalha-collection-activity-7278755484172439552-wNY1
📰 𝐂𝐫𝐞𝐝𝐢𝐭𝐬:
Paper Link: https://arxiv.org/abs/2410.01201
I am thankful to Leo Feng, Frederick Tung, Mohamed Osama Ahmed, Yoshua Bengio and Hossein Hajimirsadeghi for their papers.
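For readers curious what makes minGRU "minimal", here is a rough sketch of the recurrence as described in the linked paper: the gate and the candidate state depend only on the current input, not on the previous hidden state. This is not the minGRU-hf code; the helper names are hypothetical, and it runs step by step in plain Python, whereas the paper's point is that this form can also be trained in parallel.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def linear(W, b, x):
    # Dense layer: one output per row of W.
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]


def min_gru(xs, Wz, bz, Wh, bh, h0):
    """Run the minGRU recurrence over a sequence and return every hidden state.

    z_t       = sigmoid(Wz x_t + bz)              (update gate, input only)
    h_tilde_t = Wh x_t + bh                       (candidate state, input only)
    h_t       = (1 - z_t) * h_{t-1} + z_t * h_tilde_t
    """
    h = list(h0)
    states = []
    for x in xs:
        z = [sigmoid(v) for v in linear(Wz, bz, x)]
        h_tilde = linear(Wh, bh, x)
        h = [(1.0 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, h_tilde)]
        states.append(h)
    return states


if __name__ == "__main__":
    # Toy weights: the gate is fixed at 0.5, the candidate is the input itself.
    print(min_gru([[2.0], [4.0]], Wz=[[0.0]], bz=[0.0], Wh=[[1.0]], bh=[0.0], h0=[0.0]))
```

With the toy weights above, each step simply averages the previous state with the new input, which makes the gating easy to follow by hand.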

suayptalha posted an update 2 months ago
Post
2495
🚀 Introducing Substitution Cipher Solvers!
We, @suayptalha and @Synd209, are thrilled to share an update!
🔑 This project contains a text-to-text model designed to decrypt English and Turkish text encoded using a substitution cipher. In a substitution cipher, each letter in the plaintext is replaced by a corresponding, unique letter to form the ciphertext. The model leverages statistical and linguistic properties of English to make educated guesses about the letter substitutions, aiming to recover the original plaintext message.
These models were fine-tuned from T5-base. The models are for monoalphabetic English and Turkish substitution ciphers, and they output decoded text and the alphabet with an accuracy that has never been achieved before!
Example:
Encoded text: Z hztwgx tstcsf qf z ulooqfe osfuqb tzx uezx awej z ozewsbe vlfwby fsmqisfx.
Decoded text: A family member or a support person may stay with a patient during recovery.
Model Collection Link: Cipher-AI/substitution-cipher-solvers-6731ebd22f0f0d8e0e2e2e00
Organization Link: https://huggingface.co/Cipher-AI
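The example pair above can be checked by hand. The sketch below is not the T5 model. It only derives the letter mapping implied by the encoded/decoded pair in the post and confirms that it is a consistent one-to-one (monoalphabetic) substitution. The function names `derive_key` and `decode` are hypothetical.

```python
ENCODED = "Z hztwgx tstcsf qf z ulooqfe osfuqb tzx uezx awej z ozewsbe vlfwby fsmqisfx."
DECODED = "A family member or a support person may stay with a patient during recovery."


def derive_key(ciphertext, plaintext):
    """Build the cipher-letter -> plain-letter map from an aligned pair.

    Raises ValueError if the pair is not a consistent one-to-one substitution.
    """
    if len(ciphertext) != len(plaintext):
        raise ValueError("texts must have the same length")
    key, reverse = {}, {}
    for c, p in zip(ciphertext.lower(), plaintext.lower()):
        if not c.isalpha():
            continue  # spaces and punctuation are not substituted
        if key.setdefault(c, p) != p:
            raise ValueError(f"cipher letter {c!r} maps to two plain letters")
        if reverse.setdefault(p, c) != c:
            raise ValueError(f"plain letter {p!r} comes from two cipher letters")
    return key


def decode(ciphertext, key):
    """Apply the key, keeping case and leaving unknown characters unchanged."""
    out = []
    for ch in ciphertext:
        plain = key.get(ch.lower(), ch)
        out.append(plain.upper() if ch.isupper() else plain)
    return "".join(out)


if __name__ == "__main__":
    key = derive_key(ENCODED, DECODED)
    print(decode(ENCODED, key))
```

Going in this direction is the easy part, because the plaintext is already known. The models in the post tackle the hard direction: recovering the mapping from the ciphertext alone.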

suayptalha posted an update 2 months ago
Post
1635
🚀 FastLlama Series is Live!
🦾 Experience faster, lighter, and smarter language models! The new FastLlama makes Meta's LLaMA models work with smaller file sizes, lower system requirements, and higher performance. The model supports 8 languages, including English, German, and Spanish.
🤖 Built on the LLaMA 3.2-1B-Instruct model, fine-tuned with Hugging Face's SmolTalk and MetaMathQA-50k datasets, and powered by LoRA (Low-Rank Adaptation) for groundbreaking mathematical reasoning.
💻 Its compact size makes it versatile for a wide range of applications!
💬 Chat with the model:
🔗 Chat Link: suayptalha/Chat-with-FastLlama
🔗 Model Link: suayptalha/FastLlama-3.2-1B-Instruct
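Since the post mentions LoRA (Low-Rank Adaptation), here is a small sketch of the general idea, not FastLlama's actual training code. The frozen weight matrix W is left untouched, and a trainable low-rank product B·A, scaled by alpha / r, is added on top. All names and the tiny matrices are hypothetical and chosen only for illustration.

```python
def matvec(m, v):
    # Multiply a matrix (list of rows) by a vector.
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def lora_forward(W, A, B, x, alpha):
    """Compute y = W x + (alpha / r) * B (A x).

    W: frozen base weights, shape (d_out, d_in)
    A: trainable down-projection, shape (r, d_in)
    B: trainable up-projection, shape (d_out, r)
    """
    r = len(A)  # rank of the adapter
    base = matvec(W, x)
    update = matvec(B, matvec(A, x))
    scale = alpha / r
    return [b + scale * u for b, u in zip(base, update)]


if __name__ == "__main__":
    W = [[1.0, 0.0], [0.0, 1.0]]
    A = [[1.0, 1.0]]  # rank r = 1
    x = [3.0, 4.0]
    # With B set to zero, the adapter changes nothing.
    print(lora_forward(W, A, [[0.0], [0.0]], x, alpha=2.0))
    # A non-zero B adds a rank-1 correction.
    print(lora_forward(W, A, [[1.0], [2.0]], x, alpha=2.0))
```

Only A and B are trained, so the number of trainable parameters scales with the rank r rather than with the full size of W. This is what keeps fine-tuning cheap for a small model like the one described.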
Post
5709
We are proud to announce HuggingFaceFW/fineweb-2: A sparkling update to HuggingFaceFW/fineweb with 1000s of 🗣️ languages.
We applied the same data-driven approach that led to SOTA English performance in 🍷 FineWeb to thousands of languages.
🥂 FineWeb2 has 8TB of compressed text data and outperforms other multilingual datasets in our experiments.
The dataset is released under the permissive 📜 ODC-By 1.0 license, and the 💻 code to reproduce it and our evaluations is public.
We will very soon announce a big community project, and are working on a 📝 blogpost walking you through the entire dataset creation process. Stay tuned!
In the meantime, come ask us questions in our chat space: HuggingFaceFW/discussion
H/t @guipenedo @hynky @lvwerra as well as @vsabolcec Bettina Messmer @negar-foroutan and @mjaggi
Post
1573
The number of open-source AI models has grown exponentially over the past 30 months, from a few thousand to over 1 million.
Interactive data viz: huggingface/open-source-ai-year-in-review-2024
Post
1763