Stephen Genusa PRO
StephenGenusa

AI & ML interests

LFM, LLM, Quantization, Vision, RAG/Hybrid/Graph, Multimodality, NLP (which will take us further down the road with existing LLM tech)

Recent Activity

reacted to openfree's post with ➕ about 8 hours ago
🚀 Llama-4 Model-Based Agentic AI System Released! …
reacted to openfree's post with 👍 about 8 hours ago
🚀 Llama-4 Model-Based Agentic AI System Released! …
reacted to openfree's post with 🤯 about 8 hours ago
🚀 Llama-4 Model-Based Agentic AI System Released! …

Organizations

Social Post Explorers

StephenGenusa's activity

reacted to openfree's post with ➕👍🤯🤗😎👀🚀🔥❤️ about 8 hours ago
🚀 Llama-4 Model-Based Agentic AI System Released!

🔥 Introducing the Latest Llama-4 Models
Hello AI enthusiasts! Today we're excited to introduce our free API service powered by the cutting-edge Llama-4-Maverick-17B and Llama-4-Scout-17B models! These state-of-the-art models will upgrade your AI experience with remarkable stability and speed.

Link1: openfree/Llama-4-Maverick-17B-Research
Link2: openfree/Llama-4-Scout-17B-Research

🧠 The Innovation of Agentic AI: Deep Research Feature
The standout feature of our service is the revolutionary "Deep Research" functionality! This innovative Agentic AI system includes the following steps (sketched in code after the list):

πŸ” Optimized Keyword Extraction: LLM automatically generates the most effective keywords for searches
🌐 Real-time Web Search: Collects the latest information through the SerpHouse API
πŸ“Š Intelligent Information Analysis: Precise analysis utilizing the LLM's reasoning capabilities based on collected information
πŸ“ Contextualized Response Generation: Provides accurate answers incorporating the latest information from search results

⚡ Key Advantages

💯 Free API Service: Stable and fast LLM service through Fireworks AI
🧩 Easy Integration: Accessible through a simple Gradio interface
🔄 Streaming Responses: Minimized waiting time with real-time generated responses
🌍 Multilingual Support: Automatic detection and processing of various languages including Korean
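Since the Spaces are standard Gradio apps, they can likely be driven from Python with `gradio_client`. A minimal sketch; the `api_name` and argument are assumptions, so check the Space's "Use via API" panel or `client.view_api()` for the real signature:

```python
# Rough sketch of calling the Space programmatically; the endpoint name is assumed.
from gradio_client import Client

client = Client("openfree/Llama-4-Maverick-17B-Research")

# submit() returns a job whose intermediate yields carry the streaming response
job = client.submit("Summarize what is new in Llama 4.", api_name="/chat")  # assumed api_name
for partial in job:
    print(partial)            # response text generated so far
final_answer = job.result()  # completed response
```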

πŸ› οΈ Technical Features
The Llama-4-Maverick-17B model supports a context window of up to 20,480 tokens and automatically integrates web search results to always respond with the most current information. The model analyzes collected information through complex reasoning processes and constructs the most appropriate response to user queries.
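A small sketch of how injected search results might be kept inside that 20,480-token window; the budgeting rule and the tokenizer mentioned in the comment are assumptions, not the Space's actual logic:

```python
# Sketch: trim search snippets so prompt + context stays inside the 20,480-token window.
MAX_CONTEXT = 20_480
RESERVED_FOR_ANSWER = 2_048   # arbitrary headroom for the generated reply

def fit_to_window(prompt: str, snippets: list[str], tokenizer) -> str:
    # tokenizer: e.g. AutoTokenizer.from_pretrained("meta-llama/Llama-4-Scout-17B-16E-Instruct")  # assumed id
    budget = MAX_CONTEXT - RESERVED_FOR_ANSWER - len(tokenizer.encode(prompt))
    kept = []
    for snippet in snippets:
        cost = len(tokenizer.encode(snippet))
        if cost > budget:
            break              # drop remaining snippets once the window is full
        kept.append(snippet)
        budget -= cost
    return prompt + "\n\n" + "\n".join(kept)
```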

🤝 Community Participation
For more information and discussions, please join our Discord community (https://discord.gg/openfreeai)! Let's shape the future of AI together!

Start now!
  • 5 replies
reacted to bartowski's post with 👍 6 days ago
Switching to author_model-name

I posted a poll on twitter, and others have mentioned the interest in me using the convention of including the author name in the model path when I upload.

It has a couple of advantages. First and foremost, of course, is ensuring clarity about who uploaded the original model (did Qwen upload Qwen2.6, or did someone fine-tune Qwen2.5 and name it 2.6 for fun?)

The second is that it avoids collisions: if multiple people upload a model with the same name and I try to quant them both, the uploads would normally collide and I'd be unable to upload both

I'll be implementing the change next week, there are just two final details I'm unsure about:

First, should the files also inherit the author's name?

Second, what to do in the case that the author name + model name pushes us past the character limit?

I haven't yet decided how to handle either case, so feedback is welcome, but I'm also just providing this as a "heads up"
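For illustration only (not a decided behaviour), here is one way the author_model-name convention and a length fallback could look; the 96-character cap is an assumption about the Hub's repo-name limit:

```python
# Illustrative sketch of the author_model-name convention with a length fallback.
MAX_REPO_NAME = 96  # assumed Hub limit on repo names

def quant_repo_name(author: str, model: str, suffix: str = "-GGUF") -> str:
    name = f"{author}_{model}{suffix}"
    if len(name) <= MAX_REPO_NAME:
        return name
    # One possible fallback: trim the author part so the model name stays intact
    overflow = len(name) - MAX_REPO_NAME
    trimmed_author = author[: max(1, len(author) - overflow)]
    return f"{trimmed_author}_{model}{suffix}"

print(quant_repo_name("Qwen", "Qwen2.5-72B-Instruct"))
# -> Qwen_Qwen2.5-72B-Instruct-GGUF
```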
reacted to tomaarsen's post with ❤️ 11 days ago
‼️ Sentence Transformers v4.0 is out! You can now train and finetune reranker models with multi-GPU training, bf16 support, loss logging, callbacks & much more. I also prove that finetuning on your domain helps much more than you might think.

1️⃣ Reranker Training Refactor
Reranker models can now be trained using an extensive trainer with a lot of powerful features:
- MultiGPU Training (Data Parallelism (DP) and Distributed Data Parallelism (DDP))
- bf16 training support; loss logging
- Evaluation datasets + evaluation loss
- Improved callback support + an excellent Weights & Biases integration
- Gradient checkpointing, gradient accumulation
- Model card generation
- Resuming from a training checkpoint without performance loss
- Hyperparameter Optimization
and much more!

Read my detailed blogpost to learn about the components that make up this new training approach: https://huggingface.co/blog/train-reranker
Notably, the release is fully backwards compatible: all deprecations are soft, meaning that they still work but emit a warning informing you how to upgrade.
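As a rough illustration of the refactored training flow described above (the base model and the toy data are placeholders; see the blogpost for real end-to-end examples):

```python
# Minimal sketch of the v4 reranker trainer; model name and data are placeholders.
from datasets import Dataset
from sentence_transformers.cross_encoder import (
    CrossEncoder,
    CrossEncoderTrainer,
    CrossEncoderTrainingArguments,
)
from sentence_transformers.cross_encoder.losses import BinaryCrossEntropyLoss

model = CrossEncoder("answerdotai/ModernBERT-base", num_labels=1)

# (query, passage, label) pairs; substitute a real dataset in practice
train_dataset = Dataset.from_dict({
    "query":   ["how do rerankers work?", "capital of france"],
    "passage": ["A reranker scores query-passage pairs.", "Bordeaux is a large city."],
    "label":   [1.0, 0.0],
})

args = CrossEncoderTrainingArguments(
    output_dir="reranker-demo",
    num_train_epochs=1,
    per_device_train_batch_size=16,
    bf16=True,         # new bf16 support
    logging_steps=10,  # new loss logging
)

trainer = CrossEncoderTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    loss=BinaryCrossEntropyLoss(model),
)
trainer.train()
```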

2️⃣ New Reranker Losses
- 11 new losses:
- 2 traditional losses: BinaryCrossEntropy and CrossEntropy
- 2 distillation losses: MSE and MarginMSE
- 2 in-batch negatives losses: MNRL (a.k.a. InfoNCE) and CMNRL
- 5 learning to rank losses: Lambda, p-ListMLE, ListNet, RankNet, ListMLE

3️⃣ New Reranker Documentation
- New Training Overview, Loss Overview, API Reference docs
- 5 new, 1 refactored training examples docs pages
- 13 new, 6 refactored training scripts
- Migration guides (2.x -> 3.x, 3.x -> 4.x)

4️⃣ Blogpost
Alongside the release, I've written a blogpost where I finetune ModernBERT on a generic question-answer dataset. My finetunes easily outperform all general-purpose reranker models, even models 4x as big. Finetuning on your domain is definitely worth it: https://huggingface.co/blog/train-reranker

See the full release notes here: https://github.com/UKPLab/sentence-transformers/releases/v4.0.1
reacted to vincentg64's post with 🔥 3 months ago
LLM 2.0, RAG & Non-Standard Gen AI on GitHub https://mltblog.com/3DsyZSq

In this article, I share my latest Gen AI and LLM advances, featuring innovative approaches radically different from both standard AI and classical ML/NLP. The focus is on doing better with less, using efficient architectures, new algorithms and evaluation metrics. It originates from research that I started long ago. It gained significant momentum in the last two years. See background and history at https://mltblog.com/4g2sKTv.

OpenAI, Perplexity, Anthropic, Llama and others typically follow the trend and implement solutions very similar to mine within 3 to 6 months after I publish new milestones. For instance, multi-tokens, knowledge graph tokens, multi-indexes, real-time fine-tuning, mixtures of experts, LLM routers, small enterprise sub-LLMs, prompt distillation, a relevancy scoring engine, deep contextual retrieval, optimum agentic chunking, and a modern UI instead of the basic prompt box. I keep adding new features all the time, staying ahead of the competition.

➡️ Read the full article, with links to GitHub, at https://mltblog.com/3DsyZSq
  • 1 reply