Samuel L Meyers

MrOvkill

AI & ML interests

Dialogue Generation, Text Generation, etc...

Recent Activity

liked a Space about 1 month ago

not-lain/background-removal

liked a model about 1 month ago

openbmb/MiniCPM-o-4_5

reacted to marksverdhei's post with 👍 about 1 month ago

Post

4578

Poll: Will 2026 be the year of subquadratic attention?

The transformer architecture is cursed by its computational complexity.
It is why you run out of tokens and have to compact. But some would argue that this is a feature, not a bug, and that this is also why these models are so good. We've been doing a lot of research on trying to make equally good models that are computationally cheaper, but so far, none of the approaches have stood the test of time. Or so it seems.

Please vote, don't be shy. Remember that the Dunning-Kruger effect is very real, so the person who knows less about transformers than you is going to vote. We want everyone's opinion, no matter confidence.

👍 if you think at least one frontier model* will have no O(n^2) attention by the end of 2026
🔥 if you disagree

* Frontier models - models that match / outperform the flagship claude, gemini or chatgpt at the time on multiple popular benchmarks
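To make the O(n^2) in the poll concrete, here is a minimal arithmetic sketch. It only counts query-key score entries per layer. The sliding-window variant is an illustrative assumption (one of several known subquadratic approaches), and the function names and window size are hypothetical, not taken from the post.

```python
def full_attention_scores(seq_len: int, num_heads: int = 1) -> int:
    """Query-key scores computed by full self-attention in one layer."""
    # Every token attends to every token, so the count grows as n * n.
    return num_heads * seq_len * seq_len


def windowed_attention_scores(seq_len: int, window: int, num_heads: int = 1) -> int:
    """Scores for an assumed sliding-window variant: each token sees at most `window` keys."""
    # The per-token cost is capped, so the count grows linearly once n exceeds the window.
    return num_heads * seq_len * min(window, seq_len)


for n in (1_000, 10_000, 100_000):
    full = full_attention_scores(n)
    local = windowed_attention_scores(n, window=512)
    print(f"n={n:>7,}  full={full:>15,}  windowed={local:>12,}")
```

Doubling the context quadruples the full-attention count but only doubles the windowed one, which is the trade-off the poll is asking about.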

4 replies

replied to ProCreations's post 11 months ago

Well, personally I'm deeply torn on the subject.
On one hand, you've got Google claiming their liquid nitrogen baby can finally perform "stable enough" Quantum Error Correction to begin working on actual math problems and taking advantage of the immense cross-connectivity and data bandwidth. And they're absolutely correct that if the QEC is good enough, it can obviously enhance the speed of the LLM.

On the other hand... I absolutely adore the fact that these tools are SO open and SO portable that I can quite literally create the entire AI model myself, from scratch, on my desktop if I want to and have lots of time. My biggest concern is that very few, if any, private citizens are going to be capable of maintaining a liquid-nitrogen-cooled quantum mainframe in their basement. It's not just the stereotypical "nerd virtual girlfriend" type of uses I'm concerned about, either. How many datasets on this very website would have been utterly impossible if everyone had to queue up for supercomputer time every. Single. Training. Loop.

So naturally I'm highly concerned, as should we all be, that relying on quantum computing for anything other than the most onerous, resource-hogging workloads will end up dooming hobbyist AI and cause us quite a few problems down the road, when corporations realize the sheer scope of psychological warfare they can inflict on their customers at will to make them more profitable and helpless.

TL;DR: Quantum AI has lots of potential for good and bad, but we all, as an open-source-first community, must focus on what we can improve, maintain, and sustain with our own equipment first.

reacted to ProCreations's post with 👀 11 months ago

Post

1386

Quantum Computing + AI = 🤯?
What do you think quantum computing will do to AI?
Will it revolutionize training speed? Unlock whole new algorithms? Or maybe… just complicate things?

💬 Drop your thoughts below — we’ll share our take and highlight some of your replies in tomorrow’s post!

3 replies

posted an update 11 months ago

Post

494

Hello!

Got permission to make a quick announcement for DigitalClockwork, and we're happy to say that granular GGUF quantization within Colab is now easy-peasy!

And no, we did not invent magic (yet. Wait til we get us some time and funding...), nor did we create a super-model-glue. It still needs to be a model llama.cpp supports.

https://colab.research.google.com/drive/1s60fNyiaLckl0ZscAC0vhnW7KZU0yn5n?usp=sharing

replied to their post about 1 year ago

Good question!
Before, you had to download the .wav file from Colab. I have now added an Audio display from IPython, and will be cleaning up for a future post. Sorry for the crap 1st release, the next will be much better. I usually work in a copy so as not to disturb the original.

posted an update about 1 year ago

Post

2358

Hello!

I was just playing around with Python's MIDI library and Colab's code generation, and accidentally cooked up a quick n' dirty audio synthesis template.
Have fun!

https://colab.research.google.com/drive/1d-AF6jygCwmoJvAa9nnEMe5ROidnMJNY?usp=sharing

-<3
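For readers who have not opened the notebook, here is a minimal sketch of what a quick n' dirty synthesis template can look like. It is not the notebook's code: it swaps in sine-wave rendering with only the Python standard library, and the function names, sample rate, and output file name are all assumptions made for illustration.

```python
import math
import struct
import wave

SAMPLE_RATE = 22_050  # samples per second (assumed value)


def midi_to_hz(note: int) -> float:
    """Convert a MIDI note number to a frequency; note 69 is A4 = 440 Hz."""
    return 440.0 * 2 ** ((note - 69) / 12)


def synth_note(note: int, seconds: float, volume: float = 0.5) -> bytes:
    """Render one sine-wave note as 16-bit little-endian mono PCM."""
    n_samples = int(SAMPLE_RATE * seconds)
    freq = midi_to_hz(note)
    frames = bytearray()
    for i in range(n_samples):
        sample = volume * math.sin(2 * math.pi * freq * i / SAMPLE_RATE)
        frames += struct.pack("<h", int(sample * 32767))
    return bytes(frames)


def write_melody(path: str, notes: list[int], seconds_per_note: float = 0.25) -> int:
    """Write the notes back to back into a WAV file and return the frame count."""
    pcm = b"".join(synth_note(n, seconds_per_note) for n in notes)
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)   # mono
        wav.setsampwidth(2)   # 16-bit samples
        wav.setframerate(SAMPLE_RATE)
        wav.writeframes(pcm)
    return len(pcm) // 2


# Hypothetical usage: the first five notes of a C major scale.
frame_count = write_melody("melody.wav", [60, 62, 64, 65, 67])
print(f"wrote {frame_count} frames to melody.wav")
```

In a Colab or Jupyter cell, `IPython.display.Audio("melody.wav")` should play the result inline instead of requiring a download.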

3 replies

reacted to samihalawa's post with 🔥 about 1 year ago

Post

1992

🚀 OpenAI o3-mini Just Dropped – Here’s What You Need to Know!

OpenAI just launched o3-mini, a faster, smarter upgrade over o1-mini. It’s better at math, coding, and logic, making it more reliable for structured tasks. Now available in ChatGPT & API, with function calling, structured outputs, and system messages.

🔥 Why does this matter?
✅ Stronger in logic, coding, and structured reasoning
✅ Function calling now works reliably for API responses
✅ More stable & efficient for production tasks
✅ Faster responses with better accuracy

⚠️ Who should use it?
✔️ Great for coding, API calls, and structured Q&A
❌ Not meant for long conversations or complex reasoning (GPT-4 is better)

💡 Free users: Try it under “Reason” mode in ChatGPT
💡 Plus/Team users: Daily message limit tripled to 150/day!

2 replies

reacted to MoritzLaurer's post with ❤️ about 1 year ago

Post

1780

The TRL v0.13 release is 🔥! My highlights are the new process reward trainer to train models similar to o1 and tool call support:

🧠 Process reward trainer: Enables training of Process-supervised Reward Models (PRMs), which reward the quality of intermediate steps, promoting structured reasoning. Perfect for tasks like stepwise reasoning.

🔀 Model merging: A new callback leverages mergekit to merge models during training, improving performance by blending reference and policy models - optionally pushing merged models to the Hugging Face Hub.

🛠️ Tool call support: TRL preprocessing now supports tool integration, laying the groundwork for agent fine-tuning with examples like dynamic temperature fetching in prompts.

⚖️ Mixture of judges: The new AllTrueJudge combines decisions from multiple binary judges for more nuanced evaluation.

Read the release notes and other resources here 👇
Release: https://github.com/huggingface/trl/releases/tag/v0.13.0
Mergekit: https://github.com/arcee-ai/mergekit
Mixture of judges paper: The Perfect Blend: Redefining RLHF with Mixture of Judges (2409.20370)
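The "mixture of judges" idea above can be sketched in plain Python as a rule where every binary judge must approve. This is not TRL's AllTrueJudge class or its API. The type alias, function names, and toy judges below are hypothetical and only illustrate the combination rule.

```python
from typing import Callable, Sequence

# A binary judge maps (prompt, completion) to True/False.
# Hypothetical type for this sketch, not TRL's judge interface.
BinaryJudge = Callable[[str, str], bool]


def all_true(judges: Sequence[BinaryJudge], prompt: str, completion: str) -> bool:
    """Accept a completion only if every judge accepts it."""
    return all(judge(prompt, completion) for judge in judges)


def is_non_empty(prompt: str, completion: str) -> bool:
    """Toy judge: the completion must contain something other than whitespace."""
    return bool(completion.strip())


def is_concise(prompt: str, completion: str) -> bool:
    """Toy judge: the completion must be at most 50 words (arbitrary limit)."""
    return len(completion.split()) <= 50


judges = [is_non_empty, is_concise]
print(all_true(judges, "What is GGUF?", "A file format used by llama.cpp."))
print(all_true(judges, "What is GGUF?", "   "))
```

A single failing judge vetoes the completion, so this rule is strict: adding judges can only make acceptance harder.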

replied to mlabonne's post over 1 year ago

I am personally of the opinion that the larger models, especially technically proficient ones like Claude or 4o, have likely been intentionally 'broken' from storytelling, as they have become much more helpful and critical in their role as co-engineers. I have personally conscripted Claude for some testing, and it's given me about 1/3 of an AI model that I basically only had to design and fix, instead of considering every detail without knowing the interactions. This lack of hallucination and skill for deterministic writing likely detracts from any creative elements present. Picture a highly autistic person with a savant's gift for programming and logic. This person would be a genius at code, but likely poor at creative writing unless instructed. The same would be true of a synthetic mind given only factual and grounded data for much of its training, as Anthropic seems to be doing for (obvious) safety reasons.

reacted to mlabonne's post with 🤗 over 1 year ago

Post

19711

Large models are surprisingly bad storytellers.

I asked 8 LLMs to "Tell me a bedtime story about bears and waffles."

Claude 3.5 Sonnet and GPT-4o gave me the worst stories: no conflict, no moral, zero creativity.

In contrast, smaller models were quite creative and wrote stories involving talking waffle trees and bears ostracized for their love of waffles.

Here you can see a comparison between Claude 3.5 Sonnet and NeuralDaredevil-8B-abliterated. They both start with a family of bears but quickly diverge in terms of personality, conflict, etc.

I mapped it to the hero's journey to have some kind of framework. Prompt engineering can definitely help here, but it's still disappointing that the larger models don't create better stories right off the bat.

Do you know why smaller models outperform the frontier models here?

44 replies

posted an update over 1 year ago

Post

1230

Hello!

I've been in the lab synthesizing captions, with my trusty sidekick Blip, and along the way I had an interesting idea. I thought of designing an incredibly simple model that accepts simple instruction pairs, adjective noun pairs specifically, and outputs 2d vertices.

The current implementation was written by me and then run over with Claude, not because I am incompetent, but because I recognize that tools written by experts may have more technique than my newbie self.

As with all projects, this will be updated in proportion to the feedback received. If someone's using it and wants to keep using it, I'm happy to keep working on anything. Thanks, all! 🤗

-<3

https://colab.research.google.com/gist/SMeyersMrOvkill/8d4686db803f6c5f43fafc1c94b1c8c6/polypathdelement.ipynb

posted an update over 1 year ago

Post

2130

Hello!

I've been in the lab, I think one or two of you saw my furtive attempts to create a dolphinized 2b Gemma, which is still waiting for more funding. I get paid in a week.

Once that funding ran out, I dropped my last pinch of API credits to work on this:

DigitalClockwork/spatial_instruct_v1

It is an instruct database for spatial interactions with color tokens, which I'm planning to use to tune a TBD model. Been experimenting with Gemma, but I welcome (smaller!) model suggestions. If you think your favorite 0.5/0.75/1/2b can handle numbers, distances, or colors especially well, most especially community-enhanced models... I'm listening to the comments, intently!
Have a great day, and enjoy! This one was fun! 🤗

-<3

reacted to not-lain's post with ❤️ over 1 year ago

Post

2738

I have finished writing a blog post about building an image-based retrieval system. This is one of the first-ever approaches to building such a pipeline using only open-source models/libraries 🤗

You can check out the blog post at https://huggingface.co/blog/not-lain/image-retriever and the associated space at not-lain/image-retriever.

✨ If you want to request another blog post consider letting me know down below or you can reach out to me through any of my social media

📖 Happy reading !

replied to their post over 1 year ago

You aren't the one flaming. The others though...

Anyway yes, it's being improved now. Been in the lab since that post. The CO-lab...

replied to their post over 1 year ago

As did 'takera' author your thoughts, apparently. You're like snowflakes, each of you.

replied to their post over 1 year ago

I was testing some plugins, it didn't occur to me the default installations of some of the most commonly used plugins would cause issues. I apologize for the horrifying inconvenience that you may have suffered at the hands of my blog. It does, after all, have such large and pointy teeth. Oh. Wait...

posted an update over 1 year ago

Post

850

Hello!

I've been playing with Claude, and we decided to tackle a real thorn in my side.

"The Truthiness Model" - Analyze arbitrary input text for "truthiness", or likelihood of containing true information according to seed text.

P.S. Yes, v1 was broken. I saw the loss rate going down and got excited. Anyway, it just needed some data and a rollback; me and Claude got WAY too carried away trying to tack on features.

Anyway, fixed now, and working! :D

http://samuelmeyerscode.serveblog.net/?p=49

8 replies

replied to their post over 1 year ago

I'm so glad the data proved helpful! Keep me updated. I'm already a follower, looking forward to seeing more! As always, ask if you need anything.

replied to their post over 1 year ago

You got through to me again:
https://huggingface.co/posts/MrOvkill/139983484226395

posted an update over 1 year ago

Post

649

Hello!

https://www.youtube.com/watch?v=6NyDkpfNfUs

I had some feedback recently, that perhaps it would be beneficial to expand upon the fallacy dataset. I took this deeply to heart, and exploded it 10x.

MrOvkill/fallacies-fallacy-base

Produced synthetically with *ALL* the Gemini models on Vertex AI.

*phew* This was a rush. I can promise over 8 hours, it might have been like 16, of straight prompt/copy/paste/fix/re-splice/fix/prompt again/chug caffeine/repeat, but we got there! Thanks for egging me on, all! I appreciate being driven to work! So much better than boredom! 🤗

Have fun!
