All HF Hub posts

danielhanchen
posted an update 3 days ago
Gemma 4 12B can now run locally on just 8GB RAM via Dynamic GGUFs.

Google's new model, Gemma 4 12B Unified, supports image and audio input and a 256K context window.
You can run and train the model via Unsloth Studio.

GGUF: unsloth/gemma-4-12b-it-GGUF
Guide: https://unsloth.ai/docs/models/gemma-4
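
As a rough sanity check on the 8GB figure: the weight footprint of a quantized model is roughly parameters × bits per weight ÷ 8. The sketch below assumes a hypothetical average of 4 bits per weight; the post does not state the actual Dynamic GGUF precision mix, and the KV cache and runtime overhead come on top of this number.

```python
def weight_bytes(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights alone, in bytes."""
    return n_params * bits_per_weight / 8


# Assumption: about 4 bits per weight on average (hypothetical, not from the post).
size_gb = weight_bytes(12e9, 4.0) / 1e9
print(f"{size_gb:.1f} GB")  # 6.0 GB
```

Under that assumption the weights alone would take about 6 GB, which leaves limited headroom for context on an 8GB machine.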
AxionLab-official
posted an update about 16 hours ago
THIS IS CRAZY! THE MODEL IN THE IMAGE (Supra-50M-Reasoning) ANSWERED CORRECTLY, AND IT'S QUANTIZED TO 2-BIT! THE RESPONSE IS CORRECT, FROM A 15MB FILE!
pankajpandey-dev
posted an update about 7 hours ago
🇮🇳 Gemma-3-1B Hindi Instruct: a Hindi LLM that runs fully offline, anywhere.
Last week I shipped Qwen3-4B Hindi. This week I went the other direction: how tiny can a useful Hindi model get? So I fine-tuned Gemma-3-1B on quality-filtered Hindi instruction data and shipped the full GGUF ladder.
✅ Fine-tune (16-bit): pankajpandey-dev/gemma-3-1b-hindi-instruct
✅ GGUF (Q4/Q5/Q8): pankajpandey-dev/gemma-3-1b-hindi-instruct-GGUF
Runs in Ollama, llama.cpp, and LM Studio. The Q4_K_M is just 806 MB, so it runs on CPU, a cheap laptop, even a Raspberry Pi.
What I tried this round: chrF-filtered the training data to drop weak translations, and used response-only loss so the model learns how to answer, not how to repeat prompts.
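
For readers new to these two ideas, here is a minimal sketch. The post does not say what served as the chrF reference, which threshold was used, or which trainer applied the loss mask, so everything below is an assumption: `chrf_like` is a simplified character n-gram F-score (not sacreBLEU's exact chrF), and `mask_prompt_labels` uses `-100`, the label value that PyTorch's `CrossEntropyLoss` ignores by default.

```python
from collections import Counter

IGNORE_INDEX = -100  # skipped by PyTorch's CrossEntropyLoss by default


def char_ngrams(text, n):
    """Count character n-grams, ignoring spaces."""
    text = text.replace(" ", "")
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def chrf_like(hypothesis, reference, max_n=6, beta=2.0):
    """Simplified chrF-style score in [0, 1]; an illustration, not the official metric."""
    precisions, recalls = [], []
    for n in range(1, max_n + 1):
        hyp, ref = char_ngrams(hypothesis, n), char_ngrams(reference, n)
        if not hyp or not ref:
            continue  # one of the strings is shorter than n characters
        overlap = sum((hyp & ref).values())
        precisions.append(overlap / sum(hyp.values()))
        recalls.append(overlap / sum(ref.values()))
    if not precisions:
        return 0.0
    p = sum(precisions) / len(precisions)
    r = sum(recalls) / len(recalls)
    if p + r == 0:
        return 0.0
    return (1 + beta ** 2) * p * r / (beta ** 2 * p + r)


def mask_prompt_labels(input_ids, prompt_len):
    """Response-only loss: hide prompt tokens from the loss, keep response tokens."""
    return [IGNORE_INDEX] * prompt_len + list(input_ids[prompt_len:])


# Hypothetical usage: keep a pair only if its score clears an assumed threshold.
keep = chrf_like("namaste duniya", "namaste duniya") >= 0.5
labels = mask_prompt_labels([11, 12, 13, 21, 22], prompt_len=3)
```

With the mask in place, only the last two token positions contribute to the loss, which is what lets the model learn to answer rather than to reproduce the prompt.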
Honest note: at 1B, Hindi fluency is strong but coherence is bounded by size. It's a lightweight/edge experiment, not a 4B replacement. Gemma-3-4B Hindi is next.
Part of my Hindi LLM Series: openly licensed Indic models for local & edge use. Feedback welcome 🙏
#Hindi #IndicNLP #GGUF #LocalLLM #Gemma #EdgeAI
kanaria007
posted an update about 11 hours ago
✅ Article highlight: *Interop Schemas for Learning-World Governance Artifacts* (art-60-175, v0.1)

TL;DR:
This article argues that governance without interop is vendor-local theater.

It is not enough for one system to say *“we have receipts.”* If another vendor cannot parse the artifact, reproduce the digest, replay the bundle, and reach the same admissibility outcome, the claim is not really portable. So 175 defines a common interop layer: shared envelopes, pinned canonicalization, minimal portable schemas, and deterministic bundle formats.

Read:
kanaria007/agi-structural-intelligence-protocols

Why it matters:
• turns governance artifacts into cross-vendor verifiable objects rather than local implementation details
• fixes the classic failure modes of digest drift, schema drift, and bundle drift
• makes “same artifact / same verdict” a testable claim instead of a handshake promise
• gives courts, forgetting flows, and unlearning claims portable bundle formats

What’s inside:
• a common *interop envelope* for contracts, manifests, receipts, and bundles
• a pinned *canonicalization profile* plus conformance receipts to stop digest disagreements
• minimal portable schemas for core learning-world governance artifacts
• deterministic bundle formats like *Court ZIP*, *Forgetting ZIP*, and *Unlearning ZIP*
• replay/conformance receipts so another vendor can verify the same bundle and reach the same admissibility result
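
To make "pinned canonicalization" and "deterministic bundle" concrete, here is a minimal sketch under stated assumptions. The article's actual canonicalization profile and ZIP layouts are not given in this post, so `canonical_bytes`, `digest`, and `build_bundle` are hypothetical helpers: sorted-key compact JSON hashed with SHA-256, and a ZIP whose entry order, timestamps, and attributes are fixed. A production profile would also need to pin number formatting and Unicode normalization (RFC 8785 is one existing scheme).

```python
import hashlib
import io
import json
import zipfile


def canonical_bytes(obj):
    """Serialize to one byte sequence regardless of key order or whitespace."""
    text = json.dumps(obj, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return text.encode("utf-8")


def digest(obj):
    """Digest over the canonical form, so two vendors hash the same bytes."""
    return "sha256:" + hashlib.sha256(canonical_bytes(obj)).hexdigest()


def build_bundle(files):
    """Build a ZIP whose bytes depend only on file names and contents."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name in sorted(files):  # fixed entry order
            info = zipfile.ZipInfo(name, date_time=(1980, 1, 1, 0, 0, 0))  # fixed timestamp
            info.compress_type = zipfile.ZIP_STORED  # avoid compressor-version differences
            info.create_system = 3  # do not let the host OS leak into the header
            info.external_attr = 0o644 << 16  # fixed permissions
            zf.writestr(info, files[name])
    return buf.getvalue()


receipt_a = {"artifact": "receipt", "version": "0.1", "items": [1, 2]}
receipt_b = {"items": [1, 2], "version": "0.1", "artifact": "receipt"}
same_digest = digest(receipt_a) == digest(receipt_b)
```

The design point is that every degree of freedom a serializer or archiver normally leaves open is pinned, so "reproduce the digest" becomes a byte-for-byte check.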

Key idea:
Do not say:

*“our system can export the evidence.”*

Say:

*“this artifact uses this schema registry, this canonicalization profile, this interop-safe digest model, and this bundle index, so another vendor can verify the same object and reach the same result.”*

That is how governance stops being local theater and becomes portable infrastructure.
RiverRider
posted an update 2 days ago
Words do not have determined meanings.

The vocabulary itself is reflexive. It is self-referential, looping back into its own structure rather than anchoring in fixed reality. What we treat as stable meaning is continually reconstituted in the act of using it. The observer's own interpretations mold each word like clay with every utterance.

All large language models to date treat words otherwise. At the moment of softmax crystallization they determine the meaning of every token. Probabilities collapse into a single output. Meaning is not found. It is fixed, token by token, in that final distribution.
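
As a small illustration of the step described above, the sketch below turns raw scores into a probability distribution with softmax and then commits to one token. The three-word vocabulary and the logits are hypothetical, and picking the most likely token (greedy decoding) is an assumption; many decoders sample from the distribution instead.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


vocab = ["bank", "river", "money"]  # hypothetical vocabulary
logits = [2.0, 1.0, 0.1]            # hypothetical model scores
probs = softmax(logits)

# Greedy choice: the whole distribution is reduced to a single token.
chosen = vocab[max(range(len(probs)), key=probs.__getitem__)]
```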

SRT-Introspect is a demo for observing what Qwen actually thinks at the points of highest effort. It surfaces the internal representations during generation, making visible the reflexive vocabulary at work and the precise crystallization process: the weights, the assumptions, the decisions that resolve ambiguity into output. This includes accounting for anisotropy collapse in hidden states by centering representations around the layer-mean before analysis.
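
The centering step mentioned above can be sketched in a few lines. This is not taken from the SRT repo; it assumes "layer-mean" means the average hidden vector over the tokens at one layer, and it uses two hypothetical 2-dimensional vectors to show why centering matters: a large shared offset makes every pair look similar until it is removed.

```python
import math


def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def center(vectors):
    """Subtract the mean vector (assumed here to be the layer-mean) from each vector."""
    dim = len(vectors[0])
    mean = [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]
    return [[v[i] - mean[i] for i in range(dim)] for v in vectors]


# Hypothetical hidden states that share a large common component.
hidden = [[10.0, 1.0], [10.0, -1.0]]
before = cosine(hidden[0], hidden[1])      # close to 1: the vectors look alike
centered = center(hidden)
after = cosine(centered[0], centered[1])   # the real difference becomes visible
```

Before centering the two vectors appear almost identical; afterwards they point in opposite directions.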

Feel free to comment with your prompts.

RiverRider/srt-introspect

Repo
https://github.com/space-bacon/SRT
ovi054
posted an update 6 days ago
Color Grading Transfer ⚡

ovi054/Color-Grade-Transfer-Qwen-Image

What if you could steal the color grade from your favorite films or any still image and apply it to your own content? And no, you don't need to be a professional colorist.

Input 1: Source Image - Content to be preserved
Input 2: Reference Image - Any still from films
Output: Color graded output image
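
The post does not describe how the Space works internally (its name suggests a Qwen-Image model), so the following is not its method. For intuition about the input/output contract only, here is a classical baseline swapped in for illustration: match each RGB channel's mean and spread in the source to those of the reference. Pixels are plain `(r, g, b)` tuples and `transfer_color_stats` is a hypothetical helper.

```python
from statistics import fmean, pstdev


def transfer_color_stats(source, reference):
    """Shift and scale each source channel to match the reference's mean and spread."""
    out_channels = []
    for ch in range(3):
        src = [px[ch] for px in source]
        ref = [px[ch] for px in reference]
        src_mean, ref_mean = fmean(src), fmean(ref)
        src_std, ref_std = pstdev(src), pstdev(ref)
        scale = ref_std / src_std if src_std > 0 else 1.0  # flat channel: shift only
        out_channels.append(
            [min(255.0, max(0.0, (v - src_mean) * scale + ref_mean)) for v in src]
        )
    return list(zip(*out_channels))  # back to a list of (r, g, b) pixels


source = [(10, 20, 30), (30, 40, 50)]            # content to preserve
reference = [(100, 100, 100), (120, 140, 160)]   # look to borrow
graded = transfer_color_stats(source, reference)
```

The source's pixel ordering (its content) is kept, while the overall color statistics move toward the reference.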

👉 Try it now: ovi054/Color-Grade-Transfer-Qwen-Image
barakor
posted an update 7 days ago
Hi everyone,
I just published a new blog post from State16 about runtime integrity for Physical AI systems. The core question we explore is simple:
Can a predicted trajectory physically exist before it reaches the controller?

In many autonomous systems, a model can output a trajectory with a very high confidence score, but that trajectory may still violate basic physical constraints such as motion feasibility, velocity, acceleration, continuity, or interaction with the environment.
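
A minimal sketch of what such a pre-execution check could look like, assuming a 2D trajectory sampled at a fixed time step. This is not State16's method; `check_trajectory` and the limits `v_max` and `a_max` are hypothetical, and the check covers only velocity and acceleration via finite differences.

```python
import math


def check_trajectory(points, dt, v_max, a_max):
    """Return (step, kind, value) for every velocity or acceleration limit violation."""
    velocities = [
        ((x1 - x0) / dt, (y1 - y0) / dt)
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    ]
    violations = []
    for i, (vx, vy) in enumerate(velocities):
        speed = math.hypot(vx, vy)
        if speed > v_max:
            violations.append((i, "velocity", speed))
    for i, ((vx0, vy0), (vx1, vy1)) in enumerate(zip(velocities, velocities[1:])):
        accel = math.hypot(vx1 - vx0, vy1 - vy0) / dt
        if accel > a_max:
            violations.append((i + 1, "acceleration", accel))
    return violations


# Hypothetical limits: 2 m/s and 5 m/s^2, sampled every 0.1 s.
smooth = [(0.0, 0.0), (0.1, 0.0), (0.2, 0.0), (0.3, 0.0)]
jumpy = [(0.0, 0.0), (0.1, 0.0), (1.0, 0.0), (1.1, 0.0)]
ok = check_trajectory(smooth, dt=0.1, v_max=2.0, a_max=5.0)
bad = check_trajectory(jumpy, dt=0.1, v_max=2.0, a_max=5.0)
```

The model's confidence score never enters the check: the jumpy trajectory is rejected on physical grounds alone, before it would reach a controller.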

In our first paper and validation work, we tested this idea on LeRobot Push from Hugging Face. The goal is to detect physically inadmissible AI-generated behaviors before execution, especially in robotics and autonomous systems where a “confident” prediction is not enough.

Would be glad to get thoughts, feedback, and suggestions from the community.

Read here: https://huggingface.co/blog/barakor/can-predicted-dynamics-exist-in-the-physical-world


TravisMuhlestein
posted an update 8 days ago
We have model cards. We don’t yet have capability manifests. That’s the gap DNS-AID points toward.

The Linux Foundation just launched DNS-AID: open, decentralized discovery infrastructure for AI agents.

🔗 https://www.linuxfoundation.org/press/linux-foundation-announces-dns-aid-project-to-advance-decentralized-ai-agent-discovery

Most agent frameworks today still assume agents already know where other agents and tools exist. That assumption starts breaking down in cross-platform and cross-organization workflows.

My hypothesis: agent ecosystems eventually need standardized schemas describing not just what model an agent runs, but what it can actually do: tool interfaces, invocation patterns, input/output contracts, trust metadata, operational constraints, etc. Something orchestrators and other agents can discover and reason about dynamically without hardcoded integrations.
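
To make the idea tangible, here is one way such a capability manifest might be shaped. This is purely illustrative and is not DNS-AID's schema; every class and field name below (`ToolSpec`, `CapabilityManifest`, `trust`, `constraints`, `find_agents`) is hypothetical, loosely following the categories listed above.

```python
from dataclasses import dataclass, field


@dataclass
class ToolSpec:
    """One callable capability with its input/output contract."""
    name: str
    input_schema: dict
    output_schema: dict


@dataclass
class CapabilityManifest:
    """What an agent can do, as opposed to which model it runs."""
    agent_id: str
    endpoint: str
    tools: list = field(default_factory=list)
    trust: dict = field(default_factory=dict)        # e.g. signer, attestation
    constraints: dict = field(default_factory=dict)  # e.g. rate limits, regions


def find_agents(manifests, tool_name):
    """Discovery by capability: which agents expose a tool with this name?"""
    return [m.agent_id for m in manifests if any(t.name == tool_name for t in m.tools)]


translator = CapabilityManifest(
    agent_id="agent-a",
    endpoint="https://example.org/agent-a",  # placeholder URL
    tools=[ToolSpec("translate", {"text": "string"}, {"text": "string"})],
    constraints={"max_requests_per_minute": 60},
)
summarizer = CapabilityManifest(
    agent_id="agent-b",
    endpoint="https://example.org/agent-b",  # placeholder URL
    tools=[ToolSpec("summarize", {"text": "string"}, {"summary": "string"})],
)
matches = find_agents([translator, summarizer], "summarize")
```

An orchestrator holding only these records could route a request by capability without any hardcoded knowledge of either agent.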

Feels like an area the open-source ecosystem could meaningfully shape early before proprietary registries and platform lock-in dominate the space.

Curious if others here are already working on interoperability, discovery, capability schemas, or agent routing layers. Would love to compare notes.