HFforLegal (Hugging Face for Legal)

AdinaY

submitted a paper to Daily Papers about 21 hours ago

Planning in 8 Tokens: A Compact Discrete Tokenizer for Latent World Model

Paper • 2603.05438 • Published 4 days ago • 29

umarbutler

posted an update 4 days ago

This awesome visualization by @abdurrahmanbutler tracks how reliant the High Court of Australia has been on UK precedents over time.

Back in the early 1900s, up to 70% of citations in High Court decisions were from the UK. Today, that number sits around 20%.

This change seems to have happened gradually as Australia gained more and more independence from the UK, culminating in the Australia Acts of 1986, where we see a nice bump in the proportion of Australian cases cited.

These insights would not be possible without our latest legal AI model, Kanon 2 Enricher, which we used to extract dates and citations from High Court decisions in isaacus/open-australian-legal-corpus and categorize citations by jurisdiction. You can learn about Kanon 2 Enricher here: https://isaacus.com/blog/kanon-2-enricher.
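
The kind of aggregation described above can be sketched in a few lines. This is only an illustration, assuming each extracted citation has already been reduced to a (decision year, cited jurisdiction) pair; the records below are made-up toy values, not data from the corpus, and `share_by_decade` is a hypothetical helper rather than part of any Isaacus tooling.

```python
from collections import Counter

# Toy records (hypothetical): (year of the citing decision, jurisdiction of the cited case).
citations = [
    (1905, "UK"), (1905, "UK"), (1905, "AU"), (1908, "UK"),
    (1995, "AU"), (1995, "AU"), (1995, "UK"), (1998, "AU"), (1998, "AU"),
]

def share_by_decade(records, jurisdiction):
    """Return {decade: fraction of citations in that decade pointing at `jurisdiction`}."""
    totals = Counter()
    hits = Counter()
    for year, cited in records:
        decade = year // 10 * 10  # e.g. 1905 -> 1900
        totals[decade] += 1
        if cited == jurisdiction:
            hits[decade] += 1
    return {decade: hits[decade] / totals[decade] for decade in sorted(totals)}

uk_share = share_by_decade(citations, "UK")
print(uk_share)
```

Grouping by decade rather than by year smooths out years with few decisions; a finer or coarser bucket would work the same way.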

umarbutler

authored a paper 7 days ago

Legal RAG Bench: an end-to-end benchmark for legal RAG

Paper • 2603.01710 • Published 8 days ago • 7

umarbutler

submitted a paper to Daily Papers 7 days ago

Legal RAG Bench: an end-to-end benchmark for legal RAG

Paper • 2603.01710 • Published 8 days ago • 7

Tonic

posted an update 17 days ago

🤔 Who would win?

- a fully subsidized ai lab
OR
- 3 random students named kurakurai?

demo : Tonic/fr-on-device

if you like it, give the demo a little star and send a shoutout to @MaxLSB, @jddqd, and @GAD-cell for absolutely obliterating the pareto frontier of French language understanding.

umarbutler

posted an update 18 days ago

@abdurrahmanbutler and I just dropped Legal RAG Bench, the first benchmark for legal RAG systems to simultaneously evaluate hallucinations, retrieval failures, and reasoning errors.

Our key takeaways are:
1. Embedding models, not generative models, are the primary driver of RAG accuracy. Switching from a general-purpose embedder like OpenAI's Text Embedding 3 Large to a legal domain embedder like Isaacus' Kanon 2 Embedder can raise accuracy by ~19 points.
2. Hallucinations are often triggered by retrieval failures. Fix your retrieval stack, and, in most cases, you end up fixing hallucinations.
3. Once you have a solid legal retrieval engine like Kanon 2 Embedder, it doesn’t matter as much what generative model you use; GPT-5.2 and Gemini 3.1 Pro perform relatively similarly, with Gemini 3.1 Pro achieving slightly better accuracy at the cost of more hallucinations.
4. Google's latest LLM, Gemini 3.1 Pro, is actually a bit worse than its predecessor at legal RAG, achieving 79.3% accuracy instead of 80.3%.
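
To make the relationship between retrieval failures and hallucinations concrete, here is a minimal sketch of how per-question outcomes could be tallied. It is not the scoring code used by Legal RAG Bench; the `Result` fields, the `summarize` helper, and the toy results are all assumptions made for illustration.

```python
from dataclasses import dataclass

@dataclass
class Result:
    retrieved_gold: bool  # was the relevant passage among the retrieved passages?
    grounded: bool        # is the answer supported by what was retrieved?
    correct: bool         # does the answer match the reference answer?

def summarize(results):
    """Tally accuracy, retrieval failures, hallucinations, and how often the two co-occur."""
    n = len(results)
    hallucinated = [r for r in results if not r.grounded]
    after_miss = [r for r in hallucinated if not r.retrieved_gold]
    return {
        "accuracy": sum(r.correct for r in results) / n,
        "retrieval_failure_rate": sum(not r.retrieved_gold for r in results) / n,
        "hallucination_rate": len(hallucinated) / n,
        "hallucinations_after_retrieval_failure": len(after_miss),
    }

# Toy outcomes (hypothetical), one per benchmark question.
results = [
    Result(retrieved_gold=True, grounded=True, correct=True),     # clean success
    Result(retrieved_gold=True, grounded=True, correct=False),    # reasoning error
    Result(retrieved_gold=False, grounded=False, correct=False),  # hallucination after a miss
    Result(retrieved_gold=False, grounded=True, correct=False),   # miss, no hallucination
]
summary = summarize(results)
print(summary)
```

Keeping the three flags separate, rather than forcing each question into a single error bucket, is what lets a tally like this show how many hallucinations follow a retrieval miss.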

These findings confirm what we already knew at Isaacus: that information retrieval sets the ceiling on the accuracy of legal RAG systems. It doesn’t matter how smart you are; you aren’t going to magically know what the penalty is for speeding in California without access to an up-to-date copy of the California Vehicle Code.

Even so, to our knowledge, we’re the first to actually show this empirically.

Unfortunately, as we highlight in our write-up, high-quality open legal benchmarks like Legal RAG Bench and our earlier MLEB are few and far between.

In the interests of transparency, we have not only detailed exactly how we built Legal RAG Bench, but we’ve also released all of our data openly on Hugging Face. You can read our write-up [here](https://isaacus.com/blog/legal-rag-bench), noting that we’ll soon be publishing it as a paper.

Kudos to my brother @abdurrahmanbutler for serving as the lead author on this monumental release.

Tonic

posted an update 21 days ago

🙋🏻‍♂️hello my lovelies ,

it is with great pleasure i present to you my working one-click, completely free Hugging Face Spaces deployment with 16GB RAM.

repo : Tonic/hugging-claw (use git clone to inspect)
literally the one-click link : Tonic/hugging-claw

you can also run it locally and see for yourself :

docker run -it -p 7860:7860 --platform=linux/amd64 \
-e HF_TOKEN="YOUR_VALUE_HERE" \
-e OPENCLAW_GATEWAY_TRUSTED_PROXIES="YOUR_VALUE_HERE" \
-e OPENCLAW_GATEWAY_PASSWORD="YOUR_VALUE_HERE" \
-e OPENCLAW_CONTROL_UI_ALLOWED_ORIGINS="YOUR_VALUE_HERE" \
registry.hf.space/tonic-hugging-claw:latest

just a few quite minor details i'll take care of but i wanted to share here first

AdinaY

posted an update 25 days ago

MiniMax M2.5 is now available on the hub 🚀

MiniMaxAI/MiniMax-M2.5

✨ 229B - Modified MIT license
✨ 37% faster than M2.1
✨ ~$1/hour at 100 TPS
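
For a rough sense of scale, the last figure can be converted into a per-token price. This sketch assumes "~$1/hour at 100 TPS" means roughly one dollar per hour of continuous generation at 100 tokens per second; the function name is hypothetical and the result is back-of-the-envelope arithmetic, not an official price.

```python
def cost_per_million_tokens(dollars_per_hour, tokens_per_second):
    """Convert an hourly cost at a sustained throughput into dollars per million tokens."""
    tokens_per_hour = tokens_per_second * 3600
    return dollars_per_hour / tokens_per_hour * 1_000_000

cost = cost_per_million_tokens(1.0, 100)
print(f"${cost:.2f} per million tokens")  # 360,000 tokens per hour -> about $2.78
```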

AdinaY

posted an update 26 days ago

RynnBrain 🤖 a physics-aware embodied brain for robots from Alibaba DAMO

https://huggingface.co/collections/Alibaba-DAMO-Academy/rynnbrain

✨ 2B/8B/30B (3B active)
✨ Apache 2.0
✨ Understands egocentric scenes with strong spatial awareness
✨ Tracks objects and motion over time

umarbutler

posted an update 26 days ago

What happens when you annotate, extract, and disambiguate every entity mentioned in the longest U.S. Supreme Court decision in history? What if you then linked those entities to each other and visualized it as a network?

This is the result of enriching all 241 pages and 111,267 words of Dred Scott v. Sandford (1857) with Kanon 2 Enricher in less than ten seconds at the cost of 47 cents.

Dred Scott v. Sandford is the longest U.S. Supreme Court decision by far, and has variously been called "the worst Supreme Court decision ever" and "the Court's greatest self-inflicted wound" due to its denial of the rights of African Americans.

Thanks to Kanon 2 Enricher, we now also know that the case contains 950 numbered paragraphs, 6 footnotes, 178 people mentioned 1,340 times, 99 locations mentioned 1,294 times, and 298 external documents referenced 940 times.

For an American case, there are a decent number of references to British precedents (27 to be exact), including the Magna Carta (¶ 928).

Surprisingly though, the Magna Carta is not the oldest citation referenced. That would be the Institutes of Justinian (¶ 315), dated around 533 CE.

The oldest city mentioned is Rome (founded 753 BCE) (¶ 311), the oldest person is Justinian (born 527 CE) (¶ 314), and the oldest year referenced is 1371, when 'Charles V of France exempted all the inhabitants of Paris from serfdom' (¶ 370).

All this information and more was extracted in 9 seconds. That's how powerful Kanon 2 Enricher, my latest LLM for document enrichment and hierarchical graphitization, is. If you'd like to play with it yourself now that it's available in closed beta, you can apply to the Isaacus Beta Program here: https://isaacus.com/beta.
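
The figures quoted in this post (111,267 words, 9 seconds, 47 cents) imply a throughput and a unit cost that are easy to work out. The snippet below is plain arithmetic on those quoted numbers, assuming the 9 seconds and 47 cents cover the whole document; the variable names are illustrative.

```python
words = 111_267      # word count quoted for Dred Scott v. Sandford
seconds = 9          # enrichment time quoted in the post
cost_dollars = 0.47  # enrichment cost quoted in the post

words_per_second = words / seconds
dollars_per_million_words = cost_dollars / words * 1_000_000

print(f"{words_per_second:,.0f} words per second")
print(f"${dollars_per_million_words:.2f} per million words")
```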

AdinaY

posted an update 26 days ago

Game on 🎮🚀

While Seedance 2.0’s videos are all over the timeline, DeepSeek quietly pushed a new model update in its app.

GLM-5 from Z.ai adds more momentum.

Ming-flash-omni from Ant Group, MiniCPM-SALA from OpenBMB, and the upcoming MiniMax M2.5 keep the heat on 🔥

Spring Festival is around the corner,
no one’s sleeping!

✨ More releases coming, stay tuned
https://huggingface.co/collections/zh-ai-community/2026-february-china-open-source-highlights

AdinaY

posted an update 27 days ago

Ming-flash-omni 2.0 🚀 New open omni-MLLM released by Ant Group

inclusionAI/Ming-flash-omni-2.0

✨ MIT license
✨ MoE - 100B/6B active
✨ Zero-shot voice cloning + controllable audio
✨ Fine-grained visual knowledge grounding

1aurent

authored a paper 27 days ago

Ministral 3

Paper • 2601.08584 • Published Jan 13 • 55

alokabhishek

submitted a paper to Daily Papers 27 days ago

SHARP: Social Harm Analysis via Risk Profiles for Measuring Inequities in Large Language Models

Paper • 2601.21235 • Published Jan 29 • 2

AdinaY

posted an update 28 days ago

LLaDA 2.1 is out 🔥 A new series of MoE diffusion language models released by Ant Group

inclusionAI/LLaDA2.1-mini
inclusionAI/LLaDA2.1-flash

✨ LLaDA2.1-mini: 16B - Apache 2.0
✨ LLaDA2.1-flash: 100B - Apache 2.0
✨ Both deliver editable generation, RL-trained diffusion reasoning, and fast inference

alokabhishek

authored a paper about 1 month ago

SHARP: Social Harm Analysis via Risk Profiles for Measuring Inequities in Large Language Models

Paper • 2601.21235 • Published Jan 29 • 2

AdinaY

posted an update about 1 month ago

AI for science is moving fast 🚀

Intern-S1-Pro 🔬 a MoE multimodal scientific reasoning model from Shanghai AI Lab

internlm/Intern-S1-Pro

✨ 1T total / 22B active
✨ Apache 2.0
✨ SoTA scientific reasoning performance
✨ FoPE enables scalable modeling of long physical time series (10⁰–10⁶)

AdinaY

posted an update about 1 month ago

✨ China’s open source AI ecosystem has entered a new phase

https://huggingface.co/blog/huggingface/one-year-since-the-deepseek-moment-blog-3

One year after the “DeepSeek Moment,” open source has become the default. Models, research, infrastructure, and deployment are increasingly shared to support large-scale, system-level integration.

This final blog examines how leading Chinese AI organizations are evolving, and what this implies for the future of open source.

AdinaY

posted an update about 1 month ago

GLM just entered the OCR field 🔥

zai-org/GLM-OCR

✨ 0.9B
✨ MIT licensed
✨ Multimodal GLM-V architecture
✨ #1 on OmniDocBench v1.5 (94.62)

AdinaY

posted an update about 1 month ago

Step 3.5 Flash 🔥 new foundation model from StepFun ai

https://huggingface.co/collections/stepfun-ai/step-35-flash

✨ Sparse MoE: 196B/11B active
✨ Supports up to 256K context
✨ Multi-token prediction for fast decoding (100–300 tok/s)
✨ Runs locally on consumer hardware

Hugging Face for Legal

AI & ML interests

Recent Activity

Planning in 8 Tokens: A Compact Discrete Tokenizer for Latent World Model

Legal RAG Bench: an end-to-end benchmark for legal RAG

Ministral 3

SHARP: Social Harm Analysis via Risk Profiles for Measuring Inequities in Large Language Models

Team members 95

HFforLegal's activity