nltpt-q

community

AI & ML interests

None defined yet.

Recent Activity

nltpt-q's activity

cfahlgren1ย 
posted an update 2 months ago
view post
Post
2149
If you haven't seen yet, we just released Inference Providers ๐Ÿ”€

> 4 new serverless inference providers on the Hub ๐Ÿคฏ
> Use your HF API key or personal key with all providers ๐Ÿ”‘
> Chat with Deepseek R1, V3, and more on HF Hub ๐Ÿ‹
> We support Sambanova, TogetherAI, Replicate, and Fal.ai ๐Ÿ’ช

Best of all, we don't charge any markup on top of the provider ๐Ÿซฐ Have you tried it out yet? HF Pro accounts get $2 of free usage for the provider inference.
ariG23498ย 
posted an update 3 months ago
view post
Post
2471
Tried my hand at simplifying the derivations of Direct Preference Optimization.

I cover how one can reformulate RLHF into DPO. The idea of implicit reward modeling is chef's kiss.

Blog: https://huggingface.co/blog/ariG23498/rlhf-to-dpo
ariG23498ย 
posted an update 3 months ago
cfahlgren1ย 
posted an update 3 months ago
view post
Post
1764
Wow, I just added Langfuse tracing to the Deepseek Artifacts app and it's really nice ๐Ÿ”ฅ

It allows me to visualize and track more things along with the cfahlgren1/react-code-instructions dataset.

It was just added as a one click Docker Space template, so it's super easy to self host ๐Ÿ’ช
cfahlgren1ย 
posted an update 3 months ago
view post
Post
2257
You'll notice the AI in the SQL Console is much better at working with chatml conversations:

Here's example of unnesting the cfahlgren1/react-code-instructions in less than 10 seconds by asking it. Check it out here: cfahlgren1/react-code-instructions

- "show me the average assistant response length"
- "extract user, system, and assistant messages into separate columns"

It's super easy to work with conversational datasets now with natural language ๐Ÿ—ฃ๏ธ





  • 2 replies
ยท
cfahlgren1ย 
posted an update 3 months ago
reach-vbย 
posted an update 4 months ago
view post
Post
6251
VLMs are going through quite an open revolution AND on-device friendly sizes:

1. Google DeepMind w/ PaliGemma2 - 3B, 10B & 28B: google/paligemma-2-release-67500e1e1dbfdd4dee27ba48

2. OpenGVLabs w/ InternVL 2.5 - 1B, 2B, 4B, 8B, 26B, 38B & 78B: https://huggingface.co/collections/OpenGVLab/internvl-25-673e1019b66e2218f68d7c1c

3. Qwen w/ Qwen 2 VL - 2B, 7B & 72B: Qwen/qwen2-vl-66cee7455501d7126940800d

4. Microsoft w/ FlorenceVL - 3B & 8B: @jiuhai

5. Moondream2 w/ 0.5B: https://huggingface.co/vikhyatk/

What a time to be alive! ๐Ÿ”ฅ
ariG23498ย 
posted an update 4 months ago
cfahlgren1ย 
posted an update 4 months ago
view post
Post
1937
You can just ask things ๐Ÿ—ฃ๏ธ

"show me messages in the coding category that are in the top 10% of reward model scores"

Download really high quality instructions from the Llama3.1 405B synthetic dataset ๐Ÿ”ฅ

argilla/magpie-ultra-v1.0

cfahlgren1ย 
posted an update 4 months ago
view post
Post
3037
We just dropped an LLM inside the SQL Console ๐Ÿคฏ

The amazing, new Qwen/Qwen2.5-Coder-32B-Instruct model can now write SQL for any Hugging Face dataset โœจ

It's 2025, you shouldn't be hand writing SQL! This is a big step in making it where anyone can do in depth analysis on a dataset. Let us know what you think ๐Ÿค—
reach-vbย 
posted an update 4 months ago
view post
Post
4752
Massive week for Open AI/ ML:

Mistral Pixtral & Instruct Large - ~123B, 128K context, multilingual, json + function calling & open weights
mistralai/Pixtral-Large-Instruct-2411
mistralai/Mistral-Large-Instruct-2411

Allen AI Tรผlu 70B & 8B - competive with claude 3.5 haiku, beats all major open models like llama 3.1 70B, qwen 2.5 and nemotron
allenai/tulu-3-models-673b8e0dc3512e30e7dc54f5
allenai/tulu-3-datasets-673b8df14442393f7213f372

Llava o1 - vlm capable of spontaneous, systematic reasoning, similar to GPT-o1, 11B model outperforms gemini-1.5-pro, gpt-4o-mini, and llama-3.2-90B-vision
Xkev/Llama-3.2V-11B-cot

Black Forest Labs Flux.1 tools - four new state of the art model checkpoints & 2 adapters for fill, depth, canny & redux, open weights
reach-vb/black-forest-labs-flux1-6743847bde9997dd26609817

Jina AI Jina CLIP v2 - general purpose multilingual and multimodal (text & image) embedding model, 900M params, 512 x 512 resolution, matroyoshka representations (1024 to 64)
jinaai/jina-clip-v2

Apple AIM v2 & CoreML MobileCLIP - large scale vision encoders outperform CLIP and SigLIP. CoreML optimised MobileCLIP models
apple/aimv2-6720fe1558d94c7805f7688c
apple/coreml-mobileclip

A lot more got released like, OpenScholar (https://huggingface.co/collections/OpenScholar/openscholar-v1-67376a89f6a80f448da411a6), smoltalk ( HuggingFaceTB/smoltalk), Hymba ( nvidia/hymba-673c35516c12c4b98b5e845f), Open ASR Leaderboard ( hf-audio/open_asr_leaderboard) and much more..

Can't wait for the next week! ๐Ÿค—
cfahlgren1ย 
posted an update 5 months ago
view post
Post
924
observers ๐Ÿ”ญ - automatically log all OpenAI compatible requests to a dataset๐Ÿ’ฝ

โ€ข supports any OpenAI compatible endpoint ๐Ÿ’ช
โ€ข supports DuckDB, Hugging Face Datasets, and Argilla as stores

> pip install observers

No complex framework. Just a few lines of code to start sending your traces somewhere. Let us know what you think! @davidberenstein1957 and I will continue iterating!

Here's an example dataset that was logged to Hugging Face from Ollama: cfahlgren1/llama-3.1-awesome-chatgpt-prompts
cfahlgren1ย 
posted an update 5 months ago
view post
Post
878
You can create charts, leaderboards, and filters on top of any Hugging Face dataset in less than a minute

โ€ข ASCII Bar Charts ๐Ÿ“Š
โ€ข Powered by DuckDB WASM โšก
โ€ข Download results to Parquet ๐Ÿ’ฝ
โ€ข Embed and Share results with friends ๐Ÿ“ฌ

Do you have any interesting queries?
cfahlgren1ย 
posted an update 5 months ago
cfahlgren1ย 
posted an update 5 months ago
view post
Post
3229
You can clean and format datasets entirely in the browser with a few lines of SQL.

In this post, I replicate the process @mlabonne used to clean the new microsoft/orca-agentinstruct-1M-v1 dataset.

The cleaning process consists of:
- Joining the separate splits together / add split column
- Converting string messages into list of structs
- Removing empty system prompts

https://huggingface.co/blog/cfahlgren1/the-beginners-guide-to-cleaning-a-dataset

Here's his new cleaned dataset: mlabonne/orca-agentinstruct-1M-v1-cleaned
  • 1 reply
ยท
reach-vbย 
posted an update 5 months ago
view post
Post
4556
What a brilliant week for Open Source AI!

Qwen 2.5 Coder by Alibaba - 0.5B / 1.5B / 3B / 7B / 14B/ 32B (Base + Instruct) Code generation LLMs, with 32B tackling giants like Gemnini 1.5 Pro, Claude Sonnet
Qwen/qwen25-coder-66eaa22e6f99801bf65b0c2f

LLM2CLIP from Microsoft - Leverage LLMs to train ultra-powerful CLIP models! Boosts performance over the previous SOTA by ~17%
microsoft/llm2clip-672323a266173cfa40b32d4c

Athene v2 Chat & Agent by NexusFlow - SoTA general LLM fine-tuned from Qwen 2.5 72B excels at Chat + Function Calling/ JSON/ Agents
Nexusflow/athene-v2-6735b85e505981a794fb02cc

Orca Agent Instruct by Microsoft - 1 million instruct pairs covering text editing, creative writing, coding, reading comprehension, etc - permissively licensed
microsoft/orca-agentinstruct-1M-v1

Ultravox by FixieAI - 70B/ 8B model approaching GPT4o level, pick any LLM, train an adapter with Whisper as Audio Encoder
reach-vb/ultravox-audio-language-model-release-67373b602af0a52b2a88ae71

JanusFlow 1.3 by DeepSeek - Next iteration of their Unified MultiModal LLM Janus with RectifiedFlow
deepseek-ai/JanusFlow-1.3B

Common Corpus by Pleais - 2,003,039,184,047 multilingual, commercially permissive and high quality tokens!
PleIAs/common_corpus

I'm sure I missed a lot, can't wait for the next week!

Put down in comments what I missed! ๐Ÿค—