DeepLearning AI courses

AI & ML interests

None defined yet.

Recent Activity

dlaicourses's activity

multimodalart posted an update 7 months ago
The first open Stable Diffusion 3-like architecture model is JUST out, but it is not SD3!

It is Tencent-Hunyuan/HunyuanDiT by Tencent, a 1.5B-parameter DiT (diffusion transformer) text-to-image model, trained with multilingual CLIP + multilingual T5 text encoders for English and Chinese understanding.

Try it out yourself here: https://huggingface.co/spaces/multimodalart/HunyuanDiT
(a bit slow, as the model is chunky and the research code isn't optimized for inference speed yet)
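
If the Space feels too slow, here is a minimal local sketch using the diffusers integration (assuming your diffusers version ships HunyuanDiTPipeline and that the Tencent-Hunyuan/HunyuanDiT-Diffusers checkpoint is available; this is not the authors' reference inference code):

```python
# Minimal sketch: run HunyuanDiT locally via diffusers.
# Assumes a recent diffusers release that includes HunyuanDiTPipeline and
# that the "Tencent-Hunyuan/HunyuanDiT-Diffusers" checkpoint is accessible.
import torch
from diffusers import HunyuanDiTPipeline

pipe = HunyuanDiTPipeline.from_pretrained(
    "Tencent-Hunyuan/HunyuanDiT-Diffusers", torch_dtype=torch.float16
)
pipe.to("cuda")

# The model was trained with bilingual text encoders, so Chinese prompts work too.
prompt = "一只戴着墨镜的柴犬在冲浪"  # "a shiba inu wearing sunglasses, surfing"
image = pipe(prompt, num_inference_steps=50).images[0]
image.save("hunyuan_dit_sample.png")
```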

In the paper, they claim it is SOTA among open-source models based on human preference evaluation!
philschmid posted an update 9 months ago
New state-of-the-art open LLM! Databricks just released DBRX, a 132B MoE trained on 12T tokens, claiming to surpass OpenAI GPT-3.5 and to be competitive with Google Gemini 1.0 Pro.

TL;DR
- 132B MoE with 16 experts, 4 active during generation
- 32,000-token context window
- Outperforms open LLMs on common benchmarks, including MMLU
- Up to 2x faster inference than Llama 2 70B
- Trained on 12T tokens
- Uses the GPT-4 tokenizer
- Custom license, commercially usable

Collection: databricks/dbrx-6601c0852a0cdd3c59f71962
Demo: databricks/dbrx-instruct
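
For reference, a minimal sketch of running the instruct model with transformers (assuming you have access to the databricks/dbrx-instruct repo and enough GPU memory; at 132B parameters the MoE needs several large GPUs even in bfloat16, and this is not Databricks' official snippet):

```python
# Minimal sketch: generate with DBRX Instruct via transformers.
# Assumes access to the gated "databricks/dbrx-instruct" repo and a multi-GPU
# node; the 132B-parameter MoE does not fit on a single consumer GPU.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "databricks/dbrx-instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",  # shard across available GPUs
    trust_remote_code=True,
)

messages = [{"role": "user", "content": "Why can MoE models be fast at inference?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=200)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```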

Kudos to the team at Databricks and MosaicML for this strong release in the open community!
multimodalart posted an update 10 months ago
The Stable Diffusion 3 research paper broken down, including some overlooked details!

Model
- 2 base model variants mentioned: 2B and 8B sizes

- New architecture at all abstraction levels:
  - UNet out, Multimodal Diffusion Transformer in; bye, cross-attention
  - Rectified flows for the diffusion process (a minimal sketch follows below)
  - Still a Latent Diffusion Model

- 3 text encoders: 2 CLIPs, one T5-XXL; plug-and-play: removing the larger one maintains competitiveness

- Dataset was deduplicated with SSCD, which helped with memorization (no more details about the dataset, though)

Variants
- A DPO fine-tuned model showed great improvement in prompt understanding and aesthetics
- An Instruct Edit 2B model was trained and learned how to do text replacement

Results
- State of the art in automated evals for composition and prompt understanding
- Best win rate in human preference evaluation for prompt understanding, aesthetics, and typography (missing some details on the number of participants and the design of the experiment)

Paper: https://stabilityai-public-packages.s3.us-west-2.amazonaws.com/Stable+Diffusion+3+Paper.pdf
multimodalart posted an update 11 months ago
It seems February started with a fully open-source AI renaissance.

Models released with fully open datasets, training code, and weights:

LLM - allenai/olmo-suite-65aeaae8fe5b6b2122b46778
Embedding - nomic-ai/nomic-embed-text-v1 (SOTA!)

And it's literally February 1st - can't wait to see what else the community will bring.
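
For the embedding model, a minimal usage sketch with sentence-transformers (the trust_remote_code flag and the search_query/search_document task prefixes follow the model card; treat the exact prefixes as an assumption and check the card for your version):

```python
# Minimal sketch: embed text with nomic-embed-text-v1 via sentence-transformers.
# The checkpoint ships custom modeling code, hence trust_remote_code=True; the
# "search_query:" / "search_document:" prefixes follow the model card.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("nomic-ai/nomic-embed-text-v1", trust_remote_code=True)

query = "search_query: open source embedding models"
docs = [
    "search_document: Nomic Embed is a fully open text embedding model.",
    "search_document: OLMo is an open language model from AI2.",
]

query_emb = model.encode(query)
doc_embs = model.encode(docs)
print(util.cos_sim(query_emb, doc_embs))  # similarity of the query to each document
```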
philschmid posted an update 11 months ago
What's the best way to fine-tune open LLMs in 2024? Look no further! I am excited to share "How to Fine-Tune LLMs in 2024 with Hugging Face", using the latest research techniques, including Flash Attention, Q-LoRA, the OpenAI dataset format (messages), ChatML, and packing, all built with Hugging Face TRL.

It is created for consumer-size GPUs (24GB), covering the full end-to-end lifecycle:
- Define and understand use cases for fine-tuning
- Set up the development environment
- Create and prepare the dataset (OpenAI format)
- Fine-tune the LLM using TRL and the SFTTrainer
- Test and evaluate the LLM
- Deploy to production with TGI

https://www.philschmid.de/fine-tune-llms-in-2024-with-trl
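
As a taste of the guide's core step, a condensed Q-LoRA + SFTTrainer sketch (not the blog's exact code; the base model, dataset, and hyperparameters are placeholders, and the SFTTrainer keyword arguments vary between TRL versions):

```python
# Minimal sketch: QLoRA supervised fine-tuning with TRL's SFTTrainer.
# Base model, dataset, and hyperparameters are placeholders; the exact
# SFTTrainer signature varies across TRL versions, so check your docs.
import torch
from datasets import load_dataset
from peft import LoraConfig
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig, TrainingArguments
from trl import SFTTrainer

model_id = "mistralai/Mistral-7B-v0.1"                                  # example base model
dataset = load_dataset("timdettmers/openassistant-guanaco", split="train")  # example dataset with a "text" column

# 4-bit NF4 quantization so the base model fits on a 24GB GPU.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    model_id, quantization_config=bnb_config, device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_id)
tokenizer.pad_token = tokenizer.eos_token

# LoRA adapters are the only trainable parameters.
peft_config = LoraConfig(r=16, lora_alpha=32, lora_dropout=0.05, task_type="CAUSAL_LM")

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",
    peft_config=peft_config,
    max_seq_length=2048,
    packing=True,  # pack short samples into full-length sequences
    args=TrainingArguments(
        output_dir="sft-out",
        per_device_train_batch_size=2,
        gradient_accumulation_steps=4,
        num_train_epochs=1,
        learning_rate=2e-4,
        bf16=True,
        logging_steps=10,
    ),
)
trainer.train()
```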

Coming soon: advanced guides for multi-GPU/multi-node full fine-tuning and alignment using DPO & KTO.