Article Introducing Three New Serverless Inference Providers: Hyperbolic, Nebius AI Studio, and Novita 🔥 23 days ago • 93
Article π0 and π0-FAST: Vision-Language-Action Models for General Robot Control Feb 4 • 111
Meta Llama 3 Collection This collection hosts the transformers and original repos of the Meta Llama 3 and Llama Guard 2 releases • 5 items • Updated Dec 6, 2024 • 718
Idefics2 🐶 Collection Idefics2-8B is a foundation vision-language model. In this collection, you will find the models, datasets and demo related to its creation. • 11 items • Updated May 6, 2024 • 91
MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training Paper • 2403.09611 • Published Mar 14, 2024 • 126
StarCoder 2 and The Stack v2: The Next Generation Paper • 2402.19173 • Published Feb 29, 2024 • 138
Masked Audio Generation using a Single Non-Autoregressive Transformer Paper • 2401.04577 • Published Jan 9, 2024 • 43
MAGNeT Collection Masked Audio Generation using a Single Non-Autoregressive Transformer • 9 items • Updated Apr 4, 2024 • 40
QuIP: 2-Bit Quantization of Large Language Models With Guarantees Paper • 2307.13304 • Published Jul 25, 2023 • 2
DreamTalk: When Expressive Talking Head Generation Meets Diffusion Probabilistic Models Paper • 2312.09767 • Published Dec 15, 2023 • 27
Improving Text Embeddings with Large Language Models Paper • 2401.00368 • Published Dec 31, 2023 • 80
Repurposing Diffusion-Based Image Generators for Monocular Depth Estimation Paper • 2312.02145 • Published Dec 4, 2023 • 6
Notus 7B v1 Collection Notus 7B v1 models (DPO fine-tune of Zephyr SFT) and datasets used. More information at https://github.com/argilla-io/notus • 11 items • Updated Dec 11, 2024 • 18
ZeroGPU Spaces Collection ZeroGPU Spaces made by the community • 17 items • Updated Jun 6, 2024 • 236
Mamba: Linear-Time Sequence Modeling with Selective State Spaces Paper • 2312.00752 • Published Dec 1, 2023 • 143
Positional Description Matters for Transformers Arithmetic Paper • 2311.14737 • Published Nov 22, 2023 • 2
Thinking Fast and Slow in Large Language Models Paper • 2212.05206 • Published Dec 10, 2022 • 1