Arunkumar Venkataramanan

ArunkumarVR

https://arunkumarramanan.github.io

AI & ML interests

AGI Research: Reasoning, Safety & Alignment (Superalignment), Generative AI (GenAI), Multi-Modal Foundation Models (FMs), Large Language Models (LLMs), Transformers & Diffusion Models, Open LLM Training, Optimization & Finetuning, Serving & Inference

ArunkumarVR's activity

upvoted a collection 5 months ago

PaliGemma Release

Pretrained and mix checkpoints for PaliGemma • 16 items • Updated Jul 31 • 136

upvoted 2 articles 5 months ago

PaliGemma – Google's Cutting-Edge Open Vision Language Model

Article • Published May 14 • 201

Making thousands of open LLMs bloom in the Vertex AI Model Garden

Article • Published Apr 10 • 18

upvoted a collection 5 months ago

[lecture artifacts] aligning open language models

artifacts referenced in the talk timeline! Slides: https://docs.google.com/presentation/d/1quMyI4BAx4rvcDfk8jjv063bmHg4RxZd9mhQloXpMn0/edit?usp=sharin • 63 items • Updated Apr 17 • 56

upvoted 2 articles 6 months ago

Welcome Llama 3 - Meta's new open LLM

Article • Published Apr 18 • 273

Fine-tune Llama 3 with ORPO

Article • Published Apr 22 • 221

upvoted a collection 6 months ago

Meta Llama 3

This collection hosts the transformers and original repos of the Meta Llama 3 and Llama Guard 2 releases • 5 items • Updated 9 days ago • 676

upvoted an article 6 months ago

CodeGemma - an official Google release for code LLMs

Article • Published Apr 9 • 99

upvoted 2 collections 6 months ago

MoEs papers reading list

58 items • Updated about 20 hours ago • 133

DBRX

DBRX is a mixture-of-experts (MoE) large language model trained from scratch by Databricks. • 3 items • Updated Mar 27 • 90

upvoted a paper 7 months ago

The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits

Paper • 2402.17764 • Published Feb 27 • 592

upvoted 3 collections 7 months ago

💫 StarCoder2

StarCoder2 models and datasets! • 8 items • Updated Mar 1 • 80

OpenCodeInterpreter

18 items • Updated Mar 3 • 82

OpenMath

A collection of models and datasets introduced in "OpenMathInstruct-1: A 1.8 Million Math Instruction Tuning Dataset" • 15 items • Updated 4 days ago • 35

upvoted 3 papers 7 months ago

Aya Dataset: An Open-Access Collection for Multilingual Instruction Tuning

Paper • 2402.06619 • Published Feb 9 • 52

Attention Is All You Need

Paper • 1706.03762 • Published Jun 12, 2017 • 41

Neural Network Diffusion

Paper • 2402.13144 • Published Feb 20 • 94

upvoted 2 collections 8 months ago

Whisper Release

Whisper includes both English-only and multilingual checkpoints for ASR and ST, ranging from 38M params for the tiny models to 1.5B params for large. • 12 items • Updated Sep 13, 2023 • 79

Gemma release

Groups the Gemma models released by the Google team. • 40 items • Updated Jul 31 • 325

upvoted 2 papers 8 months ago

DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models

Paper • 2402.03300 • Published Feb 5 • 67

Self-Discover: Large Language Models Self-Compose Reasoning Structures

Paper • 2402.03620 • Published Feb 6 • 109

upvoted a collection 8 months ago

Leaderboards and benchmarks ✨

Cool leaderboard spaces collection for models across modalities! Text, vision, audio, ... • 70 items • Updated 3 days ago • 84

upvoted a paper 8 months ago

StepCoder: Improve Code Generation with Reinforcement Learning from Compiler Feedback

Paper • 2402.01391 • Published Feb 2 • 41

upvoted 6 collections 8 months ago

DPO vs KTO vs IPO

A collection of datasets and models used for the "Aligning LLMs with Direct Preference Optimization Methods" blog post • 2 items • Updated Jan 16 • 11

Handbook v0.1 models and datasets

Models and datasets for v0.1 of the alignment handbook • 6 items • Updated Nov 10, 2023 • 24

Constitutional AI

A collection of datasets and models that accompany the Constitutional AI recipe. See hf.co/blog/constitutional-ai for more details. • 9 items • Updated Feb 1 • 5

Paloma

Dataset and baseline models for Paloma, a benchmark of language model fit to 546 textual domains • 8 items • Updated 8 days ago • 13

Tulu V2 Suite

The set of models associated with the paper "Camels in a Changing Climate: Enhancing LM Adaptation with Tulu 2" • 19 items • Updated 10 days ago • 43

OLMo Suite

Artifacts for the first set of OLMo models. • 18 items • Updated 10 days ago • 57