
Xingye

PlanetMoon


AI & ML interests

Time series, Foundation Model, Machine Learning, Artificial Intelligence.


PlanetMoon's activity

upvoted a paper about 2 months ago

Language Model Can Listen While Speaking

Paper • 2408.02622 • Published Aug 5 • 37

upvoted a paper 3 months ago

Stable Audio Open

Paper • 2407.14358 • Published Jul 19 • 22

upvoted a collection 3 months ago

BigVGAN

Collection

BigVGAN is a universal neural vocoder that generates audio waveform using mel spectrogram as input. • 11 items • Updated 4 days ago • 9

upvoted a paper 3 months ago

FunAudioLLM: Voice Understanding and Generation Foundation Models for Natural Interaction Between Humans and LLMs

Paper • 2407.04051 • Published Jul 4 • 35

upvoted an article 4 months ago

Let's talk about LLM evaluation

Article • May 23 • 108

upvoted a collection 5 months ago

Standard-format-preference-dataset

Collection

We collect the open-source datasets and process them into the standard format. • 14 items • Updated May 8 • 19

upvoted a paper 5 months ago

FlashSpeech: Efficient Zero-Shot Speech Synthesis

Paper • 2404.14700 • Published Apr 23 • 29

upvoted an article 5 months ago

Introducing the Open Chain of Thought Leaderboard

Article • Apr 23 • 23

upvoted 2 papers 7 months ago

NaturalSpeech 3: Zero-Shot Speech Synthesis with Factorized Codec and Diffusion Models

Paper • 2403.03100 • Published Mar 5 • 34

Gen4Gen: Generative Data Pipeline for Generative Multi-Concept Composition

Paper • 2402.15504 • Published Feb 23 • 21

upvoted a collection 10 months ago

Seamless Communication

Collection

A significant step towards removing language barriers through expressive, fast and high-quality AI translation. • 16 items • Updated Jan 16 • 146

upvoted a paper 12 months ago

Vision Transformers Need Registers

Paper • 2309.16588 • Published Sep 28, 2023 • 77

upvoted 39 papers about 1 year ago

Low-rank Adaptation of Large Language Model Rescoring for Parameter-Efficient Speech Recognition

Paper • 2309.15223 • Published Sep 26, 2023 • 19

Emu: Enhancing Image Generation Models Using Photogenic Needles in a Haystack

Paper • 2309.15807 • Published Sep 27, 2023 • 32

LAVIE: High-Quality Video Generation with Cascaded Latent Diffusion Models

Paper • 2309.15103 • Published Sep 26, 2023 • 42

A Paradigm Shift in Machine Translation: Boosting Translation Performance of Large Language Models

Paper • 2309.11674 • Published Sep 20, 2023 • 31

LLM-Grounder: Open-Vocabulary 3D Visual Grounding with Large Language Model as an Agent

Paper • 2309.12311 • Published Sep 21, 2023 • 17

LMSYS-Chat-1M: A Large-Scale Real-World LLM Conversation Dataset

Paper • 2309.11998 • Published Sep 21, 2023 • 24

Q-Transformer: Scalable Offline Reinforcement Learning via Autoregressive Q-Functions

Paper • 2309.10150 • Published Sep 18, 2023 • 24

CulturaX: A Cleaned, Enormous, and Multilingual Dataset for Large Language Models in 167 Languages

Paper • 2309.09400 • Published Sep 17, 2023 • 82

Text-Guided Generation and Editing of Compositional 3D Avatars

Paper • 2309.07125 • Published Sep 13, 2023 • 6

MADLAD-400: A Multilingual And Document-Level Large Audited Dataset

Paper • 2309.04662 • Published Sep 9, 2023 • 22

When Less is More: Investigating Data Pruning for Pretraining LLMs at Scale

Paper • 2309.04564 • Published Sep 8, 2023 • 15

Neurons in Large Language Models: Dead, N-gram, Positional

Paper • 2309.04827 • Published Sep 9, 2023 • 16

NExT-GPT: Any-to-Any Multimodal LLM

Paper • 2309.05519 • Published Sep 11, 2023 • 78

Textbooks Are All You Need II: phi-1.5 technical report

Paper • 2309.05463 • Published Sep 11, 2023 • 86

Large-Scale Automatic Audiobook Creation

Paper • 2309.03926 • Published Sep 7, 2023 • 53

GPT Can Solve Mathematical Problems Without a Calculator

Paper • 2309.03241 • Published Sep 6, 2023 • 17

CausalLM is not optimal for in-context learning

Paper • 2308.06912 • Published Aug 14, 2023 • 18

Jurassic World Remake: Bringing Ancient Fossils Back to Life via Zero-Shot Long Image-to-Image Translation

Paper • 2308.07316 • Published Aug 14, 2023 • 6

SpeechX: Neural Codec Language Model as a Versatile Speech Transformer

Paper • 2308.06873 • Published Aug 14, 2023 • 25

PIPPA: A Partially Synthetic Conversational Dataset

Paper • 2308.05884 • Published Aug 11, 2023 • 29

Composable Function-preserving Expansions for Transformer Architectures

Paper • 2308.06103 • Published Aug 11, 2023 • 19

Pre-Trained Large Language Models for Industrial Control

Paper • 2308.03028 • Published Aug 6, 2023 • 6

DeepSpeed-Chat: Easy, Fast and Affordable RLHF Training of ChatGPT-like Models at All Scales

Paper • 2308.01320 • Published Aug 2, 2023 • 44

RT-2: Vision-Language-Action Models Transfer Web Knowledge to Robotic Control

Paper • 2307.15818 • Published Jul 28, 2023 • 27

Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback

Paper • 2307.15217 • Published Jul 27, 2023 • 36

Med-Flamingo: a Multimodal Medical Few-shot Learner

Paper • 2307.15189 • Published Jul 27, 2023 • 22

Towards Generalist Biomedical AI

Paper • 2307.14334 • Published Jul 26, 2023 • 12

Brain2Music: Reconstructing Music from Human Brain Activity

Paper • 2307.11078 • Published Jul 20, 2023 • 41

Meta-Transformer: A Unified Framework for Multimodal Learning

Paper • 2307.10802 • Published Jul 20, 2023 • 43

How is ChatGPT's behavior changing over time?

Paper • 2307.09009 • Published Jul 18, 2023 • 23

Llama 2: Open Foundation and Fine-Tuned Chat Models

Paper • 2307.09288 • Published Jul 18, 2023 • 240

Does Circuit Analysis Interpretability Scale? Evidence from Multiple Choice Capabilities in Chinchilla

Paper • 2307.09458 • Published Jul 18, 2023 • 10

Mega-TTS 2: Zero-Shot Text-to-Speech with Arbitrary Length Speech Prompts

Paper • 2307.07218 • Published Jul 14, 2023 • 26

Copy Is All You Need

Paper • 2307.06962 • Published Jul 13, 2023 • 33

upvoted a paper over 1 year ago

ChatGPT for Robotics: Design Principles and Model Abilities

Paper • 2306.17582 • Published Feb 20, 2023 • 10