WizardLM Team

community

WizardLM_AI

nlpxucan/WizardLM

Activity Feed

AI & ML interests

Large Language Models

WizardLMTeam's activity

Ziyang

authored a paper about 1 month ago

VideoAutoArena: An Automated Arena for Evaluating Large Multimodal Models in Video Analysis through User Simulation

Paper • 2411.13281 • Published Nov 20 • 17

wenxcs

authored a paper 3 months ago

GRIN: GRadient-INformed MoE

Paper • 2409.12136 • Published Sep 18 • 15

WizardLM

posted an update 6 months ago

🔥🔥🔥
Excited to announce WizardLM's new paper: Auto Evol-Instruct!

🐦 Twitter: https://x.com/WizardLM_AI/status/1812857977122202087

📃 Paper: https://arxiv.org/pdf/2406.00770

🤖 1. Fully AI-Powered Pipeline

Auto Evol-Instruct automatically and iteratively optimizes an initial evolving method, Evol-Instruct V1, into an optimal one. The pipeline consists of two critical stages: Evol Trajectory Analysis, where the optimizer LLM analyzes the issues and failures exposed in the instruction evolution performed by the evol LLM, and Evolving Method Optimization, where the optimizer LLM addresses these issues to progressively develop an effective evolving method. The optimal evolving method is then used to convert the entire instruction dataset into more diverse and complex forms, facilitating improved instruction tuning.
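The two-stage loop described above can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the function names (`evolve`, `optimize_method`), the prompt wording, and the fixed step count are all assumptions, and both LLMs are modeled as plain string-to-string callables. Any selection among candidate methods is omitted.

```python
from typing import Callable, List

# Hypothetical stand-in: an "LLM" is any function mapping a prompt to a reply.
LLM = Callable[[str], str]


def evolve(evol_llm: LLM, method: str, instructions: List[str]) -> List[str]:
    """Apply the current evolving method to each instruction via the evol LLM."""
    return [evol_llm(f"{method}\n\nInstruction: {ins}") for ins in instructions]


def optimize_method(
    evol_llm: LLM,
    optimizer_llm: LLM,
    initial_method: str,
    batch: List[str],
    steps: int = 3,  # assumed fixed budget; the real stopping rule may differ
) -> str:
    """Iteratively refine an evolving method (prompts here are illustrative)."""
    method = initial_method
    for _ in range(steps):
        evolved = evolve(evol_llm, method, batch)
        trajectory = "\n".join(f"{a} -> {b}" for a, b in zip(batch, evolved))

        # Stage 1: Evol Trajectory Analysis. The optimizer LLM inspects the
        # evolutions and reports the issues and failures it finds.
        feedback = optimizer_llm(
            f"Identify issues in these evolutions:\n{trajectory}"
        )

        # Stage 2: Evolving Method Optimization. The optimizer LLM rewrites
        # the method to address the reported issues.
        method = optimizer_llm(
            "Revise the method to fix the issues.\n"
            f"Method: {method}\nIssues: {feedback}"
        )
    return method
```

Once `optimize_method` returns, the resulting method would be passed to `evolve` over the full instruction dataset, matching the final step described in the post.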

📈2. Scaling Evol-Instruct with Arena Learning

With Auto Evol-Instruct, the evolutionary synthesis data of WizardLM-2 has scaled up from that of WizardLM-1 to cover dozens of domains, spanning tasks across all aspects of large language models. This allows Arena Learning to train on an almost infinite pool of high-difficulty instruction data, fully unlocking its potential.

Ziyang

authored 4 papers 7 months ago

VideoLLaMA 2: Advancing Spatial-Temporal Modeling and Audio Understanding in Video-LLMs

Paper • 2406.07476 • Published Jun 11 • 32

CodeHalu: Code Hallucinations in LLMs Driven by Execution-based Verification

Paper • 2405.00253 • Published Apr 30

CofiPara: A Coarse-to-fine Paradigm for Multimodal Sarcasm Target Identification with Large Multimodal Models

Paper • 2405.00390 • Published May 1

Towards Explainable Harmful Meme Detection through Multimodal Debate between Large Language Models

Paper • 2401.13298 • Published Jan 24

WizardLM

updated a Space 8 months ago

Running

🦀

README

nlpxucan

authored 2 papers 8 months ago

WizardMath: Empowering Mathematical Reasoning for Large Language Models via Reinforced Evol-Instruct

Paper • 2308.09583 • Published Aug 18, 2023 • 7

WizardLM: Empowering Large Language Models to Follow Complex Instructions

Paper • 2304.12244 • Published Apr 24, 2023 • 13

Ziyang

authored a paper 9 months ago

MMCode: Evaluating Multi-Modal Code Large Language Models with Visually Rich Programming Problems

Paper • 2404.09486 • Published Apr 15 • 1

WizardLM

posted an update 9 months ago

🔥🔥🔥 Introducing WizardLM-2!

📙Release Blog: https://wizardlm.github.io/WizardLM2
✅Model Weights: microsoft/wizardlm-661d403f71e6c8257dbd598a
🐦Twitter: https://twitter.com/WizardLM_AI/status/1779899325868589372

We introduce and open-source WizardLM-2, our next-generation state-of-the-art large language models, which have improved performance on complex chat, multilingual, reasoning, and agent tasks. The new family includes three cutting-edge models: WizardLM-2 8x22B, WizardLM-2 70B, and WizardLM-2 7B.

WizardLM-2 8x22B is our most advanced model, and the best open-source LLM in our internal evaluation on highly complex tasks. WizardLM-2 70B reaches top-tier reasoning capabilities and is the first choice at its size. WizardLM-2 7B is the fastest and achieves performance comparable to leading open-source models 10x larger.

🤗 WizardLM-2 Capabilities:

1. MT-Bench (Figure 1)
WizardLM-2 8x22B demonstrates highly competitive performance even compared to the most advanced proprietary models such as GPT-4-Turbo and Claude-3. Meanwhile, WizardLM-2 7B and WizardLM-2 70B are the top-performing models among the leading baselines at the 7B to 70B model scales.

2. Human Preferences Evaluation (Figure 2)
In this human preferences evaluation, WizardLM-2's capabilities are very close to cutting-edge proprietary models such as GPT-4-1106-preview, and significantly ahead of all other open-source models.

🔍Method Overview:
As human-generated data from the natural world is increasingly exhausted by LLM training, we believe that data carefully created by AI, together with models supervised step by step by AI, will be the sole path towards more powerful AI.

Over the past year, we built a fully AI-powered synthetic training system (as shown in Figure 3).