ARWKV🪿

Paper Link👁️ | GitHub

ARWKV-7B-GATE-MLP (Preview 0.1)

ARWKV Hybrid Architecture

Preview version with RWKV-7 time mixing and Transformer MLP

📌 Overview

ALL YOU NEED IS RWKV

This is an early preview of our 7B-parameter RNN-based model, trained at a 2k context length (only stage 2 applied, without SFT or DPO) through three-stage knowledge distillation from Qwen2.5-7B-Instruct. While this is a foundational version, it already demonstrates:

  • ✅ RWKV-7's efficient recurrence mechanism
  • ✅ No self-attention, fully O(n)
  • ✅ Constant VRAM usage
  • ✅ Single-GPU trainability

Roadmap Notice: We will soon open-source enhanced versions with:

  • 🚀 16k+ context capability
  • 🧮 Math-specific improvements
  • 📚 RL-enhanced reasoning model

How to use

pip3 install --upgrade rwkv-fla transformers

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "RWKV-Red-Team/ARWKV-7B-Preview-0.1",
    device_map="auto",           # spread layers across available devices
    torch_dtype=torch.float16,   # FP16 inference recommended (see Key Features)
    trust_remote_code=True,      # required: the model ships custom RWKV-7 code
)
tokenizer = AutoTokenizer.from_pretrained(
    "RWKV-Red-Team/ARWKV-7B-Preview-0.1"
)
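
Once loaded, generation follows the standard transformers chat workflow. A minimal sketch (the prompt and sampling settings below are illustrative choices, not recommendations from the model authors):

messages = [{"role": "user", "content": "Explain state recurrence in one paragraph."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output_ids = model.generate(input_ids, max_new_tokens=256, do_sample=True, temperature=0.7)
# decode only the newly generated tokens, skipping the prompt
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))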

🔑 Key Features

Component       | Specification                                    | Note
----------------|--------------------------------------------------|----------------------
Architecture    | RWKV-7 TimeMix + SwiGLU                          | Hybrid design
Context Window  | 2048 tokens (training CTX)                       | Preview limitation
Training Tokens | 40M                                              | Distillation-focused
Precision       | FP16 inference recommended (16 GB VRAM required) | 15%↑ vs BF16

🏗️ Architecture Highlights

Core Modification Flow

Qwen2.5 Decoder Layer:
- Grouped Query Attention
+ RWKV-7 Time Mixing (Eq.3)
- RoPE Positional Encoding
+ State Recurrence
= Hybrid Layer Output
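
As a rough illustration of the swap above, here is a minimal sketch of one hybrid layer. The class names, the simplified diagonal recurrence, and the LayerNorm placement are assumptions for illustration only; the real RWKV-7 time-mixing kernel in the model's custom code is considerably more elaborate:

import torch
import torch.nn as nn
import torch.nn.functional as F

class SimpleTimeMix(nn.Module):
    """Toy stand-in for RWKV-7 time mixing: a gated, per-channel
    linear recurrence. O(T) time, fixed-size state, no attention matrix."""
    def __init__(self, d: int):
        super().__init__()
        self.r = nn.Linear(d, d, bias=False)        # receptance (output gate)
        self.k = nn.Linear(d, d, bias=False)        # key
        self.v = nn.Linear(d, d, bias=False)        # value
        self.decay = nn.Parameter(torch.zeros(d))   # learned per-channel decay
        self.out = nn.Linear(d, d, bias=False)

    def forward(self, x, state):
        # x: (B, T, d); state: (B, d), carried across calls
        r = torch.sigmoid(self.r(x))
        k, v = self.k(x), self.v(x)
        w = torch.sigmoid(self.decay)               # decay in (0, 1)
        outs = []
        for t in range(x.shape[1]):                 # recurrence replaces self-attention;
            state = w * state + k[:, t] * v[:, t]   # token order lives in the state,
            outs.append(r[:, t] * state)            # so no RoPE is needed
        return self.out(torch.stack(outs, dim=1)), state

class SwiGLU(nn.Module):
    """Transformer MLP kept from the Qwen2.5 layer."""
    def __init__(self, d: int, d_ffn: int):
        super().__init__()
        self.gate = nn.Linear(d, d_ffn, bias=False)
        self.up = nn.Linear(d, d_ffn, bias=False)
        self.down = nn.Linear(d_ffn, d, bias=False)

    def forward(self, x):
        return self.down(F.silu(self.gate(x)) * self.up(x))

class HybridLayer(nn.Module):
    """Decoder layer with grouped-query attention swapped for time mixing."""
    def __init__(self, d: int, d_ffn: int):
        super().__init__()
        self.norm1, self.norm2 = nn.LayerNorm(d), nn.LayerNorm(d)  # RMSNorm in Qwen2.5
        self.time_mix = SimpleTimeMix(d)
        self.mlp = SwiGLU(d, d_ffn)

    def forward(self, x, state):
        h, state = self.time_mix(self.norm1(x), state)
        x = x + h
        return x + self.mlp(self.norm2(x)), state

# usage: state is a fixed-size (B, d) tensor, which is why VRAM stays
# constant regardless of sequence length
layer = HybridLayer(d=64, d_ffn=256)
y, s = layer(torch.randn(1, 8, 64), torch.zeros(1, 64))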