webiraiz
AI & ML interests
None yet
Recent Activity
upvoted a paper about 21 hours ago
Conditional Equivalence of DPO and RLHF: Implicit Assumption, Failure Modes, and Provable Alignment
upvoted a paper 6 days ago
Adaptive Teacher Exposure for Self-Distillation in LLM Reasoning
Organizations
None yet