Stephen Fernandes

StephennFernandes

AI & ML interests

Natural Language Processing, Reinforcement Learning

Recent Activity

Organizations

Speech Recognition Community Event Version 2

StephennFernandes's activity

reacted to wassemgtk's post with 👀 2 days ago
I’ve been diving into the iRoPE architecture from Llama 4—a game-changer for long-context models! It interleaves local attention (with RoPE) for short contexts and global attention (with inference-time temp scaling) for long-range reasoning, aiming for infinite context. I’m going to try writing iRoPE—who wants to help?

Code: https://github.com/wassemgtk/iRoPE-try/blob/main/iRoPE.ipynb
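
To make the interleaving idea in the post concrete, here is a minimal NumPy sketch. It is not the code from the linked notebook and not Llama 4's implementation. Everything specific in it is an assumption for illustration: the function names (`rope`, `causal_mask`, `irope_stack`), the window size, the choice of one global layer every four layers, and the temperature formula `1 + alpha * log(1 + pos / floor)`. Learned projections, multiple heads, and MLP blocks are left out so that only the attention pattern is visible.

```python
import numpy as np

def rope(x, base=10000.0):
    """Rotary position embedding for x of shape (seq, dim), dim even."""
    seq, dim = x.shape
    half = dim // 2
    freqs = base ** (-np.arange(half) / half)            # (half,)
    angles = np.arange(seq)[:, None] * freqs[None, :]    # (seq, half)
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[:, :half], x[:, half:]
    return np.concatenate([x1 * cos - x2 * sin, x1 * sin + x2 * cos], axis=1)

def causal_mask(seq, window=None):
    """True where query i may attend to key j; optional sliding window."""
    i = np.arange(seq)[:, None]
    j = np.arange(seq)[None, :]
    mask = j <= i
    if window is not None:
        mask &= (i - j) < window
    return mask

def attention(q, k, v, mask, scale):
    """Masked softmax attention; scale may be a scalar or a (seq, 1) array."""
    scores = (q @ k.T) * scale
    scores = np.where(mask, scores, -np.inf)
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

def irope_stack(x, n_layers=4, window=4, global_every=4, alpha=0.1, floor=8.0):
    """Interleave local RoPE layers with global position-free layers.

    All hyperparameters here are hypothetical placeholders.
    """
    seq, dim = x.shape
    base_scale = 1.0 / np.sqrt(dim)
    pos = np.arange(seq, dtype=float)
    for layer in range(n_layers):
        if (layer + 1) % global_every == 0:
            # Global layer: no positional encoding, full causal mask, and an
            # assumed position-dependent temperature on each query row.
            temp = 1.0 + alpha * np.log1p(pos / floor)
            out = attention(x, x, x, causal_mask(seq), base_scale * temp[:, None])
        else:
            # Local layer: RoPE on queries and keys, sliding-window mask.
            qk = rope(x)
            out = attention(qk, qk, x, causal_mask(seq, window), base_scale)
        x = x + out  # residual connection
    return x

x = np.random.default_rng(0).normal(size=(6, 8))
print(irope_stack(x).shape)
```

In this sketch the local layers only ever see the last `window` tokens, so their cost does not grow with context length, while the occasional global layer carries long-range information without relying on position encodings it was never trained on at that length.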
New activity in ai4bharat/sangraha 12 months ago
New activity in ylacombe/w2v-bert-2.0 12 months ago
New activity in ai4bharat/sangraha about 1 year ago