Mohammed Hamdy

mmhamdy

AI & ML interests

TechBio | AI4Sci | NLP | Reinforcement Learning

Recent Activity

Organizations

Massive Text Embedding Benchmark's profile picture Blog-explorers's profile picture Hugging Face for Computer Vision's profile picture ASAS AI's profile picture ZeroGPU Explorers's profile picture Social Post Explorers's profile picture C4AI Community's profile picture M4-ai's profile picture LLMem's profile picture Hugging Face Discord Community's profile picture open/ acc's profile picture Data Is Better Together Contributor's profile picture MOTH Lab's profile picture

Posts 9

view post
Post
1511
What inspired the Transformer architecture in the "Attention Is All You Need" paper? And how were various ideas combined to create this groundbreaking model?

In this lengthy article, I explore the story and the origins of some of the ideas introduced in the paper. We'll explore everything from the fundamental attention mechanism that lies at its heart to the surprisingly simple explanation for its name, Transformer.

šŸ’” Examples of ideas explored in the article:

āœ… What was the inspiration for the attention mechanism?
āœ… How did we go from attention to self-attention?
āœ… Did the team have any other names in mind for the model?

and more...

I aim to tell the story of Transformers as I would have wanted to read it, and hopefully, one that appeals to others interested in the details of this fascinating idea. This narrative draws from video interviews, lectures, articles, tweets/Xs, and some digging into the literature. I have done my best to be accurate, but errors are possible. If you find inaccuracies or have any additions, please do reach out, and I will gladly make the necessary updates.

Read the article: https://huggingface.co/blog/mmhamdy/pandemonium-the-transformers-story

Articles 2

Article
4

Pandemonium: The Transformers Story