takarajordan posted an update 4 days ago
🎌 Two months in, https://github.com/takara-ai/go-attention has passed 429 stars on GitHub.

We built this library at takara.ai to bring attention mechanisms and transformer layers to Go, in a form that's lightweight, clean, and dependency-free.

We’re proud to say that every part of this project reflects what we set out to do.

- Pure Go: no external dependencies, built entirely on the Go standard library
- Core support for DotProductAttention and MultiHeadAttention (a pure-Go sketch of the idea follows after this list)
- Full transformer layers with LayerNorm, feed-forward networks, and residual connections
- Designed for edge, embedded, and real-time environments where simplicity and performance matter
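
For a sense of the core primitive, here is a minimal sketch of scaled dot-product attention in pure Go using only the standard library. The function names below are illustrative, not the library's actual API:

```go
package main

import (
	"fmt"
	"math"
)

// softmax normalizes a slice of scores into a probability distribution.
func softmax(xs []float64) []float64 {
	max := math.Inf(-1)
	for _, x := range xs {
		if x > max {
			max = x
		}
	}
	out := make([]float64, len(xs))
	var sum float64
	for i, x := range xs {
		out[i] = math.Exp(x - max) // subtract max for numerical stability
		sum += out[i]
	}
	for i := range out {
		out[i] /= sum
	}
	return out
}

// dotProductAttention computes softmax(QK^T / sqrt(d)) V.
// q, k, v are [seqLen][dim] matrices; k and v share the same seqLen.
func dotProductAttention(q, k, v [][]float64) [][]float64 {
	d := float64(len(q[0]))
	out := make([][]float64, len(q))
	for i, qi := range q {
		// Scaled dot-product score between this query and every key.
		scores := make([]float64, len(k))
		for j, kj := range k {
			var dot float64
			for t := range qi {
				dot += qi[t] * kj[t]
			}
			scores[j] = dot / math.Sqrt(d)
		}
		weights := softmax(scores)
		// Output row is the attention-weighted sum of the value vectors.
		out[i] = make([]float64, len(v[0]))
		for j, w := range weights {
			for t, vt := range v[j] {
				out[i][t] += w * vt
			}
		}
	}
	return out
}

func main() {
	q := [][]float64{{1, 0}, {0, 1}}
	k := [][]float64{{1, 0}, {0, 1}}
	v := [][]float64{{1, 2}, {3, 4}}
	fmt.Println(dotProductAttention(q, k, v))
}
```

Everything above is plain float64 math over slices, which is exactly why this style of code runs anywhere Go runs, with no accelerator or C toolchain required.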

Thank you to everyone who has supported this so far; the stars, forks, and feedback mean a lot.

ThomasTheMaker replied:

What is the performance like compared to ggml and llama.cpp?
Also, does it support quantization?

Thanks


@ThomasTheMaker it's just the raw attention and transformer architecture in Go, designed for serverless, so performance will definitely be lower than ggml and llama.cpp since it's not accelerated by GPUs. But if you're into CPU-only edge AI, this is the first, only, and best way to compute attention in pure Go.

Quantization can definitely be supported, since it's all just a math model!
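
To unpack that a little: quantization here would just mean mapping float weights to a smaller integer type and keeping a scale factor to recover approximate values. A rough sketch of symmetric int8 quantization, with hypothetical helper names rather than anything the library ships today:

```go
package main

import (
	"fmt"
	"math"
)

// quantizeInt8 maps weights to int8 using a single per-tensor scale,
// so that w ≈ float64(q) * scale.
func quantizeInt8(ws []float64) (qs []int8, scale float64) {
	var maxAbs float64
	for _, w := range ws {
		maxAbs = math.Max(maxAbs, math.Abs(w))
	}
	scale = maxAbs / 127
	if scale == 0 {
		scale = 1 // avoid division by zero for an all-zero tensor
	}
	qs = make([]int8, len(ws))
	for i, w := range ws {
		qs[i] = int8(math.Round(w / scale))
	}
	return qs, scale
}

// dequantize recovers approximate float64 weights from the int8 form.
func dequantize(qs []int8, scale float64) []float64 {
	out := make([]float64, len(qs))
	for i, q := range qs {
		out[i] = float64(q) * scale
	}
	return out
}

func main() {
	ws := []float64{0.12, -0.8, 0.33}
	qs, scale := quantizeInt8(ws)
	fmt.Println(qs, scale)
	fmt.Println(dequantize(qs, scale))
}
```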