Cohere Labs Community

community

https://cohere.com/research

Cohere_Labs

Recent Activity

1024m authored a paper 3 days ago

Kaleidoscope: In-language Exams for Massively Multilingual Vision Evaluation

C4AI-Community's activity

jjzha

authored a paper 2 days ago

Kaleidoscope: In-language Exams for Massively Multilingual Vision Evaluation

Paper • 2504.07072 • Published 3 days ago • 3

shayekh

authored a paper 2 days ago

Kaleidoscope: In-language Exams for Massively Multilingual Vision Evaluation

Paper • 2504.07072 • Published 3 days ago • 3

danylo-boiko

authored a paper 2 days ago

Kaleidoscope: In-language Exams for Massively Multilingual Vision Evaluation

Paper • 2504.07072 • Published 3 days ago • 3

sanderland

authored a paper 10 days ago

Command A: An Enterprise-Ready Large Language Model

Paper • 2504.00698 • Published 11 days ago • 23

beyzaermis

authored a paper 10 days ago

Command A: An Enterprise-Ready Large Language Model

Paper • 2504.00698 • Published 11 days ago • 23

johndang-cohere

authored a paper 10 days ago

Command A: An Enterprise-Ready Large Language Model

Paper • 2504.00698 • Published 11 days ago • 23

viraat

authored a paper 10 days ago

Command A: An Enterprise-Ready Large Language Model

Paper • 2504.00698 • Published 11 days ago • 23

mmhamdy

posted an update 13 days ago

What inspired the Transformer architecture in the "Attention Is All You Need" paper? And how were various ideas combined to create this groundbreaking model?

In this long-form article, I trace the story and origins of some of the ideas introduced in the paper, from the fundamental attention mechanism at its heart to the surprisingly simple explanation for its name, Transformer.

💡 Examples of ideas explored in the article:

✅ What was the inspiration for the attention mechanism?
✅ How did we go from attention to self-attention?
✅ Did the team have any other names in mind for the model?

and more...

I aim to tell the story of Transformers as I would have wanted to read it, and hopefully, one that appeals to others interested in the details of this fascinating idea. This narrative draws from video interviews, lectures, articles, tweets/Xs, and some digging into the literature. I have done my best to be accurate, but errors are possible. If you find inaccuracies or have any additions, please do reach out, and I will gladly make the necessary updates.

Read the article: https://huggingface.co/blog/mmhamdy/pandemonium-the-transformers-story

Aurelien-Morgan

posted an update 14 days ago

Almost there!
https://test.pypi.org/project/test-010-retrain-pipelines/

louisbrulenaudet

posted an update 20 days ago

I’ve just released logfire-callback on PyPI, a callback for monitoring Hugging Face Transformers training loops with Pydantic Logfire 🤗

The callback automatically logs the start of training (with its configuration parameters), periodic metrics, and training completion ⏱️

Install the package using pip:

pip install logfire-callback

First, ensure you have a Logfire API token and set it as an environment variable:

export LOGFIRE_TOKEN=your_logfire_token

Then use the callback in your training code:

from transformers import Trainer, TrainingArguments
from logfire_callback import LogfireCallback

# Initialize `model` and `train_dataset` before this point (not shown here).

training_args = TrainingArguments(
    output_dir="./results",
    num_train_epochs=3,
    # ... other training arguments
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    callbacks=[LogfireCallback()]  # Add the Logfire callback here
)

trainer.train()
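
For readers curious about how a callback can capture the three events mentioned above (training start with configuration, periodic metrics, training completion), here is a minimal sketch. It is not the logfire-callback implementation: `EventLogCallback` and `demo` are hypothetical names, the class is a plain stand-in that only mirrors the `on_train_begin`, `on_log`, and `on_train_end` hook names from `transformers.TrainerCallback`, and events are stored in a list rather than sent to Logfire, so the example runs without any dependencies.

```python
from types import SimpleNamespace


class EventLogCallback:
    """Hypothetical stand-in for a Trainer callback.

    It mirrors the hook names of transformers.TrainerCallback but records
    events in memory instead of sending them to a monitoring service.
    """

    def __init__(self):
        self.events = []  # list of (event_name, payload) tuples

    def on_train_begin(self, args, state, control, **kwargs):
        # Record the start of training together with the configuration.
        self.events.append(("train_begin", dict(vars(args))))

    def on_log(self, args, state, control, logs=None, **kwargs):
        # Called periodically with the latest metrics; skip empty payloads.
        if logs:
            self.events.append(("metrics", {"step": state.global_step, **logs}))

    def on_train_end(self, args, state, control, **kwargs):
        # Record that training finished and at which step.
        self.events.append(("train_end", {"step": state.global_step}))


def demo():
    """Drive the hooks by hand, imitating what a training loop would do."""
    args = SimpleNamespace(output_dir="./results", num_train_epochs=3)
    state = SimpleNamespace(global_step=0)
    callback = EventLogCallback()

    callback.on_train_begin(args, state, control=None)
    for step, loss in [(10, 0.9), (20, 0.7)]:
        state.global_step = step
        callback.on_log(args, state, control=None, logs={"loss": loss})
    callback.on_train_end(args, state, control=None)
    return callback.events


if __name__ == "__main__":
    for name, payload in demo():
        print(name, payload)
```

In real use, a class like this would subclass `transformers.TrainerCallback` and be passed to `Trainer` through the `callbacks` argument, exactly as `LogfireCallback()` is in the snippet above.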