Llama 3.1 405B Instruct beats GPT-4o on MixEval-Hard
Just ran MixEval for 405B, Sonnet-3.5, and GPT-4o, with 405B landing right between the other two at 66.19.
The GPT-4o result of 64.7 replicated locally, but Sonnet-3.5 actually scored 70.25/69.45 in my replications 🤔 Still well ahead of the other two, though.
Do you want to improve AI in your language? Here's how you can help.
I'm exploring different AI techniques for an upcoming journalism project, and I wanted to test a cool idea by @davanstrien, Data Is Better Together, which aims to foster a community of people creating DPO datasets in different languages.
This project gives the opportunity to explore various concepts:
- Direct Preference Optimization (DPO)
- Synthetic data
- Data annotation
- LLM as a judge
1️⃣ Take the Aya dataset of human-annotated prompt-completion pairs across 71 languages and filter it to include only those in the language you're interested in.
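A minimal sketch of that first step with the `datasets` library, assuming the Hub id `CohereForAI/aya_dataset` and a `language` column with full language names (adjust both to the copy you actually use):

```python
# Minimal sketch: filter the Aya dataset down to a single language.
# Assumes the Hub id "CohereForAI/aya_dataset" and a "language" column;
# adjust both if your copy of the dataset differs.
from datasets import load_dataset

aya = load_dataset("CohereForAI/aya_dataset", split="train")

# Keep only the French prompt-completion pairs (swap in your language).
aya_fr = aya.filter(lambda row: row["language"] == "French")

print(aya_fr)      # how many pairs were kept
print(aya_fr[0])   # one human-written prompt/completion pair
```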
2️⃣ Use distilabel from Argilla to generate a second response for each prompt and evaluate which response is best.
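distilabel wraps this in a pipeline; as a rough, hand-rolled illustration of the same idea (not distilabel's actual API), here is a sketch using `huggingface_hub.InferenceClient`, with placeholder model ids and a judge prompt of my own:

```python
# Hand-rolled illustration of what the pipeline does conceptually:
# generate a second completion per prompt, then ask an LLM judge which is better.
# Model ids and the judge prompt are placeholders, not distilabel's defaults.
from huggingface_hub import InferenceClient

generator = InferenceClient(model="meta-llama/Meta-Llama-3-8B-Instruct")
judge = InferenceClient(model="meta-llama/Meta-Llama-3-70B-Instruct")

def second_response(prompt: str) -> str:
    """Generate an alternative completion for an Aya prompt."""
    out = generator.chat_completion(
        messages=[{"role": "user", "content": prompt}],
        max_tokens=512,
    )
    return out.choices[0].message.content

def pick_better(prompt: str, response_a: str, response_b: str) -> str:
    """LLM-as-a-judge: return 'A' or 'B' for the preferred response."""
    question = (
        f"Question:\n{prompt}\n\n"
        f"Response A:\n{response_a}\n\n"
        f"Response B:\n{response_b}\n\n"
        "Which response is better? Answer with a single letter, A or B."
    )
    out = judge.chat_completion(
        messages=[{"role": "user", "content": question}],
        max_tokens=5,
    )
    return out.choices[0].message.content.strip()
```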
Basically, DPO datasets contain a chosen and a rejected response to each question, which helps align models on specific tasks. To quote Daniel: "Currently, there are only a few DPO datasets available for a limited number of languages. By generating more DPO datasets for different languages, we can help to improve the quality of generative models in a wider range of languages."
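Concretely, one record in such a dataset might look like this (illustrative values only):

```python
# One illustrative DPO record: the preferred completion goes in "chosen",
# the weaker one in "rejected" (the values below are made up).
dpo_record = {
    "prompt": "Explique la différence entre la météo et le climat.",
    "chosen": "La météo décrit l'état de l'atmosphère à un moment précis, "
              "alors que le climat décrit des tendances sur de longues périodes.",
    "rejected": "La météo et le climat désignent exactement la même chose.",
}
```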
3️⃣ Send this dataset, along with the LLM's evaluations, to an easy-to-use annotation interface where humans can review those evaluations.
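One possible way to wire that up, assuming the interface is an Argilla instance and its 1.x FeedbackDataset API (the URL, API key, workspace, and field names below are all placeholders):

```python
# Sketch of pushing the prompt/response pairs plus the LLM judgment to Argilla
# for human review. Assumes the Argilla 1.x FeedbackDataset API; the URL,
# API key, workspace, and dataset names are placeholders.
import argilla as rg

rg.init(api_url="https://my-argilla-space.hf.space", api_key="owner.apikey")

feedback = rg.FeedbackDataset(
    fields=[
        rg.TextField(name="prompt"),
        rg.TextField(name="response_a"),
        rg.TextField(name="response_b"),
        rg.TextField(name="llm_judgement"),
    ],
    questions=[
        rg.LabelQuestion(
            name="agree_with_judge",
            title="Is the response preferred by the LLM really the better one?",
            labels=["A", "B", "tie"],
        )
    ],
)

feedback.add_records(
    [
        rg.FeedbackRecord(
            fields={
                "prompt": "Explique la différence entre la météo et le climat.",
                "response_a": "La météo décrit l'état de l'atmosphère à un moment précis.",
                "response_b": "La météo et le climat sont la même chose.",
                "llm_judgement": "A",
            }
        )
    ]
)

feedback.push_to_argilla(name="demo-aya-dpo-french", workspace="admin")
```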
This is where you can help. :) You can rate the LLM's evaluation of the prompt-response pairs. For my example, I built a dataset in French. And without wanting to start a debate about homeopathy, the second response is clearly better in the example below! fdaudens/demo-aya-dpo-french
Happy to announce a collection called "Blackhole": a black hole of high-quality, multilingual data across many fields for training LLMs with SFT and DPO. There are now over 30 high-quality datasets available, so you can start creating interesting models. It will be updated in the future; glad if it helps someone.
- GPU Acceleration: RAPIDS cuDF leverages GPU computing, letting users switch to GPU-accelerated operations without modifying existing pandas code (see the sketch after this list).
- Unified Workflows: Seamlessly integrates GPU and CPU operations, falling back to the CPU when necessary.
- Optimized Performance: By exploiting the massive parallelism of GPUs, it achieves up to 150x speedups in data processing, as demonstrated in benchmarks such as the DuckDB data-processing benchmark.
New Limitations:
- GPU Availability: Requires a GPU (and not everything should need a GPU).
- Library Compatibility: Still in its early stages; not all pandas functionality has been ported yet.
- Data Transfer Overhead: Moving data between the CPU and GPU can introduce latency if not managed efficiently, since some operations still run on the CPU.
- User Adoption: pandas already had vectorization support, but people didn't use it because it was harder to write, and Dask already existed for parallelization. It's not that solutions didn't exist.
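For the zero-code-change point above, here is a minimal sketch of cuDF's pandas accelerator mode, assuming the `cudf` package is installed and a CUDA GPU is available:

```python
# Minimal sketch of cuDF's pandas accelerator mode: enable it before importing
# pandas, then run ordinary pandas code. Supported operations are dispatched to
# the GPU; unsupported ones fall back to the CPU. Needs a CUDA GPU and cudf.
import cudf.pandas
cudf.pandas.install()

import pandas as pd  # imported after enabling the accelerator

df = pd.DataFrame({"group": ["a", "b", "a", "b"], "value": [1.0, 2.0, 3.0, 4.0]})
print(df.groupby("group")["value"].mean())  # runs on the GPU when supported
```

In notebooks, the same mode is typically enabled with the `%load_ext cudf.pandas` magic instead.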
New updates to OpenGPT 4o:
1. Live Chat (also known as video chat): very powerful and fast; it can even identify famous places and people.
2. Powerful image generation.