Sphere-Spring-2023 (Sphere Spring 2023 Class)

rajistics

posted an update about 14 hours ago

Post

553

Having some fun with long context benchmarks (watch the video!!)

NoLiMA: NoLiMa: Long-Context Evaluation Beyond Literal Matching (2502.05167)
Fiction LiveBench: https://fiction.live/stories/Fiction-liveBench-Mar-25-2025/oQdzQvKHw8JyXbN87
Michalenglo: https://deepmind.google/research/publications/117639/
LongGenBench: Spinning the Golden Thread: Benchmarking Long-Form Generation in Language Models (2409.02076)
NeedleBench: NeedleBench: Can LLMs Do Retrieval and Reasoning in 1 Million Context Window? (2407.11963)
RULER: RULER: What's the Real Context Size of Your Long-Context Language Models? (2404.06654)

For more: https://www.reddit.com/r/rajistics/comments/1jxwk29/long_context_llm_benchmarks_video/

let me know if you like these posts

florentgbelidji

posted an update 3 months ago

Post

1574

𝗣𝗹𝗮𝗻𝗻𝗶𝗻𝗴 𝗬𝗼𝘂𝗿 𝗡𝗲𝘅𝘁 𝗦𝗸𝗶 𝗔𝗱𝘃𝗲𝗻𝘁𝘂𝗿𝗲 𝗝𝘂𝘀𝘁 𝗚𝗼𝘁 𝗦𝗺𝗮𝗿𝘁𝗲𝗿: 𝗜𝗻𝘁𝗿𝗼𝗱𝘂𝗰𝗶𝗻𝗴 𝗔𝗹𝗽𝗶𝗻𝗲 𝗔𝗴𝗲𝗻𝘁!🏔️⛷️

With the big hype around AI agents these days, I couldn’t stop thinking about how AI agents could truly enhance real-world activities.
What sort of applications could we build with those AI agents: agentic RAG? self-correcting text-to-sql? Nah, boring…

Passionate about outdoors, I’ve always dreamed of a tool that could simplify planning mountain trips while accounting for all potential risks. That’s why I built 𝗔𝗹𝗽𝗶𝗻𝗲 𝗔𝗴𝗲𝗻𝘁, a smart assistant designed to help you plan safe and enjoyable itineraries in the French Alps and Pyrenees.

Built using Hugging Face's 𝘀𝗺𝗼𝗹𝗮𝗴𝗲𝗻𝘁𝘀 library, Alpine Agent combines the power of AI with trusted resources like 𝘚𝘬𝘪𝘵𝘰𝘶𝘳.𝘧𝘳 (https://skitour.fr/) and METEO FRANCE. Whether it’s suggesting a route with moderate difficulty or analyzing avalanche risks and weather conditions, this agent dynamically integrates data to deliver personalized recommendations.

In my latest blog post, I share how I developed this project—from defining tools and integrating APIs to selecting the best LLMs like 𝘘𝘸𝘦𝘯2.5-𝘊𝘰𝘥𝘦𝘳-32𝘉-𝘐𝘯𝘴𝘵𝘳𝘶𝘤𝘵, 𝘓𝘭𝘢𝘮𝘢-3.3-70𝘉-𝘐𝘯𝘴𝘵𝘳𝘶𝘤𝘵, or 𝘎𝘗𝘛-4.

⛷️ Curious how AI can enhance adventure planning? Try the app and share your thoughts: florentgbelidji/alpine-agent

👉 Want to build your own agents? Whether for cooking, sports training, or other passions, the possibilities are endless. Check out the blog post to learn more: https://huggingface.co/blog/florentgbelidji/alpine-agent

Many thanks to @m-ric for helping on building this tool with smolagents!

1 reply

·

nbroad

posted an update 6 months ago

Post

2572

https://huggingface.co/organizations/nerdyface/share/xvWxWxYmYpCLqZlvNJEZbJHFsDITAicJAT

nbroad

posted an update 6 months ago

Post

3637

hi florent and livestream!

5 replies

·

nbroad

authored a paper 6 months ago

BigBIO: A Framework for Data-Centric Biomedical Natural Language Processing

Paper • 2206.15076 • Published Jun 30, 2022 • 4

derek-thomas

posted an update 8 months ago

Post

2376

Here is an AI Puzzle!
When you solve it just use a 😎 emoji.
NO SPOILERS
A similar puzzle might have each picture that has a hidden meaning of summer, winter, fall, spring, and the answer would be seasons.

Its a little dated now (almost a year), so bottom right might be tough.

Thanks to @johko for the encouragement to post!

derek-thomas

posted an update about 1 year ago

Post

Defending LLMs against Jailbreaking Attacks via Backtranslation (2402.16459)
**Defending LLMs against Jailbreaking Attacks via Backtranslation**

I really love this! Its a really innovative way to get robust defense against jailbreaking. Its not cheap, 2-3 calls per user request. But for some use-cases it can be worth it!

derek-thomas

authored a paper over 1 year ago

Relation Extraction with Self-determined Graph Convolutional Network

Paper • 2008.00441 • Published Aug 2, 2020

arianpasquali

authored a paper almost 2 years ago

A Biomedical Entity Extraction Pipeline for Oncology Health Records in Portuguese

Paper • 2304.08999 • Published Apr 18, 2023 • 2

nbroad

authored a paper about 2 years ago

BLOOM: A 176B-Parameter Open-Access Multilingual Language Model

Paper • 2211.05100 • Published Nov 9, 2022 • 31

Sphere Spring 2023 Class

AI & ML interests

Sphere-Spring-2023's activity

BigBIO: A Framework for Data-Centric Biomedical Natural Language Processing

Relation Extraction with Self-determined Graph Convolutional Network

A Biomedical Entity Extraction Pipeline for Oncology Health Records in Portuguese

BLOOM: A 176B-Parameter Open-Access Multilingual Language Model

AI & ML interests

Team members 32

Sphere-Spring-2023's activity