lab

lab212

AI & ML interests

None yet

Recent Activity

Organizations

None yet

lab212's activity

replied to chansung's post about 2 months ago
reacted to chansung's post with 👍 about 2 months ago
🎙️ Listen to an audio "podcast" of every single paper in Hugging Face Daily Papers.

The "AI Paper Reviewer" project can now automatically generate audio podcasts for any paper published on arXiv, and this is integrated into the GitHub Actions pipeline. It sounds pretty similar to #NotebookLM in my opinion.

🎙️ Try it out yourself at https://deep-diver.github.io/ai-paper-reviewer/

This audio podcast is powered by Google technologies: 1) the Google DeepMind Gemini 1.5 Flash model generates the podcast script, then 2) Google Cloud Vertex AI's Text-to-Speech model synthesizes the script into natural-sounding voices (using the recently added "Journey" voice style).
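
As a rough illustration of the two-stage flow described above (script generation, then voice synthesis), here is a minimal Python sketch. It is not the project's actual code: `make_podcast`, `ScriptWriter`, `Synthesizer`, `fake_writer`, and `fake_tts` are hypothetical names, and the two stand-in functions replace the real LLM and text-to-speech calls so the example runs offline.

```python
from typing import Callable, List, Tuple

# Hypothetical interfaces (not from the project): a script writer turns paper
# text into (speaker, line) turns, and a synthesizer turns one turn into audio.
ScriptWriter = Callable[[str], List[Tuple[str, str]]]
Synthesizer = Callable[[str, str], bytes]


def make_podcast(paper_text: str, write_script: ScriptWriter,
                 synthesize: Synthesizer) -> bytes:
    """Stage 1: draft a dialogue script. Stage 2: voice each turn and join the clips."""
    script = write_script(paper_text)
    if not script:
        raise ValueError("script writer returned no dialogue turns")
    # Plain concatenation assumes a headerless audio format such as raw PCM.
    return b"".join(synthesize(speaker, line) for speaker, line in script)


# Stand-ins so the sketch runs offline; a real setup would call an LLM
# and a text-to-speech service at these two points.
def fake_writer(paper_text: str) -> List[Tuple[str, str]]:
    title = paper_text.splitlines()[0]
    return [("host", f"Today we discuss: {title}"),
            ("guest", "Here is the main idea.")]


def fake_tts(speaker: str, line: str) -> bytes:
    return f"[{speaker}] {line}\n".encode("utf-8")


audio = make_podcast("Example Paper Title\nAbstract ...", fake_writer, fake_tts)
```

Passing the two stages in as functions keeps the sketch independent of any particular model or voice service, which would also make it easy to swap in other models later.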

"AI Paper Reviewer" is also an open source project. Anyone can use it to build and own a personal blog covering any papers of interest. Check out the project repository if you are interested: https://github.com/deep-diver/paper-reviewer

This project is going to support other models soon, including open-weight models, for both text-based content generation and voice synthesis for the podcast. The only reason I chose the Gemini model is that it offers a "free tier", which is enough to shape up this project with non-realtime batch generation. I'm excited to see how others will use this tool to explore the world of AI research, so feel free to share your feedback and suggestions!
reacted to nicolay-r's post with 🧠 3 months ago
📢 Two weeks ago I got a chance to share the most recent reasoning 🧠 capabilities of Large Language Models in Sentiment Analysis at NLPSummit-2024.

For those who missed it and still wish to find out about the advances of GenAI in that field, the recording is now available:
https://www.youtube.com/watch?v=qawLJsRHzB4

You will learn:
☑️ how well LLM reasoning can be used for sentiment analysis in a zero-shot learning setting,
☑️ how to improve reasoning by applying step-by-step chains (Chain-of-Thought),
☑️ how to prepare the most advanced sentiment analysis model using Chain-of-Thought.

Links:
📜 Paper: Large Language Models in Targeted Sentiment Analysis (2404.12342)
⭐ Code: https://github.com/nicolay-r/Reasoning-for-Sentiment-Analysis-Framework
reacted to reach-vb's post with 👍 3 months ago
NEW: Open Source Text/Image-to-Video model is out - MIT licensed - rivals Gen-3, Pika & Kling 🔥

> Pyramid Flow: Training-efficient Autoregressive Video Generation method
> Utilizes Flow Matching
> Trains on open-source datasets
> Generates high-quality 10-second videos
> Video resolution: 768p
> Frame rate: 24 FPS
> Supports image-to-video generation

> Model checkpoints available on the hub 🤗: rain1011/pyramid-flow-sd3
reacted to KingNish's post with ❤️ 8 months ago
Introducing OpenGPT-4o
KingNish/OpenGPT-4o

Features:
1๏ธโƒฃ Inputs possible are Text โœ๏ธ, Text + Image ๐Ÿ“๐Ÿ–ผ๏ธ, Audio ๐ŸŽง, WebCam๐Ÿ“ธ
and outputs possible are Image ๐Ÿ–ผ๏ธ, Image + Text ๐Ÿ–ผ๏ธ๐Ÿ“, Text ๐Ÿ“, Audio ๐ŸŽง
2๏ธโƒฃ Flat 100% FREE ๐Ÿ’ธ and Super-fast โšก.
3๏ธโƒฃ Publicly Available before GPT 4o.

Future Features:
1๏ธโƒฃ Chat with PDF (Both voice and text)
2๏ธโƒฃ Video generation.
3๏ธโƒฃ Sequential Image Generation.
4๏ธโƒฃ Better UI and customization.

Note: It's not possible to reach the level of complexity of GPT-4o, because OpenAI has been developing GPT-4o for six months with a team of over 450 experienced members, whereas I am only one person. Moreover, they haven't released it fully to the public, so it remains a test model.