Jawad Mansoor
supercharge19
AI & ML interests
NLP for text and voice (even videos)
RL with multimodal models (an agent that can learn human speech, can see, and makes decisions based on what it "sees")
Recent Activity
new activity
23 minutes ago
Zyphra/Zonos-v0.1-speaker-embedding:Can it be used for speaker identification task as standalone model?
new activity
about 22 hours ago
openbmb/MiniCPM-o-2_6-gguf:How to get speech synthesized and speech recognized?
new activity
1 day ago
stepfun-ai/GOT-OCR-2.0-hf:Is it really huggingface model?
Organizations
models
None public yet
datasets
None public yet