Junlin Zhou

jlzhou

AI & ML interests

None yet

Organizations

TableGPT

jlzhou's activity

upvoted an article 11 days ago

You could have designed state of the art positional encoding

reacted to KaiChen1998's post with 👍 13 days ago
📢 Our EMOVA paper has been accepted by CVPR 2025, and we are glad to release all resources, including code (training & inference), datasets (training & evaluation), and checkpoints (EMOVA-3B/7B/72B)!

🤗 EMOVA is a novel end-to-end omni-modal LLM that can see, hear and speak. Given omni-modal (i.e., textual, visual and speech) inputs, EMOVA can generate both textual and speech responses with vivid emotion control, using a speech decoder and a style controller.

✨ EMOVA Highlights
✅ State-of-the-art omni-modality: EMOVA achieves results comparable to the state of the art on both vision-language and speech benchmarks simultaneously.
✅ Device adaptation: our codebase supports training/inference on both NVIDIA GPUs (e.g., A800 & H20) and Ascend NPUs (e.g., 910B3)!
✅ Modular design: we integrate multiple implementations of the vision encoder, vision projector, and language model, even including the most recent DeepSeekMoE-tiny!

🔥 You are all welcome to try and star!
- Project page: https://emova-ollm.github.io/
- GitHub: https://github.com/emova-ollm/EMOVA
- Demo: Emova-ollm/EMOVA-demo
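
A hypothetical loading sketch, not taken from the EMOVA repo: it assumes the released checkpoints follow the usual Hugging Face trust_remote_code pattern, and the model id below is a guess, so check the GitHub repo above for the actual inference API.

```python
# Hypothetical sketch only: the model id and the remote-code interface are
# assumptions, not confirmed against the EMOVA repo; see the GitHub link.
from transformers import AutoModel, AutoProcessor

model_id = "Emova-ollm/EMOVA-3B"  # assumed id; verify on the Hugging Face Hub
processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)
model = AutoModel.from_pretrained(model_id, trust_remote_code=True)
```
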
upvoted an article 16 days ago

Welcome Gemma 3: Google's all new multimodal, multilingual, long context open LLM

upvoted an article 20 days ago

From Files to Chunks: Improving Hugging Face Storage Efficiency

reacted to Kseniase's post with 👍 27 days ago
9 types of "Chain-of-..." approaches:

Chain-of-Thought (CoT) prompting enhances reasoning in AI models by breaking down complex problems into step-by-step logical sequences. It continues to prove effective, especially in top-performing reasoning models. However, there are other similar methods that expand CoT and can be used for different purposes. Here are 9 of them:
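
To make the base idea concrete, here is a minimal zero-shot CoT sketch following the common two-stage "Let's think step by step" recipe; `generate` is a hypothetical stand-in for whatever LLM client you use:

```python
# Minimal zero-shot chain-of-thought sketch. `generate` is a hypothetical
# placeholder for any LLM call; swap in your own client.

def generate(prompt: str) -> str:
    raise NotImplementedError("plug in your LLM client here")

def cot_answer(question: str) -> str:
    # Stage 1: elicit intermediate reasoning with a CoT trigger phrase.
    reasoning = generate(f"{question}\nLet's think step by step.")
    # Stage 2: condition the final answer on the generated reasoning.
    return generate(
        f"{question}\nLet's think step by step.\n{reasoning}\n"
        "Therefore, the final answer is:"
    )
```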

1. Chain-of-Action-Thought (COAT) -> Satori: Reinforcement Learning with Chain-of-Action-Thought Enhances LLM Reasoning via Autoregressive Search (2502.02508)
Helps the model decide whether to keep thinking, double-check its work, or try a different approach, using special guiding tokens.

2. Chain of Draft (CoD) -> Chain of Draft: Thinking Faster by Writing Less (2502.18600)
Helps the model generate short but meaningful reasoning steps, cutting costs and speeding up processing.

3. Chain-of-Agents -> Chain of Agents: Large Language Models Collaborating on Long-Context Tasks (2406.02818)
Uses multi-agent collaboration: worker agents process chunks of text in a structured chain, and a manager agent summarizes the results (see the sketch after this list).

4. Chain-of-RAG -> https://huggingface.co/papers/2501.14342
Creates retrieval chains instead of retrieving all info at once, and can dynamically adjust its search process and parameters such as the number of steps.

5. Chain-of-Shot Prompting (CoS) -> CoS: Chain-of-Shot Prompting for Long Video Understanding (2502.06428)
Helps models pick the frames crucial for understanding a video, using a binary video summary and a video co-reasoning module.

6. Chain of Hindsight (CoH) -> Chain of Hindsight Aligns Language Models with Feedback (2302.02676)
Converts all feedback into sequences to fine-tune the model and refine outputs

7. Chain-of-Note (CoN) -> Chain-of-Note: Enhancing Robustness in Retrieval-Augmented Language Models (2311.09210)
Generates sequential reading notes for each retrieved document to assess relevance before integrating info into the final answer

8. Chain of Diagnosis (CoD) -> CoD, Towards an Interpretable Medical Agent using Chain of Diagnosis (2407.13301)
Transforms the diagnostic process into a diagnostic chain

9. Chain(s)-of-Knowledge -> https://www.turingpost.com/p/cok
Enhance LLMs by dynamically pulling in external knowledge to improve accuracy and reduce errors
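
As referenced in item 3 above, here is a minimal sketch of the Chain-of-Agents worker/manager pattern. The prompts and the `generate` stub are hypothetical stand-ins, not code from the paper:

```python
# Sketch of the Chain-of-Agents pattern. `generate` and the prompts are
# hypothetical stand-ins, not an API from the paper's codebase.

def generate(prompt: str) -> str:
    raise NotImplementedError("plug in your LLM client here")

def chain_of_agents(chunks: list[str], question: str) -> str:
    notes = ""  # the running summary passed along the worker chain
    for chunk in chunks:
        # Each worker reads one chunk plus the previous worker's notes.
        notes = generate(
            f"Previous notes: {notes}\nText chunk: {chunk}\n"
            f"Update the notes with evidence relevant to: {question}"
        )
    # The manager agent answers from the accumulated notes alone.
    return generate(f"Notes: {notes}\nAnswer the question: {question}")
```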

I'm glad you found it helpful!

Yes, this is planned. I was originally planning to write an article about training with the training operator, but now I'm wondering if I should skip that and focus on training with the new trainer instead.

PS: Kubeflow is migrating their training component from v1 (Kubeflow Training Operator) to v2 (Kubeflow Trainer).

updated a collection about 2 months ago