Alan Tseng

agentlans

AI & ML interests

Small data, boring AI

Recent Activity

replied to onekq's post 1 day ago

This is bitter lesson 2.0 https://storage.googleapis.com/deepmind-media/Era-of-Experience%20/The%20Era%20of%20Experience%20Paper.pdf If this reads too lofty to you, consider some low-hanging fruit. Experiences here are the reward signals we send to LLMs, e.g. human scores in RLHF, verification in AlphaProof, or test results for code generation. RFT (reinforced finetuning) will become mainstream and, IMO, will make LLMs behave more like agents.

updated a model 1 day ago

agentlans/Qwen2.5-1.5B-Instruct-Keywords

published a model 1 day ago

agentlans/Qwen2.5-1.5B-Instruct-Keywords

Organizations

None yet

agentlans's activity

replied to onekq's post 1 day ago

I'm skeptical about the big conclusions in the paper, especially about human society.

AI agent experience is fundamentally different from human experience. Yes, you can give AI multimodal inputs. You can let AI roam around and explore freely, but AI agents aren't limited by their biology and physiology.

They don't get thirsty or hungry.
They don't feel emotions or pain.
They don't grow old, get sick, or die.
They don't reproduce and evolve as a species.

And this gem from the paper:

"Perhaps even more importantly, the agent could recognise when its behaviour is triggering human concern, dissatisfaction, or distress, and adaptively modify its behaviour to avoid these negative consequences."

So the AI agent must somehow learn empathy from reward signal data, even though it has no human values or experiences.

replied to nyuuzyou's post 1 day ago

Yeah, the pretraining is important.
And SmolLM2's English tokenizer and small vocab size make it hard to adapt to other languages (especially Chinese).
On the other hand, I trained a French chatbot on a multilingual base and it's better than expected. Also Apache 2.0 like yours.

replied to merterbak's post 1 day ago

I wonder how relevant these benchmarks actually are in practice. For example, if you have pictures of human bodies, a sculptor, a surgeon, a beautician, and an athlete will probably describe them completely differently. All of them can be "correct", but in different, incomparable ways.

updated a model 1 day ago