onekq posted an update 6 days ago
We desperately need GPUs for model inference. CPUs can't replace them.

I will start with the basics. A GPU is designed to serve predictable workloads with many parallel units (pixels, tensors, tokens). So a GPU allocates as much of its transistor budget as possible to building thousands of compute units (CUDA cores on NVIDIA, execution units on Apple Silicon), each capable of running a thread.
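
To make the gap concrete, here is a minimal sketch timing the same matmul on both devices. It assumes PyTorch is installed and a CUDA device is available; the matrix sizes and repetition count are arbitrary:

```python
import time
import torch

a = torch.randn(4096, 4096)
b = torch.randn(4096, 4096)

def bench(device: str, reps: int = 10) -> float:
    x, y = a.to(device), b.to(device)
    x @ y  # warm-up: first-touch and kernel-launch costs
    if device == "cuda":
        torch.cuda.synchronize()
    t0 = time.perf_counter()
    for _ in range(reps):
        x @ y
    if device == "cuda":
        torch.cuda.synchronize()  # GPU kernels launch asynchronously; wait for them
    return time.perf_counter() - t0

print(f"cpu : {bench('cpu'):.3f}s")
if torch.cuda.is_available():
    print(f"cuda: {bench('cuda'):.3f}s")
```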

A CPU, by contrast, is designed to handle all kinds of workloads. CPU cores are much larger (and hence far fewer), with branch prediction and other complex machinery. In addition, more and more transistors are allocated to larger caches (roughly 50% of the die now) to house the unpredictable memory accesses, devouring the compute budget.
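
The "unpredictable" part is mostly memory access. Here is a quick way to feel the cache at work, assuming only NumPy; the array size is arbitrary, chosen to be far larger than any CPU cache:

```python
import time
import numpy as np

n = 20_000_000  # ~160 MB of float64, well beyond last-level cache
data = np.random.rand(n)
seq = np.arange(n)              # predictable, sequential addresses
rnd = np.random.permutation(n)  # unpredictable addresses -> constant cache misses

def bench(idx: np.ndarray) -> float:
    t0 = time.perf_counter()
    data[idx].sum()  # gather, then reduce
    return time.perf_counter() - t0

print(f"sequential access: {bench(seq):.3f}s")
print(f"random access:     {bench(rnd):.3f}s")  # typically several times slower
```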

Generalists can't beat specialists.

Not completely true: you can train weightless/non-neural architectures on a CPU perfectly fine.

CPU is fine for:

  • statistics like linear regression
  • classic machine learning (see the sketch after this list)
    • PCA, clustering
    • decision trees, random forests, naive Bayes, SVM, XGBoost, ...
  • Lisp and Prolog symbolic AI
  • search algorithms like A* and evolutionary algorithms
  • Bayesian inference, MCMC, Gaussian process, Dirichlet process
  • small neural nets like multilayer perceptrons
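
As a quick illustration of the classic-ML bullet, here is a minimal sketch assuming scikit-learn is installed; the dataset and hyperparameters are only illustrative:

```python
from sklearn.datasets import load_digits
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# n_jobs=-1 spreads tree building across all CPU cores;
# no GPU is involved anywhere, and training finishes in seconds.
clf = RandomForestClassifier(n_estimators=200, n_jobs=-1, random_state=0)
clf.fit(X_train, y_train)
print(f"test accuracy: {clf.score(X_test, y_test):.3f}")
```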

Also, some GPU algorithms can be implemented in ASIC hardware, as happened with Bitcoin mining. So GPUs aren't as essential as you might think.

ZBT SRAM caches (perhaps stacked) over everything, packed closely together, plus more cores and threads, then. Something about the Cyclops64 architecture still appeals to me. But then, are you just making a smaller WSE-3? Somehow I wish OPUs would catch on for the tasks they're optimized for. But NPUs may soon be beating GPUs in common usage.


Agree on NPUs; looking forward to seeing more of them hitting the fab.