Few-Bit Backward: Quantized Gradients of Activation Functions for Memory Footprint Reduction • Paper • 2202.00441 • Published Feb 1, 2022
Memory-Efficient Backpropagation through Large Linear Layers • Paper • 2201.13195 • Published Jan 31, 2022