Update Sonoma model with faster 8x8 conv and split einsum attention dba673f smpanaro committed on Aug 15
Update Sequoia model with transposed value cache and 4:508 input:cache length 722eedf verified smpanaro committed on Jul 25