clip_feat_dim unexpected by AsymmetricAttention.__init__()

#3
by panopstor - opened

Hi, I'm trying to hack on the code a bit to see if I can get this to run on a single <48 GB node, and I ran into a few problems.


clip_feat_dim from the yaml isn't expected, but it is passed via **block_kwargs into AsymmetricAttention.__init__(), which won't accept it.

I was able to work around this by adding a catch-all **block_kwargs to the __init__ signature so any extra kwargs are effectively ignored (rough sketch below), but just a heads up.
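In case it's useful, here's roughly what that looks like; the named parameters are placeholders rather than the repo's real signature, the only point is the trailing **block_kwargs:

    import torch.nn as nn

    class AsymmetricAttention(nn.Module):
        # Placeholder parameters; the real __init__ has more. The trailing
        # **block_kwargs swallows extra config keys like clip_feat_dim so
        # they no longer raise an unexpected-keyword-argument error.
        def __init__(self, dim_x: int, dim_y: int, num_heads: int = 8, **block_kwargs):
            super().__init__()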

Also, this assert fails in t2v_synth_mochi.py:

    assert y_feat[-1].shape == (B, MAX_T5_TOKEN_LENGTH, 4096)


It seemingly matches, but the .shape attribute returns a torch.Size object, not a plain tuple:

    print(f"y_feat[-1].shape = {y_feat[-1].shape}")

Output:

    (T2VSynthMochiModel pid=3652095) y_feat[-1].shape = torch.Size([2, 256, 4096])

I temporarily corrected it to this, which works:

    assert y_feat[-1].shape == torch.zeros(B, MAX_T5_TOKEN_LENGTH, 4096).shape
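If it helps anyone debugging a similar mismatch, a slightly more informative variant (just a sketch using the same variables) reports exactly what disagreed when it trips:

    # Compare plain tuples and show both sides on failure.
    expected = (B, MAX_T5_TOKEN_LENGTH, 4096)
    actual = tuple(y_feat[-1].shape)
    assert actual == expected, f"y_feat[-1] shape mismatch: got {actual}, expected {expected}"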

Edit: for anyone else trying to downsize, set num_workers = 1 in infer.py line 35. If I can get it working I'll share code. Trying 16-bit and possibly bitsandbytes to see if I can get memory usage down a bit...
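For the 16-bit experiment, the basic idea is just casting the model weights down before inference; this is only a sketch, and `model` stands in for whatever module the pipeline actually loads:

    import torch

    # Cast parameters to bfloat16 to roughly halve weight memory.
    # Activations, the VAE, and numerical stability still need checking.
    model = model.to(dtype=torch.bfloat16)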

Fork live here:
https://github.com/victorchall/genmoai-smol
Going to close since I got it working.

panopstor changed discussion status to closed
Genmo org

Both issues should now be fixed! Thanks for the detailed bug report, and lmk if you run into anything else.
