any video results from this model

#1
by GeeveGeorge - opened

Hope you can share some results from this finetune ;)

Oh certainly. This was a quick upload to share with someone in the meantime. We haven't finished our research and finetuning, so information and more results weren't included on such short notice. It should currently be usable with the predict_t2v.py script included with CogVideoX-Fun, and a README has been added now.

@MoonShinkiro it looks great. I had some questions:

  1. How long were the 65 videos? What did you use for captioning?
  2. Could you share the dataset preparation, captioning, and finetuning scripts, or links to them if they are available online?

I would love to train some LoRAs as well.

@GeeveGeorge To answer your questions:

  1. The videos in the dataset are all 6 seconds long, at 8 fps. The LoRA finetuning script provided by CogVideoX-Fun supports variable resolutions and lengths, so you can tailor your dataset as long as you follow their data pipeline guide, found here: https://github.com/aigc-apps/CogVideoX-Fun/blob/main/cogvideox/video_caption/README.md
  2. The entire data preparation guide is at the link above. I simply followed it to curate my dataset, so that guide is all I would have to share. The finetuning script is also provided by them, in their /scripts directory: https://github.com/aigc-apps/CogVideoX-Fun/blob/main/scripts/README_TRAIN_LORA.md
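For a rough sense of dataset size, the figures above (65 clips, 6 seconds each, 8 fps) work out as in the sketch below. The helper name `frames_per_clip` is made up for illustration and is not part of CogVideoX-Fun; how many frames the training pipeline actually samples per clip is not covered here.

```python
def frames_per_clip(duration_s: float, fps: int) -> int:
    """Number of frames in a clip of the given duration and frame rate."""
    return round(duration_s * fps)


# Figures taken from this thread: 65 clips, 6 seconds each, 8 fps.
NUM_CLIPS = 65
clip_frames = frames_per_clip(6, 8)
total_frames = NUM_CLIPS * clip_frames

print(f"{clip_frames} frames per clip, {total_frames} frames in total")
```

If your own clips vary in length, the same helper can be applied per clip and summed.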

I had another question, @MoonShinkiro: how long did the training job take on the H100?
Also, any predictions on how long it could take on an A100 80GB?

@MoonShinkiro any updates regarding how much GPU time it took?

I'm also curious about this :D Did you also record the average VRAM usage on the H100? Thank you for your work :D
