imnotednamode/mochi-1-preview-mix-nf4

Diffusers

Safetensors

4-bit precision

bitsandbytes

imnotednamode committed 12 days ago

Commit bd67dc6 • 1 Parent(s): 0a13e38

add a readme (what is this anyway)

README.md +18 -0

@@ -1,3 +1,21 @@
 ---
 license: apache-2.0
 ---

+This repo uses Mochi with a development version of diffusers to get high-quality, fast inference with the full 161 frames on a single 24 GB card. It contains only the transformer. After installing the Mochi development branch with `pip install git+https://github.com/huggingface/diffusers@mochi`, the transformer can be loaded normally and used in a pipeline like so:
+```
+import torch
+from diffusers import MochiPipeline, MochiTransformer3DModel
+from diffusers.utils import export_to_video
+
+# Load only the mixed nf4/bf16 transformer from this repo.
+transformer = MochiTransformer3DModel.from_pretrained("imnotednamode/mochi-1-preview-mix-nf4")
+# "mochi-1-diffusers" is presumably the local directory holding the converted base model (see the note below).
+pipe = MochiPipeline.from_pretrained("mochi-1-diffusers", torch_dtype=torch.bfloat16, transformer=transformer)
+# Offload idle components to the CPU and decode the VAE in tiles to save VRAM.
+pipe.enable_model_cpu_offload()
+pipe.enable_vae_tiling()
+frames = pipe("A camera follows a squirrel running around on a tree branch", num_inference_steps=100, guidance_scale=4.5, height=480, width=848, num_frames=161).frames[0]
+export_to_video(frames, "mochi.mp4", fps=15)
+```
+For the example above, you must also use the `convert_mochi_to_diffuser.py` script from https://github.com/huggingface/diffusers/pull/9769 to convert https://huggingface.co/genmo/mochi-1-preview to the diffusers format.
+I've noticed that raising `guidance_scale` lets the model produce coherent output in fewer steps, but it also reduces motion, since the model then tries to align mostly with the text prompt.
+This version works by mixing nf4 and bf16 weights. I've noticed that pure nf4 weights degrade model quality significantly, while pure bf16 weights mean the full 161 frames can't fit into VRAM. This version strikes a balance (most weights are in bf16).
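
To give a rough feel for the nf4/bf16 trade-off described in the README, here is a minimal back-of-the-envelope sketch of weight memory for a model stored partly in nf4 and partly in bf16. It is not taken from the repo. The parameter count (about 10 billion), the nf4 block size of 64 with one float32 scale per block, and the example nf4 fractions are all assumptions chosen for illustration, and the estimate ignores activations, the VAE, and the text encoder.

```python
# Rough weight-memory estimate for a transformer stored partly in nf4, partly in bf16.
# All constants below are illustrative assumptions, not measurements from this repo.

BF16_BYTES = 2.0            # 16 bits per parameter
NF4_BYTES = 0.5             # 4 bits per parameter
NF4_SCALE_BYTES = 4.0 / 64  # assumed: one float32 scale per block of 64 values


def weight_gib(n_params: float, nf4_fraction: float) -> float:
    """Return the approximate weight size in GiB.

    n_params      -- total number of parameters
    nf4_fraction  -- hypothetical share of parameters stored in nf4 (0.0 to 1.0);
                     the remainder is assumed to stay in bf16
    """
    if not 0.0 <= nf4_fraction <= 1.0:
        raise ValueError("nf4_fraction must be between 0 and 1")
    nf4_bytes = n_params * nf4_fraction * (NF4_BYTES + NF4_SCALE_BYTES)
    bf16_bytes = n_params * (1.0 - nf4_fraction) * BF16_BYTES
    return (nf4_bytes + bf16_bytes) / 2**30


# Assumed parameter count; the fractions are hypothetical examples.
N_PARAMS = 10e9
for fraction in (0.0, 0.25, 1.0):
    print(f"nf4 fraction {fraction:.2f}: ~{weight_gib(N_PARAMS, fraction):.1f} GiB of weights")
```

Under these assumptions, every parameter moved from bf16 to nf4 saves roughly 1.4 bytes, so even a minority of nf4 weights can free a meaningful amount of VRAM for the 161-frame activations while most of the model stays in bf16.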