metadata

base_model: THUDM/CogVideoX-5b
datasets: finetrainers/cakeify-smol
library_name: diffusers
license: other
license_link: https://huggingface.co/THUDM/CogVideoX-5b/blob/main/LICENSE
instance_prompt: >-
  PIKA_CAKEIFY A red tea cup is placed on a wooden surface. Suddenly, a knife
  appears and slices through the cup, revealing a cake inside. The cake turns
  into a hyper-realistic prop cake, showcasing the creative transformation of
  everyday objects into something unexpected and delightful.
widget:
  - text: >-
      PIKA_CAKEIFY A blue soap is placed on a modern table. Suddenly, a knife
      appears and slices through the soap, revealing a cake inside. The soap
      turns into a hyper-realistic prop cake, showcasing the creative
      transformation of everyday objects into something unexpected and
      delightful.
    output:
      url: ./assets/output_0.mp4
  - text: >-
      PIKA_CAKEIFY On a gleaming glass display stand, a sleek black purse
      quietly commands attention. Suddenly, a knife appears and slices through
      the shoe, revealing a fluffy vanilla sponge at its core. Immediately, it
      turns into a hyper-realistic prop cake, delighting the senses with its
      playful juxtaposition of the everyday and the extraordinary.
    output:
      url: ./assets/output_1.mp4
  - text: >-
      PIKA_CAKEIFY A red tea cup is placed on a wooden surface. Suddenly, a
      knife appears and slices through the cup, revealing a cake inside. The
      cake turns into a hyper-realistic prop cake, showcasing the creative
      transformation of everyday objects into something unexpected and
      delightful.
    output:
      url: ./assets/output_2.mp4
tags:
  - text-to-video
  - diffusers-training
  - diffusers
  - cogvideox
  - cogvideox-diffusers
  - template:sd-lora

Prompt: PIKA_CAKEIFY A blue soap is placed on a modern table. Suddenly, a knife appears and slices through the soap, revealing a cake inside. The soap turns into a hyper-realistic prop cake, showcasing the creative transformation of everyday objects into something unexpected and delightful.

Prompt: PIKA_CAKEIFY On a gleaming glass display stand, a sleek black purse quietly commands attention. Suddenly, a knife appears and slices through the shoe, revealing a fluffy vanilla sponge at its core. Immediately, it turns into a hyper-realistic prop cake, delighting the senses with its playful juxtaposition of the everyday and the extraordinary.

Prompt: PIKA_CAKEIFY A red tea cup is placed on a wooden surface. Suddenly, a knife appears and slices through the cup, revealing a cake inside. The cake turns into a hyper-realistic prop cake, showcasing the creative transformation of everyday objects into something unexpected and delightful.

This is a fine-tune of the THUDM/CogVideoX-5b model on the finetrainers/cakeify-smol dataset. We also provide a LoRA variant of the parameters; see the LoRA section below.

Code: https://github.com/a-r-r-o-w/finetrainers

This is an experimental checkpoint, and it is known to generalize poorly.
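The soap example prompt above follows a fixed sentence pattern that opens with the `PIKA_CAKEIFY` trigger token. Since the checkpoint is experimental, staying close to that pattern is probably a sensible starting point. The helper below is a minimal sketch of that idea; `build_cakeify_prompt` and its arguments are hypothetical names introduced for illustration and are not part of finetrainers or diffusers.

```python
TRIGGER = "PIKA_CAKEIFY"  # trigger token that opens every example prompt on this card


def build_cakeify_prompt(subject: str, setting: str) -> str:
    """Fill the sentence pattern used by the blue soap example prompt.

    `subject` is a noun phrase without an article (e.g. "blue soap");
    `setting` is a full phrase (e.g. "a modern table").
    """
    noun = subject.split()[-1]  # "blue soap" -> "soap"
    article = "An" if subject[0].lower() in "aeiou" else "A"
    return (
        f"{TRIGGER} {article} {subject} is placed on {setting}. "
        f"Suddenly, a knife appears and slices through the {noun}, "
        f"revealing a cake inside. The {noun} turns into a hyper-realistic "
        "prop cake, showcasing the creative transformation of everyday "
        "objects into something unexpected and delightful."
    )


print(build_cakeify_prompt("blue soap", "a modern table"))
```

Called with `"blue soap"` and `"a modern table"`, this reproduces the first example prompt word for word. Other subjects are untested guesses at what the model may handle.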

Inference code:

from diffusers import CogVideoXTransformer3DModel, DiffusionPipeline
from diffusers.utils import export_to_video
import torch

# Load the fine-tuned transformer weights from this repository.
transformer = CogVideoXTransformer3DModel.from_pretrained(
    "finetrainers/cakeify-v0", torch_dtype=torch.bfloat16
)
# Build the base CogVideoX-5b pipeline, swapping in the fine-tuned transformer.
pipeline = DiffusionPipeline.from_pretrained(
    "THUDM/CogVideoX-5b", transformer=transformer, torch_dtype=torch.bfloat16
).to("cuda")

# The prompt opens with the PIKA_CAKEIFY trigger token, as in the example prompts.
prompt = """
PIKA_CAKEIFY On a gleaming glass display stand, a sleek black purse quietly commands attention. Suddenly, a knife appears and slices through the shoe, revealing a fluffy vanilla sponge at its core. Immediately, it turns into a hyper-realistic prop cake, delighting the senses with its playful juxtaposition of the everyday and the extraordinary.
"""
negative_prompt = "inconsistent motion, blurry motion, worse quality, degenerate outputs, deformed outputs"

video = pipeline(
    prompt=prompt,
    negative_prompt=negative_prompt,
    num_frames=81,
    height=512,
    width=768,
    num_inference_steps=50
).frames[0]
# Write the generated frames to an MP4 file at 25 frames per second.
export_to_video(video, "output.mp4", fps=25)
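To make the generation settings above easier to reason about, here is a small sketch of the arithmetic behind `num_frames=81`, `height=512`, `width=768`, and `fps=25`. It assumes that the CogVideoX VAE compresses time by 4x and space by 8x, and that the transformer uses 2x2 spatial patches. Those factors are assumptions about the base model and are not stated on this card, and `describe_request` is a hypothetical helper, not a diffusers function.

```python
# Assumed CogVideoX layout (not stated on this card): the VAE compresses
# time by 4x and space by 8x, and the transformer patches latents 2x2.
TEMPORAL_COMPRESSION = 4
SPATIAL_COMPRESSION = 8
PATCH_SIZE = 2


def describe_request(num_frames: int, height: int, width: int, fps: int) -> dict:
    """Sanity-check a request and report clip length and latent shape."""
    if (num_frames - 1) % TEMPORAL_COMPRESSION != 0:
        raise ValueError("num_frames should have the form 4k + 1")
    multiple = SPATIAL_COMPRESSION * PATCH_SIZE
    if height % multiple or width % multiple:
        raise ValueError(f"height and width should be multiples of {multiple}")
    return {
        "seconds": num_frames / fps,
        "latent_frames": (num_frames - 1) // TEMPORAL_COMPRESSION + 1,
        "latent_height": height // SPATIAL_COMPRESSION,
        "latent_width": width // SPATIAL_COMPRESSION,
    }


info = describe_request(num_frames=81, height=512, width=768, fps=25)
print(info)
```

Under these assumptions, the settings used on this card give a clip of about 3.24 seconds and a latent video of 21 frames at 64 by 96.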

Training logs are available on WandB here.

LoRA

We extracted a rank-64 LoRA from the fine-tuned checkpoint (script here). This LoRA can be used to reproduce a similar effect:

from diffusers import DiffusionPipeline
from diffusers.utils import export_to_video
import torch

# Load the unmodified base pipeline, then attach the extracted LoRA weights.
pipeline = DiffusionPipeline.from_pretrained("THUDM/CogVideoX-5b", torch_dtype=torch.bfloat16).to("cuda")
pipeline.load_lora_weights("finetrainers/cakeify-v0", weight_name="extracted_cakeify_lora_64.safetensors")

# The prompt opens with the PIKA_CAKEIFY trigger token, as in the example prompts.
prompt = """
PIKA_CAKEIFY On a gleaming glass display stand, a sleek black purse quietly commands attention. Suddenly, a knife appears and slices through the shoe, revealing a fluffy vanilla sponge at its core. Immediately, it turns into a hyper-realistic prop cake, delighting the senses with its playful juxtaposition of the everyday and the extraordinary.
"""
negative_prompt = "inconsistent motion, blurry motion, worse quality, degenerate outputs, deformed outputs"

video = pipeline(
    prompt=prompt,
    negative_prompt=negative_prompt,
    num_frames=81,
    height=512,
    width=768,
    num_inference_steps=50
).frames[0]
# Write the generated frames to an MP4 file at 25 frames per second.
export_to_video(video, "output_lora.mp4", fps=25)
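For readers curious how a LoRA can be extracted from a fully fine-tuned checkpoint, one common approach is a truncated SVD of the difference between the fine-tuned and base weights, applied layer by layer. The sketch below shows that idea on a toy matrix with NumPy. It is an illustration of the general technique and is not the extraction script referenced in this section, which may differ; `extract_lora` is a hypothetical name.

```python
import numpy as np


def extract_lora(w_base: np.ndarray, w_tuned: np.ndarray, rank: int):
    """Return (lora_up, lora_down) with lora_up @ lora_down ~= w_tuned - w_base."""
    delta = w_tuned - w_base
    u, s, vh = np.linalg.svd(delta, full_matrices=False)
    root = np.sqrt(s[:rank])               # split singular values across both factors
    lora_up = u[:, :rank] * root           # shape (out_features, rank)
    lora_down = root[:, None] * vh[:rank]  # shape (rank, in_features)
    return lora_up, lora_down


rng = np.random.default_rng(0)
w_base = rng.standard_normal((32, 48))
# Toy "fine-tune": the base weight plus an update of rank exactly 4.
w_tuned = w_base + rng.standard_normal((32, 4)) @ rng.standard_normal((4, 48))

up, down = extract_lora(w_base, w_tuned, rank=4)
error = np.abs(w_base + up @ down - w_tuned).max()
print(up.shape, down.shape, error)
```

In this toy case the update has rank 4, so a rank-4 extraction recovers it up to floating-point error. A real fine-tuning update is generally not low rank, so an extracted LoRA only approximates the fine-tuned checkpoint.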