
metadata

base_model: THUDM/CogVideoX-5b
datasets: finetrainers/cakeify-smol
library_name: diffusers
license: other
license_link: https://huggingface.co/THUDM/CogVideoX-5b/blob/main/LICENSE
instance_prompt: >-
  PIKA_DISSOLVE A pristine snowglobe featuring a winter scene sits peacefully.
  The globe violently explodes, sending glass, water, and glittering fake snow
  in all directions. The scene is captured with high-speed photography.
widget:
  - text: >-
      PIKA_DISSOLVE A meticulously detailed, tea cup, sits centrally on a dark
      brown circular pedestal. The cup, seemingly made of clay, begins to
      dissolve from the bottom up. The disintegration process is rapid but not
      explosive, with a cloud of fine, light tan dust forming and rising in a
      swirling, almost ethereal column that expands outwards before slowly
      descending. The dust particles are individually visible as they float, and
      the overall effect is one of delicate disintegration rather than
      shattering. Finally, only the empty pedestal and the intricately patterned
      marble floor remain.
    output:
      url: ./assets/output_cup.mp4
  - text: >-
      PIKA_DISSOLVE Resting quietly atop an ancient stone altar, a delicately
      carved wooden mask starts to crumble from its outer edges. The intricate
      patterns crack and give way, releasing a fine, smoke-like plume of
      mahogany-hued particles that dance upwards, then disperse gradually into
      the hushed atmosphere. As the dust descends, the once captivating mask is
      reduced to an outline on the weathered altar.
    output:
      url: ./assets/output_altar.mp4
  - text: >-
      PIKA_DISSOLVE A slender glass vase, brimming with tiny white pebbles,
      stands centered on a polished ebony dais. Without warning, the glass
      begins to dissolve from the edges inward. Wisps of translucent dust swirl
      upward in an elegant spiral, illuminating each pebble as they drop onto
      the dais. The gently drifting dust eventually settles, leaving only the
      scattered stones and faint traces of shimmering powder on the stage.
    output:
      url: ./assets/output_vase.mp4
  - text: >-
      PIKA_DISSOLVE On a narrow marble ledge, a gracefully folded paper crane
      rests, its surface marked by delicate ink lines. It starts to fragment
      from the tail feathers outward, releasing a cloud of feather-light pulp
      fibers. Suspended for a moment in a magical swirl, the fibers drift back
      down, cloaking the ledge in a near-transparent veil of white. Then the
      ledge stands empty, the crane’s faint silhouette lingering in memory.
    output:
      url: ./assets/output_marble.mp4
tags:
  - text-to-video
  - diffusers-training
  - diffusers
  - cogvideox
  - cogvideox-diffusers
  - template:sd-lora

Prompt: PIKA_CAKEIFY A blue soap is placed on a modern table. Suddenly, a knife appears and slices through the soap, revealing a cake inside. The soap turns into a hyper-realistic prop cake, showcasing the creative transformation of everyday objects into something unexpected and delightful.

Prompt: PIKA_CAKEIFY On a gleaming glass display stand, a sleek black purse quietly commands attention. Suddenly, a knife appears and slices through the shoe, revealing a fluffy vanilla sponge at its core. Immediately, it turns into a hyper-realistic prop cake, delighting the senses with its playful juxtaposition of the everyday and the extraordinary.

Prompt: PIKA_CAKEIFY A red tea cup is placed on a wooden surface. Suddenly, a knife appears and slices through the cup, revealing a cake inside. The cake turns into a hyper-realistic prop cake, showcasing the creative transformation of everyday objects into something unexpected and delightful.
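
The example prompts above share a pattern: the `PIKA_CAKEIFY` trigger token, an object resting on a surface, a knife cut, and the cake reveal. A minimal sketch of that pattern follows. `build_cakeify_prompt` and its parameters are hypothetical names used for illustration, not part of finetrainers or diffusers, and the template is inferred from the sample prompts rather than documented by the authors.

```python
# Hypothetical helper: assembles a prompt in the style of the samples above.
# The trigger token comes from the example prompts; the function itself is
# an illustration and is not provided by any library.
TRIGGER = "PIKA_CAKEIFY"


def build_cakeify_prompt(subject: str, surface: str, noun: str) -> str:
    """Return a prompt that starts with the trigger token.

    subject: the object with its article, e.g. "A blue soap"
    surface: where the object rests, e.g. "a modern table"
    noun:    short name used when referring back to the object, e.g. "soap"
    """
    return (
        f"{TRIGGER} {subject} is placed on {surface}. "
        f"Suddenly, a knife appears and slices through the {noun}, "
        "revealing a cake inside. "
        f"The {noun} turns into a hyper-realistic prop cake, showcasing the "
        "creative transformation of everyday objects into something "
        "unexpected and delightful."
    )


prompt = build_cakeify_prompt("A blue soap", "a modern table", "soap")
print(prompt)
```

Since the checkpoint is experimental, prompts that stay close to this wording are more likely to trigger the effect than freely phrased ones, though that is an assumption and not a tested claim.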

This is a fine-tune of the THUDM/CogVideoX-5b model on the finetrainers/cakeify-smol dataset.

Code: https://github.com/a-r-r-o-w/finetrainers

This is an experimental checkpoint and is known to generalize poorly.

Inference code:

import torch
from diffusers import CogVideoXTransformer3DModel, DiffusionPipeline
from diffusers.utils import export_to_video

# Load the fine-tuned transformer weights from this repository.
transformer = CogVideoXTransformer3DModel.from_pretrained(
    "finetrainers/cakeify-v0", torch_dtype=torch.bfloat16
)
# Build the base CogVideoX-5b pipeline, swapping in the fine-tuned transformer.
pipeline = DiffusionPipeline.from_pretrained(
    "THUDM/CogVideoX-5b", transformer=transformer, torch_dtype=torch.bfloat16
).to("cuda")

# The prompt starts with the PIKA_CAKEIFY trigger token.
prompt = """
PIKA_CAKEIFY On a gleaming glass display stand, a sleek black purse quietly commands attention. Suddenly, a knife appears and slices through the shoe, revealing a fluffy vanilla sponge at its core. Immediately, it turns into a hyper-realistic prop cake, delighting the senses with its playful juxtaposition of the everyday and the extraordinary.
"""
negative_prompt = "inconsistent motion, blurry motion, worse quality, degenerate outputs, deformed outputs"

# Generate 81 frames at 512x768 and take the first (only) video in the batch.
video = pipeline(
    prompt=prompt,
    negative_prompt=negative_prompt,
    num_frames=81,
    height=512,
    width=768,
    num_inference_steps=50,
).frames[0]
# Write the frames to an MP4 file at 25 frames per second.
export_to_video(video, "output_vase.mp4", fps=25)
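
To make the generation settings above more concrete, the sketch below works out the clip length and the latent grid the pipeline would operate on. The compression factors (4x temporal, 8x spatial) are assumptions about the CogVideoX VAE that this card does not state, and `latent_grid` and `clip_seconds` are hypothetical helper names, not diffusers functions.

```python
# Assumed CogVideoX VAE compression factors (not stated in this model card).
VAE_TEMPORAL_FACTOR = 4
VAE_SPATIAL_FACTOR = 8


def latent_grid(num_frames: int, height: int, width: int) -> tuple[int, int, int]:
    """Return (latent_frames, latent_height, latent_width) under the assumed factors."""
    if height % VAE_SPATIAL_FACTOR or width % VAE_SPATIAL_FACTOR:
        raise ValueError("height and width must be divisible by the spatial factor")
    # The first frame is kept as-is; the remaining frames are grouped by the temporal factor.
    latent_frames = (num_frames - 1) // VAE_TEMPORAL_FACTOR + 1
    return latent_frames, height // VAE_SPATIAL_FACTOR, width // VAE_SPATIAL_FACTOR


def clip_seconds(num_frames: int, fps: int) -> float:
    """Playback length of the exported video in seconds."""
    return num_frames / fps


# Values taken from the inference call above.
frames, lat_h, lat_w = latent_grid(num_frames=81, height=512, width=768)
duration = clip_seconds(num_frames=81, fps=25)
print(f"latent grid: {frames} x {lat_h} x {lat_w}, duration: {duration:.2f} s")
```

Under these assumptions, the 81-frame, 512x768 request maps to a 21 x 64 x 96 latent grid and plays for 3.24 seconds at 25 fps. Lowering `num_frames` or the resolution shrinks that grid, which is the usual first step when memory is tight.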

Training logs are available on WandB.