Apology and Notification
You may have already noticed and fixed it yourself, but I introduced a very simple yet fatal bug. My apologies.
- Wrong code (breaks CLIP)
k.replace("vae.", "").replace("model.diffusion_model.", "")\
.replace("text_encoders.clip_l.transformer.text_model.", "")\
.replace("text_encoders.t5xxl.transformer.", "")
- Correct code
k.replace("vae.", "").replace("model.diffusion_model.", "")\
.replace("text_encoders.clip_l.transformer.", "")\
.replace("text_encoders.t5xxl.transformer.", "")
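To make the difference concrete, here is a minimal sketch that applies both versions to one key. The sample key and the helper names `strip_wrong` and `strip_correct` are assumptions for illustration only; the point is that the wrong version also strips the `text_model.` prefix, which the CLIP text encoder presumably needs in order to match its own weight names.

```python
def strip_wrong(k: str) -> str:
    # Buggy version: also removes "text_model.", which belongs to the key itself.
    return (
        k.replace("vae.", "")
        .replace("model.diffusion_model.", "")
        .replace("text_encoders.clip_l.transformer.text_model.", "")
        .replace("text_encoders.t5xxl.transformer.", "")
    )


def strip_correct(k: str) -> str:
    # Fixed version: removes only the container prefix and keeps "text_model.".
    return (
        k.replace("vae.", "")
        .replace("model.diffusion_model.", "")
        .replace("text_encoders.clip_l.transformer.", "")
        .replace("text_encoders.t5xxl.transformer.", "")
    )


# Illustrative key name (an assumption, not copied from a specific checkpoint).
sample_key = (
    "text_encoders.clip_l.transformer.text_model."
    "embeddings.token_embedding.weight"
)

print(strip_wrong(sample_key))    # "text_model." prefix is lost
print(strip_correct(sample_key))  # "text_model." prefix is kept
```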
Also, the FLUX.1 model is too large to run in my local environment, so these results come from testing only on HF's free CPU Space, but I noticed some behavior that concerns me.
On some models, huggingface_hub.save_torch_state_dict freezes without logging an error or raising an exception.
It also happens only when saving the transformer (UNet).
I traced it with print(f""), which is a paleolithic method, and RAM, CPU, or disk usage does not appear to be the cause; I confirmed that everything works fine up to huggingface_hub.split_torch_state_dict_into_shards.
So it is probably failing in the internal safetensors.torch.save_model part.
I hope it is just a lack of hardware specs making it get stuck in some odd place...
If you have problems with save_pretrained, suspect this.
Specifically, I have seen this occur when saving the following model as torch.float8_e4m3fn.
https://huggingface.co/datasets/John6666/flux1-backup-202408/blob/main/theAraminta_flux1A1.safetensors
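For anyone trying to locate a silent hang like this, a slightly less paleolithic alternative to print tracing is the standard library's faulthandler module, which can dump every thread's stack if a call does not return within a timeout. This is only a sketch under assumptions: `run_with_watchdog` is a hypothetical helper name, and the commented usage assumes `state_dict` and `"out_dir"` exist in your own script.

```python
import faulthandler
import sys
import time


def run_with_watchdog(label, fn, *args, timeout_s=60.0, file=None, **kwargs):
    """Call fn(*args, **kwargs); if it runs longer than timeout_s,
    dump all thread stack traces to `file` every timeout_s seconds."""
    out = file if file is not None else sys.stderr
    start = time.monotonic()
    print(f"[start] {label}", flush=True)
    # The watchdog runs in a C-level thread, so it can still report
    # even when the main thread is stuck inside native code.
    faulthandler.dump_traceback_later(timeout_s, repeat=True, file=out)
    try:
        return fn(*args, **kwargs)
    finally:
        faulthandler.cancel_dump_traceback_later()
        elapsed = time.monotonic() - start
        print(f"[done] {label} after {elapsed:.1f}s", flush=True)


# Hypothetical usage (state_dict and "out_dir" are placeholders):
# from huggingface_hub import save_torch_state_dict
# run_with_watchdog("save transformer", save_torch_state_dict,
#                   state_dict, "out_dir", timeout_s=120)
```

If the call freezes, the repeated tracebacks on stderr show which Python frame it is waiting in, which should help distinguish a slow save from a true deadlock.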
Thanks for the discussion
Sorry to report this in an unrelated repo (or is it?) because of the urgency.
Thank you for your constant development.
P.S.
Regarding the above problem, I ran the same code experimentally in a Zero GPU Space (without any code changes and without using the GPU directly), and the problem did not occur.
I am relieved to know that it is not a bug in the library (and/or my code), but a lack of VM specs.
I did not see any difference in RAM usage (including page files), so the cause may be a difference in the VM's underlying performance apart from the GPU, or some of the load may be implicitly offloaded to the GPU or VRAM.
Anyway, sorry for the trouble.