How do you deal with the fact that different layers in ds have different data types? I'm trying to run the model on GPUs with 60 GB and need to use FSDP.