Adding `_set_gradient_checkpointing` for compatibility (#22) 8091327 gugarosa vriveras committed on Oct 17, 2023
fix(phi-1_5): Checks length of `attention_mask` if it is passed as a direct tensor. f9f2ac7 gugarosa committed on Sep 26, 2023