license: afl-3.0
A wider BabyBERTa model trained using curriculum learning and layer stacking for the BabyLM Challenge Strict-Small track.