NeoBERT is a next-generation encoder model for English text representation, pre-trained from scratch on the RefinedWeb dataset.