Gemstones: A Model Suite for Multi-Faceted Scaling Laws Paper • 2502.06857 • Published 12 days ago • 22
Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach Paper • 2502.05171 • Published 12 days ago • 108
Transformers Can Do Arithmetic with the Right Embeddings Paper • 2405.17399 • Published May 27, 2024 • 52
Baseline Defenses for Adversarial Attacks Against Aligned Language Models Paper • 2309.00614 • Published Sep 1, 2023 • 2
Rethinking LLM Memorization through the Lens of Adversarial Compression Paper • 2404.15146 • Published Apr 23, 2024
Spotting LLMs With Binoculars: Zero-Shot Detection of Machine-Generated Text Paper • 2401.12070 • Published Jan 22, 2024 • 44
NEFTune: Noisy Embeddings Improve Instruction Finetuning Paper • 2310.05914 • Published Oct 9, 2023 • 14