language-models (2)
Lost in Backpropagation: The LM Head is a Gradient Bottleneck (Mar 10, 2026)
Gaperon: A Peppered English-French Generative Language Model Suite (Oct 29, 2025)