The500Feed.Live

Everything going on in AI - updated daily from 500+ sources

← Back to The 500 Feed
📄 ResearchMay 26, 2026

More Expressive Feedforward Layers: Part I. Token-Adaptive Mixing of Activations

Feedforward network (FFN) layers account for a large fraction of parameters and nonlinear expressivity in Transformer-based large language models (LLMs). Despite the evolution from ReLU and GELU to gated variants such as SwiGLU, most FFN designs still use a single fixed activation function, applying...

Read Original Article →

Source

http://arxiv.org/abs/2605.26647v1