Grokked Transformers are Implicit Reasoners: A Mechanistic Journey to the Edge of Generalization
We study whether transformers can learn to implicitly reason over parametric knowledge, a skill that even the most capable language models struggle with.