[Miscellaneous] Show HN: Duplicate 3 layers in a 24B LLM, logical deduction 0.22 → 0.76. No training
I replicated David Ng's RYS method (https://dnhkng.github.io/posts/rys/) on consumer AMD GPUs (RX 7900 XT + RX 6950 XT) and found something I didn't expect. Transformers appear to have discrete "reasoning…
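For anyone who wants the layer-duplication idea in concrete terms, here is a minimal sketch in plain Python. It assumes the duplicated layers form one contiguous block that runs a second time right after its first pass, reusing the same weights. The function name `duplicate_block`, the toy 8-layer model, and the indices are all hypothetical and are not taken from the post or from the RYS write-up.

```python
def duplicate_block(layer_order, start, count):
    """Return a new execution order in which the `count` layers beginning at
    index `start` run twice in a row. Only the order changes; a repeated
    index means the same layer (same weights) is applied again."""
    if count <= 0 or start < 0 or start + count > len(layer_order):
        raise ValueError("block must lie inside the layer list")
    end = start + count
    block = layer_order[start:end]
    # Original layers up to the end of the block, then the block again,
    # then the remaining layers.
    return layer_order[:end] + block + layer_order[end:]


# Toy example: an 8-layer model with hypothetical layers 3..5 duplicated.
original = list(range(8))
modified = duplicate_block(original, start=3, count=3)
print(modified)  # [0, 1, 2, 3, 4, 5, 3, 4, 5, 6, 7]
```

In a real model the list entries would be the decoder layer modules rather than integers, and the forward pass would simply iterate over the new order. No weights are updated, which matches the "no training" claim in the title.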