โ„ฮตsam (@Hesamation)

2025-08-09 | ❤️ 715 | 🔁 106


The perfect weekend article just dropped by @rasbt on gpt-oss architectures.

It’s wild to see how fast these architectures have leveled up since GPT-2 in 2019.

abs positional emb → RoPE
GELU MLP → Swish + SwiGLU
one dense feed-forward → MoE
Multi-head attention → GQA
… https://x.com/Hesamation/status/1954237350512619868/photo/1
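
To make the last item concrete, here is a minimal sketch of the idea behind grouped-query attention (GQA): several query heads share one key/value head. The function name and the head counts are illustrative assumptions, not taken from the article or from any gpt-oss code.

```python
def kv_head_for_query_head(q_head: int, n_q_heads: int, n_kv_heads: int) -> int:
    """Return the index of the key/value head that query head `q_head` reads from.

    Hypothetical helper for illustration only. With n_kv_heads == n_q_heads this
    is plain multi-head attention; with fewer KV heads, consecutive query heads
    share one KV head (grouped-query attention).
    """
    if n_q_heads % n_kv_heads != 0:
        raise ValueError("n_q_heads must be divisible by n_kv_heads")
    group_size = n_q_heads // n_kv_heads  # query heads per shared KV head
    return q_head // group_size


# Illustrative sizes (assumed, not from the article): 8 query heads, 2 KV heads.
mapping = [kv_head_for_query_head(h, 8, 2) for h in range(8)]
print(mapping)  # [0, 0, 0, 0, 1, 1, 1, 1]
```

In this toy setup only 2 sets of keys and values need to be cached instead of 8, which is the memory saving GQA is usually chosen for.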

Media

image


Tags

LLM