[Interactive cost calculator; DeepSeek-V3.2 settings: k (candidates) = 2048, H_I (indexer heads) = 64, d_I (indexer dim) = 128. Readouts for C_attn, C_idx, C_idx / C_attn, and savings at 128K were not captured.]
[Chart, linear or log scale, with four curves:]
Vanilla MLA: C_attn · L
DSA: C_idx · L + C_attn · k
Indexer: C_idx · L
Sparse attn: C_attn · k
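
The four cost expressions above can be evaluated directly. The following is a minimal sketch, not the calculator's actual code: the function name `dsa_cost_terms`, the dictionary keys, and the sample values `c_attn=1.0` and `c_idx=0.1` are all hypothetical placeholders, since the source gives no numeric values for C_attn or C_idx. Taking 128K to mean 128 × 1024 tokens and defining savings as 1 − DSA / vanilla are also assumptions.

```python
def dsa_cost_terms(L, k=2048, c_attn=1.0, c_idx=0.1):
    """Evaluate the four cost curves for context length L.

    c_attn and c_idx are hypothetical placeholder costs (arbitrary units);
    the source does not state their values.
    """
    indexer = c_idx * L          # Indexer: C_idx * L
    sparse_attn = c_attn * k     # Sparse attn: C_attn * k
    return {
        "vanilla_mla": c_attn * L,        # Vanilla MLA: C_attn * L
        "indexer": indexer,
        "sparse_attn": sparse_attn,
        "dsa": indexer + sparse_attn,     # DSA: C_idx * L + C_attn * k
    }


def savings(costs):
    """Fraction of vanilla MLA cost avoided by DSA (assumed definition)."""
    return 1.0 - costs["dsa"] / costs["vanilla_mla"]


if __name__ == "__main__":
    L_128k = 128 * 1024  # assumption: 128K means 131072 tokens
    costs = dsa_cost_terms(L_128k)
    for name, value in costs.items():
        print(f"{name:12s} {value:12.1f}")
    print(f"savings      {savings(costs):12.4f}")
```

Under these formulas the DSA curve is still linear in L, but with slope C_idx instead of C_attn, so the savings depend mainly on the ratio C_idx / C_attn once L is much larger than k.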