SubQ Claims 12 Million Token Context with 52x Speed Gain Over FlashAttention
A new sparse-attention architecture called SubQ promises the first fully sub-quadratic frontier model, processing 12 million tokens at a fraction of the compute cost.
Subscribe to unlock all stories
Get full access to The Singularity Ledger, archive included.
Cancel anytime. Payments powered by Stripe.