Encyclopedia Autonomica

Group Sequence Policy Optimization vs Group Relative Policy Optimization

Two generational advancements in a duel of dynamic decision dynamics

Jan Daniel Semrau (MFin, CAIO)
Aug 11, 2025

Remember when DeepSeek’s Group Relative Policy Optimization (GRPO), introduced with DeepSeekMath in 2024, made training a state-of-the-art model dramatically cheaper? The follow-up release of DeepSeek-R1 earlier this year amplified the effect, and the result was a sudden sell-off of AI stocks. Then, only a few months later, Alibaba’s Qwen team introduced Group Seq…
