Encyclopedia Autonomica
Paper Review | Orca 2: Teaching Small Language Models How to Reason

Why did the reasoning expert become a philosopher? Because they wanted to elevate their logical deductions to the heights of abstract contemplation, one 'why' at a time!

Jan Daniel Semrau (MFin, CAIO)
Nov 30, 2023

For most of us, running inference on a large language model locally is a pipe dream.

Still.

I agree with Yann LeCun that the road to running high-quality language models locally runs through improving smaller (7B, 13B) open-source models and making them available to developers and researchers.

Yet most smaller models, for obvious reasons, do not have the same emergent capabilities, such as zero-shot reasoning, that large models do. The Orca 2 paper outlines how reasoning capabilities can be improved in smaller models.

Goal: How can we teach smaller LMs to reason better? Is it possible to teach smaller models a suite of reasoning techniques, and to discern when to apply which technique?

Problem: Imitation learning from larger models does not improve reasoning capabilities in smaller models. Orca 1 showed inferior reasoning and understanding capabilities in comparison with GPT-3.5.

Solution: Improve Orca 2's reasoning and deduction capabilities by training it with an expanded, highly tailored synthetic dataset. Introduce cautious reasoning: slow thinking through a step-by-step deduction process to identify an optimal solution.
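The cautious-reasoning idea can be sketched in a few lines. The Orca 2 paper pairs a detailed, strategy-specific instruction for the teacher with a generic instruction for the student (the paper calls this prompt erasure), so the student learns *when* to think step by step rather than just copying the teacher's style. The function and prompt strings below are illustrative assumptions, not the paper's exact wording or code:

```python
# Sketch of cautious reasoning via prompt erasure (illustrative, not the
# paper's actual implementation). The teacher answers under a detailed,
# slow-thinking instruction; the student is trained on that answer but
# sees only a generic instruction, so it must internalize the strategy.

CAUTIOUS_SYSTEM_PROMPT = (
    "You are a careful assistant. Think through the problem step by step, "
    "then state your final answer."
)
GENERIC_SYSTEM_PROMPT = "You are a helpful assistant."

def make_training_example(question: str, teacher_answer: str,
                          erase: bool = True) -> dict:
    """Pair a question with the teacher's slow-thinking answer.

    With erase=True, the tailored instruction is swapped for a generic one
    (prompt erasure), so the student learns to choose the strategy itself.
    """
    system = GENERIC_SYSTEM_PROMPT if erase else CAUTIOUS_SYSTEM_PROMPT
    return {"system": system, "user": question, "assistant": teacher_answer}

example = make_training_example(
    "If a train travels 60 km in 40 minutes, what is its speed in km/h?",
    "Step 1: 40 minutes is 2/3 of an hour. Step 2: 60 / (2/3) = 90. "
    "Answer: 90 km/h.",
)
```

The student never sees the cautious instruction at training time, yet its target answers all exhibit the slow-thinking behavior that instruction elicited from the teacher.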

Opinion: The paper itself can only be fully understood in relation to several other papers. Reading them is worthwhile, though, because it allows for a deeper understanding of the techniques the authors use to develop reasoning skills in smaller language models. That in itself makes it interesting. The creation of synthetic data is a plus.

Ollama, Paper, Web
