Microsoft AI introduces Orca, a 13-billion-parameter model that learns to imitate the reasoning process of LFMs.
The remarkable zero-shot capabilities of large foundation models (LFMs) have raised a question: can they supervise themselves or other models without human intervention? Microsoft researchers have developed Orca, a 13-billion-parameter model that learns from complex explanation traces and step-by-step thought processes. According to the researchers, this approach outperforms existing state-of-the-art instruction-tuned models by a significant margin, while addressing challenges such as task diversity, query complexity, and data scaling.
The researchers note that query-and-response pairs from GPT-4 can serve as valuable guidance for student models. They augment these pairs with detailed explanations that expose the reasoning the teacher used to arrive at its answers. Training on these explanation traces helps Orca improve its reasoning and comprehension, narrowing the gap between teacher and student models.
To further strengthen Orca’s learning, the team draws on the Flan 2022 collection. They sample tasks from this extensive collection to ensure a varied mix of challenges, then sub-sample within those tasks to produce complex prompts that serve as queries to LFMs. The result is a rich, diverse training set that lets Orca learn a wide range of tasks.
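The sampling and pairing steps described above could look roughly like the following Python sketch. It is an illustration only, not the team's actual pipeline: the toy task collection, the `stub_teacher` function (a stand-in for querying an LFM such as GPT-4), and all other names here are hypothetical.

```python
import random
from dataclasses import dataclass


@dataclass
class TrainingExample:
    task: str
    query: str
    response: str  # teacher answer, including its explanation trace


def sample_queries(collection, tasks_to_pick, queries_per_task, seed=0):
    """Pick a subset of tasks, then sub-sample queries within each chosen task."""
    rng = random.Random(seed)
    task_names = sorted(collection)
    chosen = rng.sample(task_names, k=min(tasks_to_pick, len(task_names)))
    picked = []
    for task in chosen:
        queries = collection[task]
        k = min(queries_per_task, len(queries))
        picked.extend((task, q) for q in rng.sample(queries, k=k))
    return picked


def build_training_set(collection, teacher, tasks_to_pick, queries_per_task, seed=0):
    """Pair every sampled query with the teacher's explained answer."""
    pairs = sample_queries(collection, tasks_to_pick, queries_per_task, seed)
    return [TrainingExample(task, q, teacher(q)) for task, q in pairs]


def stub_teacher(query):
    # Hypothetical placeholder: a real pipeline would call the teacher model here.
    return f"Step-by-step explanation for: {query}"


if __name__ == "__main__":
    # Toy stand-in for a large task collection (assumed shape: task -> queries).
    toy_collection = {
        "arithmetic": ["What is 2 + 2?", "What is 3 * 3?", "What is 10 - 4?"],
        "entailment": ["Does A imply B?", "Does B imply C?"],
        "summarization": ["Summarize this paragraph."],
    }
    for example in build_training_set(toy_collection, stub_teacher, 2, 2):
        print(example.task, "|", example.query, "|", example.response)
```

Fixing the random seed keeps the sampled subset reproducible, which makes it easier to compare student models trained on the same data.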