Models · OpenAI · May 2020

8. Language Models are Few-Shot Learners (GPT-3)

The model that made the world pay attention

Summary

Introduced GPT-3, a 175B-parameter model that demonstrated remarkable few-shot learning — performing tasks from just a few examples in the prompt, without any gradient updates — fundamentally changing what people expected language models could do.

Key Concepts

Large enough models learn new tasks from just a few prompt examples — no fine-tuning needed

GPT-3's defining contribution wasn't just better benchmarks — it was the discovery that sufficiently large language models can learn new tasks from just a handful of examples provided in the prompt. No fine-tuning, no gradient updates. Just show it a few input-output pairs and it figures out the pattern.
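A minimal sketch of what this looks like in practice, assuming the Hugging Face transformers library. Since GPT-3's weights were never released, the small open gpt2 model stands in here; the prompt format, not the completion quality, is the point:

```python
# Minimal sketch of few-shot in-context learning: the "training data"
# lives entirely in the prompt, and the model's weights never change.
# gpt2 is a small open stand-in (GPT-3's weights were never released).
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

# Three input-output demonstrations, then a new input for the model
# to complete by pattern-matching in context.
prompt = (
    "Review: The plot was gripping. Sentiment: positive\n"
    "Review: I fell asleep twice. Sentiment: negative\n"
    "Review: A total waste of money. Sentiment: negative\n"
    "Review: Loved every minute of it. Sentiment:"
)

out = generator(prompt, max_new_tokens=2, do_sample=False)
print(out[0]["generated_text"].removeprefix(prompt).strip())
```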

Zero-shot, one-shot, and few-shot: three prompting paradigms

• Zero-shot: describe the task, no examples
• One-shot: one example in the prompt
• Few-shot: 2-64 examples in the prompt
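Concretely, only the number of demonstrations in front of the query changes. The sketch below spells out all three prompt formats, modeled on the paper's English-to-French translation example (the exact strings are illustrative):

```python
# The three prompting paradigms, shown as raw prompt text.
# Only the number of in-context examples changes; there is no fine-tuning.

zero_shot = (
    "Translate English to French.\n"
    "cheese =>"                       # task description only, no examples
)

one_shot = (
    "Translate English to French.\n"
    "sea otter => loutre de mer\n"    # exactly one demonstration
    "cheese =>"
)

few_shot = (
    "Translate English to French.\n"
    "sea otter => loutre de mer\n"    # several demonstrations,
    "peppermint => menthe poivrée\n"  # as many as fit in the context
    "plush giraffe => girafe en peluche\n"
    "cheese =>"
)

for name, prompt in [("zero-shot", zero_shot), ("one-shot", one_shot), ("few-shot", few_shot)]:
    print(f"--- {name} ---\n{prompt}\n")
```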

At 175B parameters, abilities appeared that smaller models simply didn't have

At 175B parameters, GPT-3 displayed capabilities that the smaller models in its family simply lacked: arithmetic, code generation, creative writing, and even rudimentary reasoning. These "emergent" abilities (capabilities that appear only once models are large enough) became a defining concept in AI.

Shift from open-source to paid API — the "Open" in OpenAI begins to erode

OpenAI released GPT-3 not as open-source but through a paid API — marking the shift from OpenAI's original open-source ethos to a commercial model.
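For context, here is a minimal sketch of what 2020-era access looked like, assuming the original, since-deprecated pre-1.0 openai Python SDK; treat the exact parameters as illustrative:

```python
# Sketch of 2020-era GPT-3 API access via the original (pre-1.0,
# since-deprecated) openai Python SDK. The weights stay on OpenAI's
# servers; customers only send prompts and receive completions.
import os
import openai

openai.api_key = os.environ["OPENAI_API_KEY"]

response = openai.Completion.create(
    engine="davinci",   # the largest GPT-3 engine available at launch
    prompt="Translate English to French.\ncheese =>",
    max_tokens=8,
    temperature=0,
    stop="\n",
)
print(response.choices[0].text.strip())
```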

Connections

Influenced by
7. Scaling Laws for Neural Language Models
Jan 2020
Influences
9. Learning to Summarize from Human Feedback
Sep 2020
10. Zero-Shot Text-to-Image Generation (DALL-E)
Jan 2021
11. Learning Transferable Visual Models (CLIP)
Feb 2021
12. Evaluating Large Language Models Trained on Code (Codex)
Aug 2021
15. Robust Speech Recognition via Large-Scale Weak Supervision (Whisper)
Sep 2022