The model that made the world pay attention
Research paper. Introduced GPT-3, a 175B-parameter model that demonstrated remarkable few-shot learning, performing tasks from just a few examples in the prompt without any gradient updates, and fundamentally changed what people expected language models could do.
GPT-3's defining contribution wasn't just better benchmarks — it was the discovery that sufficiently large language models can learn new tasks from just a handful of examples provided in the prompt. No fine-tuning, no gradient updates. Just show it a few input-output pairs and it figures out the pattern.
• Zero-shot: describe the task, no examples
• One-shot: one example in the prompt
• Few-shot: several examples in the prompt
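The three settings above differ only in how many worked examples precede the query. A minimal sketch of that idea, assuming a simple `Input:`/`Output:` prompt template (the function name and format are illustrative, not from the paper, and no model API is called):

```python
# In-context learning: the "training data" is just examples
# concatenated into the prompt; the model's weights never change.

def build_prompt(task_description, examples, query):
    """Assemble a prompt: task description, K worked input->output
    pairs, then the new input whose output the model should complete."""
    lines = [task_description, ""]
    for inp, out in examples:
        lines.append(f"Input: {inp}")
        lines.append(f"Output: {out}")
        lines.append("")
    lines.append(f"Input: {query}")
    lines.append("Output:")  # left open for the model to complete
    return "\n".join(lines)

# Few-shot: two translation examples (as in the paper's illustration),
# then a new word for the model to translate.
few_shot = build_prompt(
    "Translate English to French.",
    [("cheese", "fromage"), ("sea otter", "loutre de mer")],
    "peppermint",
)
print(few_shot)
```

Passing an empty example list gives the zero-shot setting and a single pair gives one-shot; the mechanism is identical, only the prompt grows.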
At 175B parameters, GPT-3 displayed capabilities that smaller models simply didn't have — arithmetic, code generation, creative writing, and even rudimentary reasoning. These "emergent" abilities (capabilities that appear at scale) became a defining concept in AI.
OpenAI released GPT-3 not as open-source software but through a paid API, marking the shift from OpenAI's original open-source ethos to a commercial model.