Models · OpenAI · Mar 2023

★18. GPT-4 Technical Report

State-of-the-art performance, unprecedented secrecy

Research Paper

Summary

Introduced GPT-4, a large multimodal model that accepts text and image inputs and produces text. It reached human-level performance on many professional and academic exams (around the 90th percentile on a simulated bar exam), while the report disclosed almost nothing about the model's architecture, training data, or size.

Key Concepts

First production model accepting both text and image inputs

GPT-4 accepted both text and image inputs (output was text only). This was OpenAI's first production multimodal model.

90th percentile on bar exam (vs. GPT-3.5's 10th), passed USMLE and AP exams

GPT-4 scored around the 90th percentile on a simulated bar exam (vs. roughly the 10th percentile for GPT-3.5), passed the USMLE, and achieved strong scores on the GRE, SAT, and AP exams.

Technical report revealed nothing about size, architecture, data, or training methods

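Beyond describing GPT-4 as a Transformer-style model pre-trained to predict the next token and then fine-tuned with reinforcement learning from human feedback (RLHF), the "Technical Report" withheld details of model size, architecture, training data, compute, hardware, and training methodology. OpenAI cited the competitive landscape and the safety implications of large-scale models.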

Extensive red-teaming including ARC evaluation of autonomous replication risk

The report and its accompanying system card described extensive red-teaming and safety mitigations, including work with external organizations such as the Alignment Research Center (ARC), which evaluated the model's capacity for autonomous replication.

Connections

Influenced by

16. ChatGPT: Optimizing Language Models for Dialogue

Nov 2022

Influences

19. Let's Verify Step by Step

May 2023

20. Introducing Superalignment

Jul 2023

21. GPT-4V(ision) System Card

Sep 2023

22. OpenAI DevDay 2023: GPT-4 Turbo, Custom GPTs, Assistants API

Nov 2023

25. Sora: Creating video from text

Feb 2024