Specifically, we train GPT-3, an autoregressive language model with 175 billion parameters, 10x more than any previous non-sparse language model, and test its performance in the few-shot setting. For all tasks, GPT-3 is applied without any gradient updates or fine-tuning, with tasks and few-shot demonstrations specified purely via text ... More @Wikipedia
Hover over any link to get a description of the article. Please note that search keywords are sometimes hidden within the full article and don't appear in the description or title.