| # | Keyword | Title | Link |
|---|---------|-------|------|
| 1 | GPT-1 | Improving Language Understanding by Generative Pre-Training | web |
| 2 | GPT-2 | Language Models are Unsupervised Multitask Learners | web |
| 3 | GPT-3 | Language Models are Few-Shot Learners | web |
| 4 | Codex | Evaluating Large Language Models Trained on Code | web |
| 5 | InstructGPT | Training language models to follow instructions with human feedback | web |
| 6 | GPT-4 | GPT-4 Technical Report | web |
| 7 | GPT-4 | GPT-4 System Card | web |
| 8 | Prompting | Pre-train, Prompt, and Predict: A Systematic Survey of Prompting Methods in Natural Language Processing | web |
| 9 | RLHF | Augmenting Reinforcement Learning with Human Feedback | web |
| 10 | In-context learning | What learning algorithm is in-context learning? | web |
| 11 | PPO | Proximal Policy Optimization Algorithms | web |
| 12 | TAMER | Interactively Shaping Agents via Human Reinforcement | web |
| 13 | GPT-4 | Sparks of Artificial General Intelligence: Early experiments with GPT-4 | web |