1 | GPT 1 | Improving Language Understanding by Generative Pre-Training | web |
2 | GPT 2 | Language Models are Unsupervised Multitask Learners | web |
3 | GPT 3 | Language Models are Few-Shot Learners | web |
4 | Codex | Evaluating Large Language Models Trained on Code | web |
5 | InstructGPT | Training language models to follow instructions with human feedback | web |
6 | prompt | Pre-train, Prompt, and Predict: A Systematic Survey of Prompting Methods in Natural Language Processing | web |
7 | RLHF | Augmenting Reinforcement Learning with Human Feedback | web |
8 | GPT 4 | GPT-4 Technical Report | web |
9 | GPT-4 | GPT-4 system card | web |
10 | context | What learning algorithm is in-context learning | web |
11 | ppo | Proximal Policy Optimization Algorithms | web |
12 | TAMER | Interactively Shaping Agents via Human Reinforcement | web |