Strive for the rise of the Chinese nation! Keep going, young friends! WeChat official account: 低调奋进
Paper
Survey
1 | A Survey of Large Language Models | web |
2 | Aligning Large Language Models with Human: A Survey | web |
3 | A Comprehensive Overview of Large Language Models | web |
4 | Large Language Models | web |
5 | A Survey on Evaluation of Large Language Models | web |
6 | Is Prompt All You Need? No. A Comprehensive and Broader View of Instruction Learning | web |
7 | Challenges and Applications of Large Language Models | web |
8 | A Survey on Model Compression for Large Language Models | web |
9 | How Can Recommender Systems Benefit from Large Language Models: A Survey | web |
10 | A Survey of Techniques for Optimizing Transformer Inference | web |
11 | Instruction Tuning for Large Language Models: A Survey | web |
12 | The Rise and Potential of Large Language Model Based Agents: A Survey | web |
13 | A Survey on Model Compression and Acceleration for Pretrained Language Models | web |
14 | Domain Specialization as the Key to Make Large Language Models Disruptive: A Comprehensive Survey | web |
15 | Explainability for Large Language Models: A Survey | web |
16 | Trustworthy LLMs: A Survey and Guideline for Evaluating Large Language Models' Alignment | web |
17 | Large Language Model Alignment: A Survey | web |
18 | Bias and Fairness in Large Language Models: A Survey | web |
19 | A Survey on Fairness in Large Language Models | web |
20 | A Survey on Deep Neural Network Pruning: Taxonomy, Comparison, Analysis, and Recommendations | web |
21 | Towards Better Chain-of-Thought Prompting Strategies: A Survey | web |
22 | A Survey of Large Language Models for Healthcare: from Data, Technology, and Applications to Accountability and Ethics | web |
23 | Augmenting LLMs with Knowledge: A survey on hallucination prevention | web |
24 | From Instructions to Intrinsic Human Values -- A Survey of Alignment Goals for Big Models | web |
25 | A Survey on Large Language Model based Autonomous Agents | web |
26 | Through the Lens of Core Competency: Survey on Evaluation of Large Language Models | web |
27 | Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback | web |
28 | A Survey on Hallucination in Large Language Models | web |
29 | Unifying Large Language Models and Knowledge Graphs: A Roadmap | web |
30 | Large Language Models for Information Retrieval: A Survey | web |
31 | Survey on Factuality in Large Language Models: Knowledge, Retrieval and Domain-Specificity | web |
Paper
1 | GPT-1 | Improving Language Understanding by Generative Pre-Training | web |
2 | GPT-2 | Language Models are Unsupervised Multitask Learners | web |
3 | GPT-3 | Language Models are Few-Shot Learners | web |
4 | Codex | Evaluating Large Language Models Trained on Code | web |
5 | InstructGPT | Training language models to follow instructions with human feedback | web |
6 | GPT-4 | GPT-4 Technical Report | web |
7 | GPT-4 | GPT-4 system card | web |
8 | prompt | Pre-train, Prompt, and Predict: A Systematic Survey of Prompting Methods in Natural Language Processing | web |
9 | RLHF | Augmenting Reinforcement Learning with Human Feedback | web |
10 | context | What learning algorithm is in-context learning? Investigations with linear models | web |
11 | ppo | Proximal Policy Optimization Algorithms | web |
12 | TAMER | Interactively Shaping Agents via Human Reinforcement | web |
13 | GPT-4 | Sparks of Artificial General Intelligence: Early experiments with GPT-4 | web |
14 | Continual Pre-Training of Large Language Models: How to (re)warm your model? | web | |
15 | Self-Alignment with Instruction Backtranslation | web | |
16 | Llama 2: Open Foundation and Fine-Tuned Chat Models | web | |
17 | The RefinedWeb Dataset for Falcon LLM | web | |
18 | D4: Improving LLM Pretraining via Document De-Duplication and Diversification | web | |
19 | Textbooks Are All You Need | web | |
20 | How to Protect Copyright Data in Optimization of Large Language Models? | web | |
21 | Baichuan 2: Open Large-scale Language Models | web | |
22 | LLaMA: Open and Efficient Foundation Language Models | web | |
23 | SlimPajama-DC: Understanding Data Combinations for LLM Training | web | |
24 | Qwen Technical Report | web | |
25 | LLM.int8(): 8-bit Matrix Multiplication for Transformers at Scale | web | |
26 | GLM: General Language Model Pretraining with Autoregressive Blank Infilling | web | |
27 | GLM-130B: An Open Bilingual Pre-trained Model | web | |
28 | PaLM 2 Technical Report | web | |
29 | OPT: Open Pre-trained Transformer Language Models | web | |
30 | BLOOM: A 176B-Parameter Open-Access Multilingual Language Model | web | |
31 | Skywork_13b | Skywork: A More Open Bilingual Foundation Model | web |
32 | BlueLM | BlueLM | web |
33 | OpenAI AGI plan | openai | |
Blog
1 | Thoughts on ChatGPT - 李理 (highly recommended) | web |
2 | The Post-GPT-3.0 Era: A Detailed Look at the Key Techniques behind Mainstream Large Models; the Door on the Road to AGI Is Open | web |
3 | Dissecting and Tracing the Origins of GPT-3.5's Capabilities | web |
4 | A Brief Introduction to Prompting Methods | web |
5 | Andrew Ng's Prompt Engineering course | web |
6 | A Survey of LLM Inference Performance Optimization | web |
7 | A 10,000-Word Survey: Progress and Potential of LLM-Driven Agents (LLM Agents), by Fudan + miHoYo | web |
8 | A 10,000-Word Survey: Instruction Tuning for Large Language Models | web |
9 | An Overview of Recent Advances in LLM Inference Optimization | web |
10 | Do Machine Learning Models Memorize or Generalize? | web |
11 | An Initial Exploration of Theoretical Support for Language Model Data Engineering. Part 1: Pretraining | web |
12 | 符尧 (Yao Fu): Stop Racing on Model Training, Race on Data Instead! | web |
OpenAI
1 | OpenAI | openai | web |
2 | OpenAI chat | chat | web |
3 | OpenAI platform | overview, documentation, examples, playground | web |
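For quick reference, a minimal sketch of calling the chat endpoint with the legacy (pre-1.0) openai Python SDK; the API key and model name below are placeholders, not part of the list above.

```python
# Minimal sketch: chat completion with the legacy (pre-1.0) openai Python SDK.
# The API key and model name are placeholders.
import openai

openai.api_key = "sk-..."  # set your own key from the platform page

response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Summarize what RLHF is in two sentences."},
    ],
    temperature=0.7,
)
print(response["choices"][0]["message"]["content"])
```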
GPT tools
1 | openai-cookbook | github |
2 | Azure OpenAI | github |
3 | go-openai | web |
4 | How to register for OpenAI | web |
5 | ChatPaper | github |
Code
1 | DeepSpeed | web | |
2 | Megatron-LM | web | |
3 | transformers | web | |
4 | Megatron-LLaMA | web | |
5 | Megatron-DeepSpeed | web | |
6 | ColossalAI | web | |
7 | BELLE | web | |
8 | FastChat | web | |
9 | langchain | web | |
10 | llama | web | |
11 | llama.cpp | web | |
12 | Chinese-LLaMA-Alpaca | web | |
13 | Llama2-Chinese | web | |
14 | TinyLlama | web | |
15 | vllm | web | |
16 | Firefly | web | |
17 | xformers | web | |
18 | flash-attention | web | |
19 | streaming-llm | web |
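As a quick orientation for the libraries above, a minimal text-generation sketch with Hugging Face transformers; the checkpoint id is only an illustrative assumption, and any causal LM from the list (LLaMA, Baichuan, Qwen, TinyLlama, ...) can be substituted.

```python
# Minimal generation sketch with Hugging Face transformers.
# The checkpoint id is an assumption; substitute any causal LM you have access to.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

inputs = tokenizer("Large language models are", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50, do_sample=True, temperature=0.8)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```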
Dataset
English
1 | RedPajama | 1T tokens | web |
2 | Pile | 825GiB | web |
3 | SlimPajama | 627B tokens | web |
4 | falcon-refinedweb | 1.68TB | web |
5 | BigScience Data | 300B | web |
6 | oscar | web | |
7 | openwebtext | web | |
8 | C4 | 305 GB | web |
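Most of the corpora above are hosted on the Hugging Face Hub and can be inspected via streaming without a full download; a hedged sketch follows (the dataset id and the `text` field are assumptions, check the linked dataset card for the exact names).

```python
# Stream a pre-training corpus from the Hugging Face Hub without downloading it fully.
# The dataset id and the "text" field name are assumptions; verify on the dataset card.
from datasets import load_dataset

ds = load_dataset("cerebras/SlimPajama-627B", split="train", streaming=True)

for i, example in enumerate(ds):
    print(example["text"][:200])
    if i >= 2:  # peek at the first few documents only
        break
```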
Code & Math
1 | starcoderdata | 250B tokens | web |
2 | MathGLM | 3G | web |
Chinese
1 | MNBVC | target 40T, continuously updated | web |
2 | CLUECorpus2020 | 100 GB of high-quality corpus | web |
3 | xuanyuan | 60 GB open-sourced, still being updated | web |
4 | wudao | 200 GB open-sourced | web |
5 | TigerBot | 100 GB open-sourced (51 GB English, 55 GB Chinese) | web |
6 | llm-dataset-chinese-poetry | web | |
7 | CC-100 | multilingual; the Chinese portion is 54 GB | web |
8 | 源1.0 (Yuan 1.0) | 1T open-sourced, application required | web |
9 | CBook-150k | web | |
10 | awesome-chinese-legal-resources | web | |
11 | chinese-poetry | web | |
12 | commoncrawl | web | |
13 | SkyPile-150B | 150B tokens | web |
Alignment (SFT & RLHF)
1 | COIG | web | |
2 | ShareGPT-Chinese-English-90k | web | |
3 | ShareGPT52K | web | |
4 | belle | 3.5M_CN | web |
5 | databricks-dolly-15k | web | |
6 | alpaca-gpt4 | web | |
7 | GPT-4-LLM | web | |
8 | CoT | web | |
9 | InstructionWild | web | |
10 | GuanacoDataset | web | |
11 | Huatuo-Llama-Med-Chinese | web | |
12 | OpenOrca | web | |
13 | LongForm | web | |
14 | code_instructions_120k_alpaca | web | |
15 | lima | web | |
16 | wizard_vicuna_70k | web | |
17 | wizard_vicuna_70k_unfiltered | web | |
18 | hh-rlhf | web | |
19 | full-hh-rlhf | web | |
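A hedged sketch of turning one of the SFT sets above into prompt/response pairs; the dataset id and field names follow databricks-dolly-15k as commonly documented and should be checked against the actual card.

```python
# Sketch: format an instruction-tuning dataset into prompt/response pairs for SFT.
# Dataset id and field names (instruction/context/response) are assumptions
# based on databricks-dolly-15k; verify against the dataset card.
from datasets import load_dataset

ds = load_dataset("databricks/databricks-dolly-15k", split="train")

def to_pair(example):
    prompt = example["instruction"]
    if example.get("context"):
        prompt += "\n\n" + example["context"]
    return {"prompt": prompt, "response": example["response"]}

pairs = ds.map(to_pair)
print(pairs[0]["prompt"][:200])
print(pairs[0]["response"][:200])
```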