Strive for the rise of the Chinese nation! Keep going, young one! WeChat official account: 低调奋进
Paper
Survey
| 1 | A Survey of Large Language Models | web |
| 2 | Aligning Large Language Models with Human: A Survey | web |
| 3 | A Comprehensive Overview of Large Language Models | web |
| 4 | Large Language Models | web |
| 5 | A Survey on Evaluation of Large Language Models | web |
| 6 | Is Prompt All You Need? No. A Comprehensive and Broader View of Instruction Learning | web |
| 7 | Challenges and Applications of Large Language Models | web |
| 8 | A Survey on Model Compression for Large Language Models | web |
| 9 | How Can Recommender Systems Benefit from Large Language Models: A Survey | web |
| 10 | A Survey of Techniques for Optimizing Transformer Inference | web |
| 11 | Instruction Tuning for Large Language Models: A Survey | web |
| 12 | The Rise and Potential of Large Language Model Based Agents: A Survey | web |
| 13 | A Survey on Model Compression and Acceleration for Pretrained Language Models | web |
| 14 | Domain Specialization as the Key to Make Large Language Models Disruptive: A Comprehensive Survey | web |
| 15 | Explainability for Large Language Models: A Survey | web |
| 16 | Trustworthy LLMs: A Survey and Guideline for Evaluating Large Language Models’ Alignment | web |
| 17 | Large Language Model Alignment: A Survey | web |
| 18 | Bias and Fairness in Large Language Models: A Survey | web |
| 19 | A Survey on Fairness in Large Language Models | web |
| 20 | A Survey on Deep Neural Network Pruning: Taxonomy, Comparison, Analysis, and Recommendations | web |
| 21 | Towards Better Chain-of-Thought Prompting Strategies: A Survey | web |
| 22 | A Survey of Large Language Models for Healthcare: from Data, Technology, and Applications to Accountability and Ethics | web |
| 23 | Augmenting LLMs with Knowledge: A survey on hallucination prevention | web |
| 24 | From Instructions to Intrinsic Human Values -- A Survey of Alignment Goals for Big Models | web |
| 25 | A Survey on Large Language Model based Autonomous Agents | web |
| 26 | Through the Lens of Core Competency: Survey on Evaluation of Large Language Models | web |
| 27 | Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback | web |
| 28 | A Survey on Hallucination in Large Language Models | web |
| 29 | Unifying Large Language Models and Knowledge Graphs: A Roadmap | web |
| 30 | Large Language Models for Information Retrieval: A Survey | web |
| 31 | Survey on Factuality in Large Language Models: Knowledge, Retrieval and Domain-Specificity | web |
Paper
| 1 | GPT 1 | Improving Language Understanding by Generative Pre-Training | web |
| 2 | GPT 2 | Language Models are Unsupervised Multitask Learners | web |
| 3 | GPT 3 | Language Models are Few-Shot Learners | web |
| 4 | Codex | Evaluating Large Language Models Trained on Code | web |
| 5 | InstructGPT | Training language models to follow instructions with human feedback | web |
| 6 | GPT-4 | GPT-4 Technical Report | web |
| 7 | GPT-4 | GPT-4 system card | web |
| 8 | prompt | Pre-train, Prompt, and Predict: A Systematic Survey of Prompting Methods in Natural Language Processing | web |
| 9 | RLHF | Augmenting Reinforcement Learning with Human Feedback | web |
| 10 | context | What learning algorithm is in-context learning? | web |
| 11 | ppo | Proximal Policy Optimization Algorithms | web |
| 12 | TAMER | Interactively Shaping Agents via Human Reinforcement | web |
| 13 | GPT-4 | Sparks of Artificial General Intelligence: Early experiments with GPT-4 | web |
| 14 | Continual Pre-Training of Large Language Models: How to (re)warm your model? | web | |
| 15 | Self-Alignment with Instruction Backtranslation | web | |
| 16 | Llama 2: Open Foundation and Fine-Tuned Chat Models | web | |
| 17 | The RefinedWeb Dataset for Falcon LLM | web | |
| 18 | D4: Improving LLM Pretraining via Document De-Duplication and Diversification | web | |
| 19 | Textbooks Are All You Need | web | |
| 20 | How to Protect Copyright Data in Optimization of Large Language Models? | web | |
| 21 | Baichuan 2: Open Large-scale Language Models | web | |
| 22 | LLaMA: Open and Efficient Foundation Language Models | web | |
| 23 | SlimPajama-DC: Understanding Data Combinations for LLM Training | web | |
| 24 | Qwen Technical Report | web | |
| 25 | LLM.int8(): 8-bit Matrix Multiplication for Transformers at Scale | web | |
| 26 | GLM: General Language Model Pretraining with Autoregressive Blank Infilling | web | |
| 27 | GLM-130B: An Open Bilingual Pre-trained Model | web | |
| 28 | PaLM 2 Technical Report | web | |
| 29 | OPT: Open Pre-trained Transformer Language Models | web | |
| 30 | BLOOM: A 176B-Parameter Open-Access Multilingual Language Model | web | |
| 31 | Skywork_13b | Skywork: A More Open Bilingual Foundation Model | web |
| 32 | BlueLM | BlueLM | web |
| 33 | OpenAI AGI plan | openai | |
Blog
| 1 | Reflections on ChatGPT - 李理 (highly recommended) | web |
| 2 | The Post-GPT-3.0 Era: Key Techniques of Mainstream Large Models Explained; the Door on the Road to AGI Is Open | web |
| 3 | Dissecting and Tracing the Origins of GPT-3.5's Capabilities | web |
| 4 | A Brief Introduction to Prompt Methods | web |
| 5 | Andrew Ng: Prompt Engineering | web |
| 6 | A Survey of LLM Inference Performance Optimization | web |
| 7 | A 10,000-Word Survey: Progress and Potential of LLM-Driven Agents (LLM Agent), by Fudan + miHoYo | web |
| 8 | A 10,000-Word Survey: Instruction Tuning for Large Language Models | web |
| 9 | An Overview of Progress in LLM Inference Optimization Techniques | web |
| 10 | Do Machine Learning Models Memorize or Generalize? | web |
| 11 | An Initial Exploration of Theoretical Support for Language Model Data Engineering. Part 1: Pretraining | web |
| 12 | Yao Fu (符尧): Stop Competing on Large-Model Training, Compete on Data Instead! | web |
OpenAI
| 1 | openai | openai | web |
| 2 | openai chat | chat | web |
| 3 | openai platform | overview, documentation, examples, playground | web |
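As a quick companion to the platform links above, here is a minimal sketch of one Chat Completions request with the official OpenAI Python SDK (v1+ client style). The model name `gpt-3.5-turbo` and the prompt are placeholders, and the snippet assumes an `OPENAI_API_KEY` environment variable is set; the platform documentation linked above remains the authoritative reference.

```python
# Minimal sketch: one Chat Completions call with the official OpenAI Python SDK (>= 1.0).
# Assumes OPENAI_API_KEY is set in the environment; the model id is only an example.
from openai import OpenAI

client = OpenAI()  # picks up OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-3.5-turbo",  # example model id
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Summarize RLHF in one sentence."},
    ],
    temperature=0.7,
)

print(response.choices[0].message.content)
```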
GPT tools
| 1 | openai-cookbook | github |
| 2 | Azure OpenAI | github |
| 3 | go-openai | web |
| 4 | How to register an OpenAI account | web |
| 5 | ChatPaper | github |
Code
| 1 | DeepSpeed | web | |
| 2 | Megatron-LM | web | |
| 3 | transformers | web | |
| 4 | Megatron-LLaMA | web | |
| 5 | Megatron-DeepSpeed | web | |
| 6 | ColossalAI | web | |
| 7 | BELLE | web | |
| 8 | FastChat | web | |
| 9 | langchain | web | |
| 10 | llama | web | |
| 11 | llama.cpp | web | |
| 12 | Chinese-LLaMA-Alpaca | web | |
| 13 | Llama2-Chinese | web | |
| 14 | TinyLlama | web | |
| 15 | vllm | web | |
| 16 | Firefly | web | |
| 17 | xformers | web | |
| 18 | flash-attention | web | |
| 19 | streaming-llm | web | |
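To illustrate how the libraries in the table above are typically used, here is a minimal text-generation sketch with `transformers` (entry 3). The checkpoint id `meta-llama/Llama-2-7b-chat-hf` is only an example (it is gated on the Hugging Face Hub), and `device_map="auto"` additionally assumes `accelerate` is installed; any causal LM checkpoint can be substituted.

```python
# Minimal sketch: load a causal LM with transformers and generate text.
# The checkpoint id is an example; substitute any model you have access to.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-2-7b-chat-hf"  # example, gated on the HF Hub
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # half precision to fit a single GPU
    device_map="auto",          # requires accelerate; places layers automatically
)

prompt = "Explain instruction tuning in one paragraph."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```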
Dataset
English
| 1 | RedPajama | 1T tokens | web |
| 2 | Pile | 825GiB | web |
| 3 | SlimPajama | 627B tokens | web |
| 4 | falcon-refinedweb | 1.68TB | web |
| 5 | BigScience Data | 300B | web |
| 6 | oscar | web | |
| 7 | openwebtext | web | |
| 8 | C4 | 305GB | web |
Code & Math
| 1 | starcoderdata | 250B tokens | web |
| 2 | MathGLM | 3G | web |
Chinese
| 1 | MNBVC | target of 40TB, continuously growing | web |
| 2 | CLUECorpus2020 | 100GB of high-quality corpus | web |
| 3 | xuanyuan | 60GB open-sourced, still being updated | web |
| 4 | wudao | 200GB open-sourced | web |
| 5 | TigerBot | 100GB open-sourced (51GB English, 55GB Chinese) | web |
| 6 | llm-dataset-chinese-poetry | web | |
| 7 | CC-100 | multilingual; the Chinese portion is 54GB | web |
| 8 | 源1.0 (Yuan 1.0) | 1TB open-sourced, application required | web |
| 9 | CBook-150k | web | |
| 10 | awesome-chinese-legal-resources | web | |
| 11 | chinese-poetry | web | |
| 12 | commoncrawl | web | |
| 13 | SkyPile-150B | 150B tokens | web |
Alignment (SFT & RLHF)
| 1 | COIG | web | |
| 2 | ShareGPT-Chinese-English-90k | web | |
| 3 | ShareGPT52K | web | |
| 4 | belle | 3.5M_CN | web |
| 5 | databricks-dolly-15k | web | |
| 6 | alpaca-gpt4 | web | |
| 7 | GPT-4-LLM | web | |
| 8 | CoT | web | |
| 9 | InstructionWild | web | |
| 10 | GuanacoDataset | web | |
| 11 | Huatuo-Llama-Med-Chinese | web | |
| 12 | OpenOrca | web | |
| 13 | LongForm | web | |
| 14 | code_instructions_120k_alpaca | web | |
| 15 | lima | web | |
| 16 | wizard_vicuna_70k | web | |
| 17 | wizard_vicuna_70k_unfiltered | web | |
| 18 | hh-rlhf | web | |
| 19 | full-hh-rlhf | web | |
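Most of the alignment sets above are mirrored on the Hugging Face Hub, so a minimal loading sketch with the `datasets` library (not listed above, but the usual companion to `transformers`) might look like the following; the id `Anthropic/hh-rlhf` is assumed to be the Hub mirror of the hh-rlhf entry, and the other sets in the table have their own ids.

```python
# Minimal sketch: pull one of the preference datasets above from the Hugging Face Hub.
# "Anthropic/hh-rlhf" is an example id for the hh-rlhf entry; adjust for other sets.
from datasets import load_dataset

dataset = load_dataset("Anthropic/hh-rlhf", split="train")
print(dataset)                      # columns are "chosen" / "rejected" response pairs
print(dataset[0]["chosen"][:200])   # peek at one preferred conversation
```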