Summer School Content

Tsinghua University, Beijing


Learn about Large Language Models in the summer of 2024

Scroll down to find out more, or click here to apply now.

Course Description

Prerequisites: college-level calculus and linear algebra; familiarity with Python


Minlie Huang: Pretrained Language Models

Large pretrained language models such as ChatGPT, GPT-4, and ChatGLM have become enormously popular in recent years, and artificial general intelligence is edging into our daily lives. In this lecture, we will present the model architectures, training algorithms, and pretraining philosophy behind these large language models. Specifically, we will introduce the transformer architecture in detail, and discuss how to pretrain models, perform supervised fine-tuning, and align models with human feedback. Furthermore, we will present how some typical models (for instance, GPT and GLM) were built. Finally, we will discuss some interesting applications such as typical generation tasks, multimodal understanding and generation, and even code generation.
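As a taste of the material, below is a minimal NumPy sketch of scaled dot-product self-attention, the core operation of the transformer architecture this lecture covers in detail. Dimensions and weights are illustrative, not taken from any particular model.

```python
# A minimal sketch of scaled dot-product self-attention. Real models add
# multiple heads, masking, residual connections, layer normalization,
# and feed-forward sublayers on top of this core operation.
import numpy as np

def self_attention(x, w_q, w_k, w_v):
    """x: (seq_len, d_model); w_q/w_k/w_v: (d_model, d_head)."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v             # queries, keys, values
    scores = q @ k.T / np.sqrt(k.shape[-1])         # pairwise token similarity
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over key positions
    return weights @ v                              # weighted sum of values

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))                         # 4 tokens, d_model = 8
out = self_attention(x, *(rng.normal(size=(8, 8)) for _ in range(3)))
print(out.shape)                                    # (4, 8)
```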


Qingyao Ai: Pretrained Language Modeling for Legal Intelligence

Legal Intelligence, which aims to empower legal systems with AI techniques, has attracted considerable attention lately due to its potential benefits and impacts on human society. Pretrained language models, such as BERT and GPT, have emerged as powerful techniques that have driven a transformative revolution in fields such as Machine Learning, Natural Language Processing, and Information Retrieval. Despite this general success, how to apply pretrained language modeling methods to the legal domain, with its strong knowledge requirements and high demands for reliability, remains an open question. In this talk, we will briefly introduce the unique challenges and opportunities of applying pretrained language models to legal problems such as legal document retrieval, summarization, and judgment prediction. We will summarize recent studies on this topic, focusing on three important directions: (1) the design of legal-specific pretraining tasks for legal language model construction; (2) the adaptation of general large foundation models to legal domains; and (3) the evaluation of large language models' abilities in legal tasks.
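To make direction (1) concrete, here is a toy sketch of one possible legal-specific pretraining idea: masking legal terminology more aggressively in a masked-language-modeling objective so the model is forced to learn domain vocabulary. The term list and masking rates are hypothetical illustrations, not a recipe from the talk.

```python
# Toy illustration: bias the MLM objective toward legal terminology.
# LEGAL_TERMS and the masking rates are hypothetical placeholders.
import random

LEGAL_TERMS = {"plaintiff", "defendant", "tort", "statute", "injunction"}

def mask_tokens(tokens, legal_rate=0.5, base_rate=0.15):
    """Mask legal terms more aggressively than ordinary tokens."""
    masked, labels = [], []
    for tok in tokens:
        rate = legal_rate if tok.lower() in LEGAL_TERMS else base_rate
        if random.random() < rate:
            masked.append("[MASK]")
            labels.append(tok)        # the model must recover this token
        else:
            masked.append(tok)
            labels.append(None)       # not part of the loss
    return masked, labels

print(mask_tokens("The plaintiff filed a motion under the statute".split()))
```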


Zhiyuan Liu: Neural Networks and Deep Learning

This lecture gives a general introduction to neural networks and deep learning. First, a brief history of deep learning will be sketched. Second, an overview of deep learning models and applications will be provided. Third, related math and machine learning basics will be reviewed. Finally, the roadmap of the summer school will be introduced.
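For readers who want a concrete anchor for these basics, the sketch below trains a tiny two-layer network on XOR with plain NumPy, showing forward propagation, backpropagation, and gradient descent end to end. The architecture and hyperparameters are arbitrary illustrative choices.

```python
# A minimal two-layer neural network trained on XOR with plain NumPy.
import numpy as np

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

W1, b1 = rng.normal(size=(2, 8)), np.zeros(8)
W2, b2 = rng.normal(size=(8, 1)), np.zeros(1)
sigmoid = lambda z: 1 / (1 + np.exp(-z))

for step in range(5000):
    h = np.tanh(X @ W1 + b1)             # hidden layer
    p = sigmoid(h @ W2 + b2)             # predicted probability
    # Backpropagation of the squared-error loss via the chain rule.
    dp = (p - y) * p * (1 - p)
    dW2, db2 = h.T @ dp, dp.sum(0)
    dh = dp @ W2.T * (1 - h ** 2)
    dW1, db1 = X.T @ dh, dh.sum(0)
    for param, grad in ((W1, dW1), (b1, db1), (W2, dW2), (b2, db2)):
        param -= 0.1 * grad              # plain gradient descent
print(p.round(2).ravel())                # approaches [0, 1, 1, 0]
```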


Zhiyuan Liu: Multi-Agent Collective AI Powered by LLMs

LLMs can communicate in natural language not only with humans but also with each other, acting as AI agents. This capability may significantly facilitate the development of multi-agent AI. This lecture introduces recent advances in multi-agent collective AI powered by LLMs, its applications in software development and other areas, and the remaining challenges in improving the effectiveness and efficiency of LLM agents.
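As a minimal illustration of agents conversing in natural language, the sketch below wires two role-prompted agents into a propose-and-critique loop. The `llm` function, roles, and prompts are hypothetical placeholders for any chat-model API, not a real multi-agent framework.

```python
# A minimal sketch of two LLM agents collaborating in natural language.
# `llm(system, history)` is a hypothetical stand-in for a chat-model API.
def llm(system_prompt: str, history: list[str]) -> str:
    raise NotImplementedError("plug in your preferred chat-model API here")

def run_dialogue(task: str, rounds: int = 3) -> list[str]:
    coder = "You are a programmer. Write or revise code for the task."
    reviewer = "You are a reviewer. Point out bugs and suggest fixes."
    history = [f"Task: {task}"]
    for _ in range(rounds):
        history.append("Coder: " + llm(coder, history))        # propose
        history.append("Reviewer: " + llm(reviewer, history))  # critique
    return history  # the transcript doubles as the agents' shared memory
```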


Zaiqing Nie: Biomedicine

Foundation models (FMs) have exhibited remarkable performance across a wide range of downstream tasks in many domains. Nevertheless, general-purpose FMs often face challenges when confronted with domain-specific problems, due to their limited access to the proprietary training data in a particular domain. In biomedicine, there are various biological modalities, such as molecules, proteins, and cells, which are encoded by the language of life and exhibit significant modality gaps with human natural language. In this lecture, I will introduce BioMedGPT, an open multimodal generative pre-trained transformer (GPT) for biomedicine, to bridge the gap between the language of life and human natural language. BioMedGPT aligns different biological modalities with natural language via a large generative language model, and thus allows users to easily “communicate” with diverse biological modalities through free text. BioMedGPT enables users to upload biological data of molecular structures and protein sequences and pose natural language queries about these data instances. This capability can potentially accelerate the discovery of novel molecular structures and protein functionalities, thus catalyzing advancements in drug development.
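As a rough illustration of how such modality alignment can work in general (this is a common recipe, not necessarily BioMedGPT's actual design), the sketch below projects an embedding from a frozen biological encoder into a language model's embedding space as a few "soft tokens" that can be prepended to a text prompt. All module names and dimensions are hypothetical.

```python
# A highly simplified sketch of aligning a biological modality with a
# language model via a learned projection. Illustrative only; not
# BioMedGPT's actual architecture.
import torch
import torch.nn as nn

class ModalityAdapter(nn.Module):
    def __init__(self, bio_dim: int, lm_dim: int, n_tokens: int = 8):
        super().__init__()
        # Map one encoder vector to a few "soft tokens" for the LM.
        self.proj = nn.Linear(bio_dim, lm_dim * n_tokens)
        self.n_tokens, self.lm_dim = n_tokens, lm_dim

    def forward(self, bio_embedding: torch.Tensor) -> torch.Tensor:
        # (batch, bio_dim) -> (batch, n_tokens, lm_dim)
        return self.proj(bio_embedding).view(-1, self.n_tokens, self.lm_dim)

adapter = ModalityAdapter(bio_dim=512, lm_dim=4096)
protein_vec = torch.randn(1, 512)        # from a frozen protein encoder
soft_tokens = adapter(protein_vec)       # prepend to the text embeddings
print(soft_tokens.shape)                 # torch.Size([1, 8, 4096])
```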


Jun Zhu: Multimodal Generation with Diffusion Models

Diffusion probabilistic models have shown great promise in multimodal generation, including visual and auditory data. In this lecture, I will present diffusion models for multimodal generation, covering the basic principles of diffusion models, fast inference algorithms, a large-scale pretrained foundation model called UniDiffuser, and efficient 3D content generation and video generation built on UniDiffuser.
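For orientation, the sketch below shows the two equations at the heart of the basic (DDPM-style) principles the lecture covers: the closed-form forward noising step and the standard noise-prediction training loss. The schedule values and shapes are illustrative defaults.

```python
# Minimal sketch of the DDPM forward process and training objective.
import numpy as np

T = 1000
betas = np.linspace(1e-4, 0.02, T)           # noise schedule
alphas_bar = np.cumprod(1.0 - betas)         # cumulative product, a_bar_t

def forward_noise(x0, t, rng):
    """Sample x_t ~ q(x_t | x_0) = N(sqrt(a_bar_t) x0, (1 - a_bar_t) I)."""
    eps = rng.normal(size=x0.shape)
    xt = np.sqrt(alphas_bar[t]) * x0 + np.sqrt(1 - alphas_bar[t]) * eps
    return xt, eps

def training_loss(model, x0, rng):
    """Simple DDPM objective: predict the noise that was added."""
    t = rng.integers(T)
    xt, eps = forward_noise(x0, t, rng)
    return np.mean((model(xt, t) - eps) ** 2)   # ||eps_theta - eps||^2
```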


Wenguang Chen: Distributed Systems for Large Foundation Model Training and Inference

Training large foundation models is challenging: it requires careful tradeoffs among parallelism, communication traffic, memory usage, and recomputation mechanisms. In addition, training on hundreds or thousands of GPUs/NPUs demands effective fault-tolerance support. On the other hand, large-model inference needs techniques such as model quantization and request batching to reduce cost and minimize response time. In this lecture, we will discuss these system perspectives of large foundation models.
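As one concrete example of the inference-side techniques mentioned above, here is a minimal sketch of symmetric int8 weight quantization. Production systems add per-channel scales, calibration, and fused kernels; this shows only the core idea of trading precision for memory and bandwidth.

```python
# Minimal symmetric int8 weight quantization sketch.
import numpy as np

def quantize_int8(w):
    """Map float weights to int8 plus one float scale per tensor."""
    scale = np.abs(w).max() / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

w = np.random.default_rng(0).normal(size=(4, 4)).astype(np.float32)
q, s = quantize_int8(w)
print(np.abs(w - dequantize(q, s)).max())    # small reconstruction error
```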


Huaping Liu: Embodied Intelligence Powered by Large Language Models

Embodied intelligence emphasizes that intelligence arises from the synergy of brain, body, and environment: it is continuously and dynamically generated through information perception and physical interaction as the body engages with its surroundings. In recent years, the scope of embodied intelligence has been expanding. At the same time, large language models (LLMs) have received renewed attention, and leveraging LLMs for robotic applications has attracted great interest over the past several years. In particular, the emergence of GPT and other LLMs has brought many new paradigms for robotic applications, and these attempts demonstrate the strong capability of LLMs for robotics. Although these technologies may bring new ideas and opportunities for applying embodied intelligence, many of its key challenges remain unsolved. In this course, we aim to comprehensively analyze the connotation and extension of embodied intelligence. Some key works on morphology, action, perception, and learning in the literature are also highlighted. We further focus on the correlations across these aspects and identify areas where future research can benefit from the intrinsic connections between them.


Guoliang Li: Data Science for LLM

LLMs have widespread applications and have revolutionized many industries, but they suffer from several challenges. First, sufficient high-quality training data is essential for producing a well-performing model, but such data is expensive to acquire in terms of human effort. Second, the large amount of training data and complicated model structures make training and inference inefficient. Fortunately, database techniques can benefit LLMs by addressing these challenges. In this talk, I will review existing studies along the ML pipeline from the following aspects. (1) Data preparation (Pre-ML): preparing high-quality training data that can improve the performance of the ML model, where we review data discovery, data cleaning, and data labeling. (2) Model training & inference (In-ML): researchers in the ML community focus on improving model performance during training, while in this talk I mainly discuss how to accelerate the entire training process, including feature selection and model selection.
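As a small illustration of the Pre-ML stage, the sketch below performs exact deduplication and a crude length filter over a document collection. The heuristics are illustrative placeholders for the data discovery, cleaning, and labeling techniques reviewed in the talk.

```python
# Minimal data-preparation sketch: exact dedup plus a length filter.
import hashlib

def clean_corpus(docs, min_words=20):
    seen, kept = set(), []
    for doc in docs:
        text = " ".join(doc.split())                  # normalize whitespace
        h = hashlib.sha256(text.lower().encode()).hexdigest()
        if h in seen:                                 # drop exact duplicates
            continue
        seen.add(h)
        if len(text.split()) >= min_words:            # drop very short docs
            kept.append(text)
    return kept
```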