Summer School Content

Tsinghua University, Beijing


Learn about Generative AI in the summer of 2025

Scroll down to find out more, or click here to apply now.

Course Description

Prerequisites: Calculus and linear algebra at the college level; familiarity with Python


Jie Tang: The Road to AGI

Large language models have substantially advanced the state of the art in various AI tasks, such as natural language understanding, text generation, image processing, and multimodal modeling. In this lecture, I will first introduce the development of AI over the past decades, in particular from China's perspective. We will also talk about the opportunities, challenges, and risks of AGI in the future. In the second part of the talk, we will use GLM, an open-sourced alternative to ChatGPT, as an example to explain the understanding and insights we derived while implementing the model.


Xiaolin Hu: Basics of Deep Learning

Deep learning is the foundation of generative AI. In this lecture, we are going to study basic models, including multilayer perceptrons, convolutional neural networks, and transformers, on which generative models such as diffusion models and LLMs are built. We will introduce basic training techniques, including stochastic gradient descent, momentum, and learning rate annealing. The assignment is about language translation: students are required to train a translation model on a given dataset.
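As a taste of the optimization techniques listed above, here is a minimal sketch of stochastic gradient descent with momentum and step-decay learning rate annealing on a toy least-squares problem; the hyperparameters and the problem itself are illustrative assumptions, not the course's assignment code.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(256, 10))          # toy inputs
w_true = rng.normal(size=10)
y = X @ w_true + 0.1 * rng.normal(size=256)

w = np.zeros(10)                         # model parameters
v = np.zeros(10)                         # momentum buffer
lr, momentum = 0.1, 0.9

for epoch in range(50):
    lr_t = lr * (0.5 ** (epoch // 20))   # annealing: halve the learning rate every 20 epochs
    idx = rng.permutation(len(X))
    for start in range(0, len(X), 32):   # mini-batches of size 32
        b = idx[start:start + 32]
        grad = 2 * X[b].T @ (X[b] @ w - y[b]) / len(b)   # gradient of mean squared error
        v = momentum * v - lr_t * grad   # momentum update
        w = w + v

print("final MSE:", float(np.mean((X @ w - y) ** 2)))
```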


Minlie Huang: Deep Learning for Natural Language Processing

In this talk, Prof. Minlie Huang will discuss language modeling and neural models for language. Starting from statistical language models, he will introduce neural language modeling, word vectors, and pretrained language models. He will also introduce models tailored for language modeling, including recursive autoencoders, recurrent neural networks, and transformers; these models are also suitable for general sequence modeling. Topics include: language modeling and word vectors; autoencoders and recursive autoencoders for language; recurrent neural networks; convolutional neural networks; and transformer basics.
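As a small illustration of the statistical language models the talk starts from, the sketch below builds a count-based bigram model with add-one smoothing over a made-up corpus; the corpus and vocabulary are fabricated for the example and are not the lecture's material.

```python
from collections import Counter

# Toy count-based bigram language model with add-one (Laplace) smoothing.
corpus = "the cat sat on the mat . the dog sat on the rug .".split()
vocab = sorted(set(corpus))
bigrams = Counter(zip(corpus, corpus[1:]))
unigrams = Counter(corpus)

def p_next(word, prev):
    """Estimate P(word | prev) with add-one smoothing."""
    return (bigrams[(prev, word)] + 1) / (unigrams[prev] + len(vocab))

print(p_next("cat", "the"), p_next("dog", "the"))
```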


Yuxiao Dong: Introduction to Large Language Models and Multimodal Models

Large Language Models (LLMs) represent transformative advancements in artificial intelligence, reshaping how machines understand and generate human language. This lecture will provide a comprehensive introduction to these technologies, beginning with the foundational concepts and historical context that have led to their rapid evolution. First, the session will cover key aspects of LLMs, such as scaling laws (compute, data, and model size), pretraining objectives (e.g., GPT, BERT, GLM), and inference-time scaling (e.g., OpenAI o1, DeepSeek R1). Second, we will explore the ongoing shift from language models to visual language models (VLMs). We will address visual-language alignment, exploring how multimodal training allows models to bridge the semantic gap between textual and visual information. We will examine state-of-the-art methods for LLMs/VLMs, discussing training techniques, datasets, and evaluation methods. Finally, we will talk about how LLMs and VLMs are used in real-world applications. Participants will learn about the strengths and limitations of these models, helping to start discussions about their future uses and ethical concerns in AI research.
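To make the "compute, data, and model size" side of scaling laws concrete, the snippet below applies the common rule of thumb that training compute is roughly 6 * N * D FLOPs for N parameters and D training tokens; the model and token counts are hypothetical.

```python
# Rough training-compute estimate: C ~ 6 * N * D FLOPs
# (N = number of parameters, D = number of training tokens).

def training_flops(n_params: float, n_tokens: float) -> float:
    return 6.0 * n_params * n_tokens

n_params = 7e9    # a hypothetical 7B-parameter model
n_tokens = 2e12   # trained on a hypothetical 2T tokens
print(f"~{training_flops(n_params, n_tokens):.2e} FLOPs")   # ~8.4e22 FLOPs
```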


Jun Zhu: Generative AI Basics

This lecture will cover the basic principles and classical methods of generative AI. In particular, we will talk about the probabilistic approach to estimating distributions from finite samples in a high-dimensional space, namely probabilistic machine learning, including classical models with proper assumptions and efficient parameter estimation algorithms. These classical models (e.g., naïve Bayes, mixtures of experts, state-space models) and algorithms (e.g., MLE, EM, variational inference, MCMC) will lay the foundation for subsequent studies of advanced generative AI.
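As one concrete example of the classical estimation algorithms mentioned above, the following minimal sketch runs EM on a two-component one-dimensional Gaussian mixture; the synthetic data and initialization are fabricated purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
x = np.concatenate([rng.normal(-2, 1, 300), rng.normal(3, 0.5, 200)])

pi = np.array([0.5, 0.5])        # mixture weights
mu = np.array([-1.0, 1.0])       # component means
sigma = np.array([1.0, 1.0])     # component standard deviations

for _ in range(100):
    # E-step: posterior responsibility of each component for each point
    dens = np.exp(-0.5 * ((x[:, None] - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))
    resp = pi * dens
    resp /= resp.sum(axis=1, keepdims=True)
    # M-step: re-estimate weights, means, and standard deviations
    nk = resp.sum(axis=0)
    pi = nk / len(x)
    mu = (resp * x[:, None]).sum(axis=0) / nk
    sigma = np.sqrt((resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk)

print("weights:", pi, "means:", mu, "stds:", sigma)
```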


Jun Zhu: Diffusion Models

This lecture will talk about diffusion models. We will start with the basic principles of diffusion processes and present computational models, particularly for image and video data. Then, we will talk about efficient algorithms for solving diffusion processes (i.e., diffusion ODEs/SDEs), as well as scalable architectures that combine diffusion with transformers (known as diffusion transformers). Finally, we will talk about several example diffusion models for generating various types of high-dimensional data, including images, videos, 3D content, and audio.
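For a concrete anchor, the sketch below implements only the closed-form forward (noising) process of a DDPM-style diffusion model, x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps, with a linear beta schedule; the schedule and the toy "image" are illustrative assumptions, and the learned reverse process is omitted.

```python
import numpy as np

T = 1000
betas = np.linspace(1e-4, 0.02, T)      # linear noise schedule
alphas_bar = np.cumprod(1.0 - betas)    # cumulative product of (1 - beta_t)

rng = np.random.default_rng(0)
x0 = rng.uniform(-1, 1, size=(8, 8))    # toy 8x8 "image" in [-1, 1]

def q_sample(x0, t):
    """Sample x_t ~ q(x_t | x_0) in one shot, without iterating over steps."""
    eps = rng.normal(size=x0.shape)
    return np.sqrt(alphas_bar[t]) * x0 + np.sqrt(1 - alphas_bar[t]) * eps

for t in (0, 250, 999):
    xt = q_sample(x0, t)
    print(t, "signal fraction:", round(float(np.sqrt(alphas_bar[t])), 3),
          "sample std:", round(float(xt.std()), 3))
```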


Zhiyuan Liu: LLM Agent

LLMs are capable of mastering general knowledge from big data and following human instructions. In the future, we want LLMs to take actions autonomously in the real world, i.e., LLM agents interacting with the world. This requires LLMs to master up-to-date domain knowledge, tools, and workflows, and to collaborate with each other. In this talk, we will introduce the techniques behind LLM agents and give an outlook on the open challenges.


Guoliang Li: Data Science for LLM

In this talk, we discuss the challenges of data systems for LLMs. The LLM life cycle includes pretraining (and incremental pretraining), fine-tuning (SFT and RLHF), prompting, RAG, and LLM agents. We first present how to prepare high-quality data for pretraining and fine-tuning, including data discovery, selection, cleaning, augmentation, labeling, mixing, and synthesis. We then discuss techniques and systems to improve the performance of pretraining and fine-tuning.
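As a toy illustration of two of the preparation steps above (cleaning via exact deduplication and selection via a simple length filter), consider the sketch below; the documents are made up, and real pretraining pipelines use far more sophisticated, often fuzzy, methods.

```python
import hashlib

docs = [
    "Large language models need high quality data.",
    "Large language models need high quality data.",   # exact duplicate
    "short",                                            # too short, filtered out
    "Data cleaning, selection, and mixing all affect downstream model quality.",
]

seen, kept = set(), []
for d in docs:
    h = hashlib.sha256(d.encode()).hexdigest()   # content hash for exact dedup
    if h in seen or len(d.split()) < 5:          # drop duplicates and very short docs
        continue
    seen.add(h)
    kept.append(d)

print(len(kept), "documents kept out of", len(docs))
```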


Wenguang Chen: Distributed Systems for Large Foundation Model Training and Inference

Training large foundation models is challenging: it requires careful tradeoffs among parallelism, communication traffic, memory usage, and recomputation mechanisms. In addition, training on hundreds or thousands of GPUs/NPUs demands effective fault-tolerance support. On the other hand, large-model inference needs techniques such as model quantization and request batching to reduce cost and minimize response time. Popular MoE (mixture-of-experts) models pose further challenges for both LLM training and inference. In this lecture, we will discuss these systems perspectives of large foundation models.
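A back-of-the-envelope calculation helps explain why quantization matters for inference cost: the sketch below estimates the weight memory of a hypothetical 70B-parameter model at different precisions, ignoring the KV cache, activations, and framework overhead.

```python
# Weight memory only: parameters * bytes per parameter.
def weight_memory_gb(n_params: float, bytes_per_param: float) -> float:
    return n_params * bytes_per_param / 1e9

n_params = 70e9   # a hypothetical 70B-parameter model
for name, bytes_per in [("fp16", 2), ("int8", 1), ("int4", 0.5)]:
    print(f"{name}: ~{weight_memory_gb(n_params, bytes_per):.0f} GB of weights")
```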


Yang Liu: Generative AI for Medicine

The past several years have witnessed the rapid development of generative artificial intelligence (AI), which can produce text, images, videos, and other forms of data using big models. Offering previously unseen capabilities, generative AI has shown great potential to revolutionize medicine in many aspects, such as accurate diagnosis, personalized treatment plans, and personal health management. In this talk, I will introduce recent advances in generative AI for medicine, such as medical big models and autonomous agents powered by large language models, with an emphasis on work in China. The talk closes with a discussion of future directions in generative AI for medicine.


Qingyao Ai: Large Language Model for Legal Intelligence

Legal Intelligence, which aims to empower legal systems with AI techniques, has attracted considerable attention lately due to its potential benefits and impacts on human society. Large language models, such as GPT and GLM, have emerged as powerful techniques that have driven a transformative revolution in many fields, such as Machine Learning, Natural Language Processing, and Information Retrieval. Despite their general success, how to apply these techniques to legal domains, with their strong knowledge requirements and high demands for reliability, remains an open question. In this talk, we will briefly introduce the unique challenges and opportunities of applying large language models to legal problems such as legal document retrieval, summarization, and judgment prediction. We will summarize recent studies on this topic, focusing on three important directions: (1) the design of legal-specific pretraining tasks for legal language model construction; (2) the adaptation of general large foundation models to legal domains; and (3) the evaluation of large language models' abilities in legal tasks.