Large pretrained language models such as ChatGPT, GPT-4, and ChatGLM have become very popular in recent years, and artificial general intelligence is approaching our daily life. In this lecture, we will present the model architectures, training algorithms, and pretraining ideas behind these large language models. Specifically, we will introduce the transformer structure in detail, and discuss how to pretrain models, perform supervised fine-tuning, and align models with human feedback. Furthermore, we will present how some typical models (for instance, GPT and GLM) were built. Finally, we will discuss some interesting applications such as typical generation tasks, multi-modality understanding and generation, and even code generation.
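To make the transformer structure concrete, the following is a minimal sketch of scaled dot-product attention, the core operation introduced in this lecture; the array shapes and random inputs are illustrative assumptions only and are not tied to any particular model.

import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax over the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    # Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V
    d_k = Q.shape[-1]
    scores = Q @ K.transpose(0, 2, 1) / np.sqrt(d_k)
    return softmax(scores) @ V

rng = np.random.default_rng(0)
Q = rng.standard_normal((1, 5, 64))   # (batch, sequence length, head dimension)
K = rng.standard_normal((1, 5, 64))
V = rng.standard_normal((1, 5, 64))
print(attention(Q, K, V).shape)       # -> (1, 5, 64)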
Legal Intelligence, which aims to empower legal systems with AI techniques, has attracted considerable attention lately due to its potential benefits and impacts on human society. Pretrained language models, such as BERT and GPT, have emerged as powerful techniques that have driven a transformative revolution in many fields such as Machine Learning, Natural Language Processing, and Information Retrieval. Despite their general success, how to apply pretrained language modeling methods to the legal domain, which demands strong background knowledge and high reliability, is still an unsolved question. In this talk, we will briefly introduce the unique challenges and opportunities of applying pretrained language models to legal problems such as legal document retrieval, summarization, judgment prediction, etc. We will summarize recent studies on this topic, and focus on the discussion of three important directions: (1) the design of legal-specific pretraining tasks for legal language model construction; (2) the adaptation of general large foundation models to legal domains; and (3) the evaluation of large language models' abilities in legal tasks.
This lecture gives a general introduction to neural networks and deep learning. First, a brief history of deep learning will be sketched. Second, an overview of deep learning models and applications will be provided. Third, related math and machine learning basics will be reviewed. Finally, the roadmap of the summer school will be introduced.
LLMs need to be further adapted to complete specific tasks. This talk will introduce advanced techniques for LLM adaptation, including supervised fine-tuning, reinforcement learning from human feedback, and retrieval-augmented generation. Furthermore, this talk will also introduce recent advances in autonomous agents powered by LLMs and multi-agent collective AI, as well as their applications in software development and other areas.
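As a concrete illustration of one of these adaptation techniques, below is a minimal, self-contained sketch of retrieval-augmented generation; the toy bag-of-words retriever, the example documents, and the prompt format are assumptions made for illustration, not the specific methods presented in this talk.

from collections import Counter

# Toy in-memory corpus standing in for an external knowledge source.
documents = [
    "The transformer architecture relies on self-attention.",
    "Supervised fine-tuning adapts a pretrained model with labeled examples.",
    "RLHF optimizes a policy against a learned reward model.",
]

def overlap(query: str, doc: str) -> int:
    # Toy relevance score: word-count overlap between query and document.
    q, d = Counter(query.lower().split()), Counter(doc.lower().split())
    return sum((q & d).values())

def retrieve(query: str, k: int = 1) -> list:
    return sorted(documents, key=lambda d: overlap(query, d), reverse=True)[:k]

def build_prompt(query: str) -> str:
    # Prepend retrieved context so the LLM can ground its answer in it.
    context = "\n".join(retrieve(query))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

# The resulting prompt would then be sent to an LLM for generation.
print(build_prompt("How does supervised fine-tuning adapt a model?"))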
Foundation models (FMs) have exhibited remarkable performance across a wide range of downstream tasks in many domains. Nevertheless, general-purpose FMs often face challenges when confronted with domain-specific problems, due to their limited access to the proprietary training data in a particular domain. In biomedicine, there are various biological modalities, such as molecules, proteins, and cells, which are encoded by the language of life and exhibit significant modality gaps with human natural language. In this lecture, I will introduce BioMedGPT, an open multimodal generative pre-trained transformer (GPT) for biomedicine, to bridge the gap between the language of life and human natural language. BioMedGPT aligns different biological modalities with natural language via a large generative language model, and thus allows users to easily “communicate” with diverse biological modalities through free text. BioMedGPT enables users to upload biological data of molecular structures and protein sequences and pose natural language queries about these data instances. This capability can potentially accelerate the discovery of novel molecular structures and protein functionalities, thus catalyzing advancements in drug development.
Diffusion probabilistic models have shown great promise for multimodal generation, including visual and auditory data. In this lecture, I will present diffusion models for multimodal generation, covering the basic principles of diffusion models, fast inference algorithms, a large-scale pretrained foundation model, UniDiffuser, as well as efficient 3D content generation and video generation built on UniDiffuser.
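For readers unfamiliar with diffusion models, the snippet below sketches the forward (noising) process that the basic principles build on: x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * noise, where a denoising network is trained to predict the added noise. The linear schedule and array shapes are illustrative assumptions, not UniDiffuser's actual configuration.

import numpy as np

T = 1000
betas = np.linspace(1e-4, 0.02, T)      # linear noise schedule (illustrative)
alphas_bar = np.cumprod(1.0 - betas)    # cumulative signal retention per step

def add_noise(x0, t, rng):
    # Sample x_t from q(x_t | x_0) in closed form.
    noise = rng.standard_normal(x0.shape)
    xt = np.sqrt(alphas_bar[t]) * x0 + np.sqrt(1.0 - alphas_bar[t]) * noise
    return xt, noise                    # the network's training target is `noise`

rng = np.random.default_rng(0)
x0 = np.ones((4, 4))                    # stand-in for an image or latent
xt, eps = add_noise(x0, t=500, rng=rng)
print(xt.mean(), eps.std())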
Training large foundation models is challenging: it requires careful tradeoffs among parallelism, communication traffic, memory usage, and recomputation mechanisms. In addition, training on hundreds or thousands of GPUs/NPUs also demands effective fault-tolerance support. On the other hand, large-model inference needs techniques such as model quantization and request batching to reduce cost and minimize response time. In this lecture, we will discuss these systems perspectives of large foundation models.
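As one concrete example of the inference-cost techniques mentioned above, here is a minimal sketch of post-training int8 weight quantization using a common per-tensor absmax scaling; the scheme is an illustrative assumption, not the implementation of any particular framework.

import numpy as np

def quantize_int8(w):
    # Map the largest-magnitude weight to +/-127 and round the rest.
    scale = np.abs(w).max() / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

w = np.random.default_rng(0).standard_normal((256, 256)).astype(np.float32)
q, s = quantize_int8(w)
print("int8 weights use 4x less memory; max abs error:", np.abs(w - dequantize(q, s)).max())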
Embodied intelligence emphasizes that intelligence arises from the synergy of brain, body, and environment, and that it is continuously and dynamically generated through information perception and physical interaction as the body engages with the environment. In recent years, the scope of embodied intelligence has also been expanding. Meanwhile, Large Language Models (LLMs) have received renewed attention, and leveraging LLMs for robotic applications has attracted great interest over the past several years. In particular, the emergence of GPT and other LLMs has brought many new paradigms for robotic applications, and these attempts demonstrate the strong capability of LLMs for robotics. Although the introduction of these technologies may bring new ideas and opportunities for the application of embodied intelligence, many of its key challenges have not yet been truly solved. In this course, we aim to comprehensively analyze the connotation and extension of embodied intelligence. Some key works across morphology, action, perception, and learning in the literature are also highlighted. We further focus on the correlation across these aspects and identify areas where future research can benefit from the intrinsic connections between them.
LLMs have widespread applications and have revolutionized many industries, but they face several challenges. First, sufficient high-quality training data is indispensable for producing a well-performing model, yet such data is expensive to acquire in terms of human effort. Second, the large amount of training data and complicated model structures make training and inference inefficient. Fortunately, database techniques can benefit LLMs by addressing these challenges. In this talk, I will review existing studies along the ML-related pipeline from the following aspects. (1) Data preparation (Pre-ML): it focuses on preparing high-quality training data that can improve the performance of the ML model, where we review data discovery, data cleaning, and data labeling. (2) Model training & inference (In-ML): researchers in the ML community focus on improving model performance during training, while in this talk I mainly discuss how to accelerate the entire training process, also including feature selection and model selection.
Large language models (LLMs) have significantly advanced the state of the art in AI. This lecture will give an introductory overview of LLMs. First, we will discuss the brief history of deep neural networks and their evolution into the Transformer architecture. We will then explore the paradigm shift from supervised learning to self-supervised pre-training, highlighting the Transformer as a fundamental support in this transition. Additionally, we will briefly introduce the pre-training and alignment techniques that lead to highly capable LLMs. Throughout this lecture, we will use the open ChatGLM models developed by Tsinghua and partners as a case study to illustrate the development of LLMs. Notably, ChatGLM is an open alternative to ChatGPT and has attracted over 13,000,000 downloads on Hugging Face. Finally, we will conclude this lecture by examining important and advanced directions in LLMs as well as vision language models, setting the stage for future lectures in the course.