Westlake Engineering Lecture Series No. 79 | Chunhua Shen 沈春华: An overview of recent work in large multimodal models: video generation and perception
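Time
Friday, November 29, 2024
10:00-11:30
Venue
E10-211, Yungu Campus, Westlake University
Host
Chi Zhang, Assistant Professor, School of Engineering, Westlake University
Audience
All faculty and students
Category
Academics and Research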
Time: 10:00-11:30, Friday, November 29, 2024
Venue: E10-211, Yungu Campus, Westlake University
Host: Dr. Chi Zhang, Assistant Professor, School of Engineering, Westlake University
Language: English
Speaker:
Prof. Chunhua Shen 沈春华
Qiushi Chair Professor
School of Computer Science and Technology
Zhejiang University
Biography:
Chunhua Shen is a Chair Professor at the College of Computer Science & Technology, Zhejiang University, a position he has held since 2022. Prior to this, he held various roles in Australia from 2002 to 2021, including: Principal Applied Scientist at Amazon Australia; Full Professor at the Australian Institute for Machine Learning, The University of Adelaide; Adjunct Professor at Monash University; Researcher at the Australian Centre for Robotic Vision; NICTA (National ICT Australia); and the Australian National University.
His research focuses on the intersection of computer vision and statistical machine learning. Professor Shen holds a PhD from The University of Adelaide and has also studied at the Australian National University (MPhil) and Nanjing University, China (BSc and MSc). Notable awards and honors include the Australian Research Council Future Fellowship (2012) and the Distinguished Professorship of the Changjiang Scholars Programme (2021). His Google Scholar citation count is ~80,000, with an H-index of 128.
Abstract:
In this talk, I will give an overview of some of my recent work in the area of large multimodal models. In particular, I am interested in video generation and multi-modality perception. First, we propose a novel hierarchical framework that integrates the strengths of autoregressive models with diffusion-based rendering to pioneer long-duration video generation with intricate plot progressions and high visual fidelity. Second, we propose a method termed Framer for interactive frame interpolation, which targets producing smoothly transitioning frames between two images as per user creativity. I will also briefly discuss some relevant work we did in multi-modal understanding.
Contact:
符丁文
fudingwen@westlake.edu.cn