
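Lecture title: Qian Xuesen Lecture Series (钱学森大讲堂), Session 85: Microsoft AI Lecture Series
Lecture date: 2019-04-15
Speakers: Chong Luo (罗翀) / Jifeng Dai (代季峰) / David Wipf
Campus: Xingqing Campus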
Lecture content:

Lecture 1
Title: Visual Object Tracking and Related Research
Time: April 15, 2019, 7:00 p.m.
Venue: Main Building, Room B203
Speaker: Chong Luo (罗翀), Lead Researcher, Microsoft Research Asia
Abstract: In this lecture, I will mainly talk about the status and prospects of visual object tracking (VOT), one of the most fundamental and challenging tasks in computer vision. VOT has numerous applications in surveillance, autonomous systems, and augmented reality. In the past three years, deep learning has significantly advanced the state of the art in VOT. We have also carried out several pieces of tracking work based on Siamese convolutional neural networks. Two papers were accepted by CVPR, and one tracker won second place in the VOT-2018 real-time tracking challenge. At the end of the lecture, I will introduce our recent efforts on multimodal video analysis, which we believe is the future of video understanding.

Lecture 2
Title: Modeling Geometric Deformation in Vision
Time: April 15, 2019, 7:00 p.m.
Venue: Main Building, Room B203
Speaker: Jifeng Dai (代季峰), Research Manager, Microsoft Research Asia
Abstract: An important challenge in visual recognition is how to properly handle and model geometric deformation, including variations in scale, pose, and viewpoint, and the movement of object parts. Since the era of feature engineering, a series of well-known algorithms, including SIFT and DPM, have been developed to address this problem. Their performance, however, was constrained by weak feature representations and a limited capacity for modeling deformation. In the deep learning era, the representational power of network features far exceeds that of earlier hand-crafted features, yet existing network modules still struggle to handle and model geometric deformation effectively. This talk introduces techniques for modeling geometric deformation in deep neural networks. These techniques substantially strengthen the geometric modeling capability of deep networks and yield large performance gains across a variety of recognition tasks.

Lecture 3
Title: Diagnosing and Enhancing VAE Models
Time: April 15, 2019, 7:00 p.m.
Venue: Main Building, Room B203
Speaker: David Wipf, Lead Researcher, Microsoft Research Asia
Abstract: Although variational autoencoders (VAEs) represent a widely influential deep generative model, many aspects of the underlying energy function remain poorly understood. In particular, it is commonly believed that Gaussian encoder/decoder assumptions reduce the effectiveness of VAEs in generating realistic samples. In this regard, we rigorously analyze the VAE objective, differentiating situations where this belief is and is not actually true. We then leverage the corresponding insights to develop a simple VAE enhancement that requires no additional hyperparameters or sensitive tuning. Quantitatively, this proposal produces crisp samples and stable FID scores that are actually competitive with a variety of GAN models, all while retaining desirable attributes of the original VAE architecture.