Announcement: Second Session of the AY2025 Intelligence Science and Technology Course Colloquium (IST COLLOQUIUM 2025)
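The Intelligence Science and Technology Course of the Graduate School of Informatics, Kyoto University, is holding the AY2025 Intelligence Science and Technology Course Colloquium (IST COLLOQUIUM 2025).
In the second session, Prof. Shang-Hong Lai of National Tsing Hua University (NTHU) will give a talk titled "Advancing Visual Understanding and Generation: Recent Research from NTHU Computer Vision Lab".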
Title: Advancing Visual Understanding and Generation: Recent Research from NTHU Computer Vision Lab
Speaker: Prof. Shang-Hong Lai (Dept. of Computer Science, National Tsing Hua University, Taiwan)
Date and time: Tuesday, November 18, 10:30 to 12:00
Venue: Seminar Room 1 (Room 127, 1st floor), Research Bldg. No. 7
Abstract: In this talk, I will present recent research from the Computer Vision Laboratory at National Tsing Hua University (NTHU). The NTHU CV Lab focuses on several key areas, including video understanding, face-related analysis, anomaly detection, and medical imaging. I will begin with two of our latest advances in video understanding, HERMES and VADER. HERMES introduces two versatile modules that can be seamlessly integrated into existing video–language models or deployed as a standalone framework for long-form video comprehension, achieving state-of-the-art performance across multiple benchmarks. VADER, on the other hand, is an LLM-driven framework for video anomaly reasoning, which combines keyframe-level object-relation modeling with visual contextual cues to enhance anomaly interpretation. Next, I will discuss one of our recent works in anomaly detection, LFQUIAD. LFQUIAD integrates a quantization-driven autoencoder with a modular Anomaly Generation Module to improve representation learning. Finally, I will briefly present two medical imaging projects leveraging diffusion models: one for generating paired 3D CT image–mask datasets, and the other for synthesizing contrast-enhanced 3D CT volumes from non-contrast scans. Through these examples, I will highlight our lab’s ongoing efforts toward building generalizable, interpretable, and efficient computer vision systems bridging visual understanding and generative modeling.
