The CVNext lab focuses on advancing general-purpose embodied intelligence, building upon foundations in long video understanding and reasoning in dynamic, complex scenes. The core objective is to develop open, adaptive embodied agents that tightly integrate environment perception, interactive reasoning, and personalized adaptation and decision-making. Ultimately, the research aims to establish both theoretical frameworks and practical systems for general and domain-specific embodied agents, contributing to scalable, transferable, and real-world embodied AI. Our main research directions include:
- Interactive 3D Scene Reconstruction and Generation
- Unified World-Reasoning-Action Modeling for Embodied Agents
- Personalized Adaptation with Active Perception
Professor
Gaoang Wang [Web]
Assistant Professor
Office: C417, ZJUI Building
Email: gaoangwang@intl.zju.edu.cn
Research Interests:
- Visual Perception
- Transfer Learning
- Spatial Intelligence
- Embodied Intelligence
News:
[Nov. 2025] One paper was accepted by IJCV, 2025.
[Nov. 2025] Three papers were accepted by AAAI 2026, including one oral paper.
[Oct. 2025] We received an Outstanding Paper Award at the ICCV KnowledgeMR Workshop.
[Aug. 2025] One paper was accepted by TPAMI, 2025.
[Jul. 2025] One paper was accepted by ECAI 2025.
[Jul. 2025] One paper was accepted by ICCV Findings Workshop, 2025.
[Jun. 2025] One paper was accepted by TIP, 2025.
[Jun. 2025] One paper was accepted by ICCV 2025.
[May 2025] One paper was accepted by Information Fusion, 2025.
[May 2025] One paper was accepted by ICML 2025.
[Apr. 2025] One paper was accepted by CVPR Workshop on Urban Scene Modeling, 2025.
[Mar. 2025] One paper was accepted by TVCG, 2025.
[Feb. 2025] One paper was accepted by TCSVT, 2025.
[Feb. 2025] One paper was accepted by CVPR 2025.
[Jan. 2025] One paper was accepted by MIA, 2025.
[Jan. 2025] One paper was accepted by TMM, 2025.
[Dec. 2024] Two papers were accepted by ICASSP 2025.
[Dec. 2024] One paper was accepted by AAAI 2025.
[Sep. 2024] One paper was accepted by NeurIPS 2024.
[Jul. 2024] One paper was accepted by MICCAI Workshop on Deep Generative Models, 2024.
[Jun. 2024] Two papers were accepted by ACM MM 2024.
[Jun. 2024] One paper was accepted by ECCV 2024.
[Jun. 2024] One paper was accepted by PRCV 2024.
[Apr. 2024] One paper was accepted by TMM, 2024.
[Mar. 2024] "Long-term Video Question Answering Competition (LOVEU@CVPR'24 Track 1)" was released.
[Mar. 2024] One paper was accepted by ICLR Workshop on LLM Agents, 2024.
[Feb. 2024] Three papers were accepted by CVPR 2024.
[Dec. 2023] Two papers were accepted by ICASSP 2024.
[Dec. 2023] Two papers were accepted by AAAI 2024.
[Dec. 2023] One paper was accepted by Neurocomputing, 2023.
[Sep. 2023] One paper was accepted by IJCV, 2023.
[Sep. 2023] One paper was accepted by TMM, 2023.
[Aug. 2023] One paper was accepted by PRCV 2023.
[Jul. 2023] Three papers were accepted by ICCV 2023.
[Jun. 2023] One paper was accepted by MICCAI 2023.
[May 2023] One paper was accepted by Findings of ACL 2023.
[Apr. 2023] Two papers were accepted by IJCAI 2023.
[Apr. 2023] One paper was accepted by the CVPR Workshop on Computer Vision for Fashion, Art, and Design, 2023.
[Mar. 2023] One paper was accepted by ICME 2023.
[Mar. 2023] One paper was accepted by ICASSP 2023.
[Feb. 2023] One paper was accepted by CVPR 2023.
[Feb. 2023] One paper was accepted by TAI, 2023.
[Nov. 2022] One paper was accepted by TMI, 2022.
[Jul. 2022] One paper was accepted by ECCV 2022.
[Apr. 2022] One paper was accepted by the 2nd CVPR Workshop on Sketch-Oriented Deep Learning, 2022.
[Mar. 2022] One paper was accepted by ICME 2022.
[Jan. 2022] One paper was accepted by TMM, 2022.
[Aug. 2021] One paper was accepted by CVIU, 2021.
[Jul. 2021] One paper was accepted by ICCV 2021.
[Apr. 2021] One paper was accepted by the CVPR Workshop on Autonomous Driving, 2021.
[Jan. 2021] ROD2021 Challenge @ICMR 2021 was released.
Ph.D. Students
Shengyu Hao
shengyuhao@zju.edu.cn
Multi-object Tracking
Representation Learning
Domain Adaptation
guanhongwang@zju.edu.cn
Multi-modality Learning
Video Understanding
Vision and Language
whu@zju.edu.cn
3D Vision
Generative Models
Anomaly Detection
Zhonghan Zhao
zhaozhonghan@zju.edu.cn
Embodied AI
Reinforcement Learning
In-context Learning
Xiaoyue Li
(Main Advisor: Mark Butala)
xiaoyue98@zju.edu.cn
Image Generation
Image Reconstruction
Medical Image Inverse Problems
Chenlu Zhan
(Main Advisor: Hongwei Wang)
chenlu.22@intl.zju.edu.cn
Medical Vision Language
Medical Multimodality
Visual-Language Pretraining
Wendi Hu
3200105651@zju.edu.cn
Multi-object Tracking
Kewei Wei
3200104125@zju.edu.cn
Multi-modality Learning
Ke Ma
(Main Advisor: Hongwei Wang)
kema@berkeley.edu
Generative Models
3D Vision
Tielong Cai
tielong.22@intl.zju.edu.cn
Generative Models
Embodied AI
Master Students
Shidong Cao
shidong.22@intl.zju.edu.cn
Generative Models
Multi-modality Learning
Graph Machine Learning
Yichen Ouyang [Web]
22271110@zju.edu.cn
Generative Models
3D Vision
Multi-modality Learning
Meiqi Sun
meiqi.22@intl.zju.edu.cn
Animal Action Recognition
Animal Pose Estimation
Xuechen Guo
xuechen.22@intl.zju.edu.cn
Computer Vision
Multi-modality Learning
Jianshu Guo
jianshu.22@intl.zju.edu.cn
Diffusion Model
Vision Language
Chang Su
changs.19@intl.zju.edu.cn
Smart City
Enxin Song [Web]
enxin.23@intl.zju.edu.cn
Video Understanding
Image Generation
Xuan Wang
xuanw@zju.edu.cn
Multi-modality Learning
Embodied AI
Fang Liang
3D Vision
Image Reconstruction
Dongping Li
dongping.23@intl.zju.edu.cn
Multi-modality Learning
Active Perception
Unified Model
Junsheng Huang
junsheng.24@intl.zju.edu.cn
3D Vision
Multi-modality Learning
Tianci Tang
tianci_tang@tiu.edu.cn
Embodied AI
Diffusion Model
Yizhi Li
yizhi.20@intl.zju.edu.cn
Multi-modality Learning
Computer Vision
Xuexiang Wen
xuexiang.24@intl.zju.edu.cn
Multi-modality Learning
Jiawu Zhang
2540614031@qq.com
Multi-modal Logistics Large Models
Research Assistants
Jie Deng
dengj325@gmail.com
3D Scene Generation
Undergraduate
Yichen Xu
Wenhao Chai (Alumnus) [Web]
wchai@uw.edu
Multi-modality Representation
Unified Perception Model
Embodied Intelligence