I am a doctoral student at Harbin Institute of Technology (HIT) in China. I was also an intern at the JDAI CV lab.

My research focuses on monocular 3D human pose and shape estimation. At HIT, I am supervised by Wenpeng Gao and Yili Fu. At JDAI, I have been supervised by Qian Bao, Wu Liu, Yun Ye, Xiaolei Lv, and Tao Mei. It has also been an honor to work with Michael J. Black in exploring this field.

TRACE: 5D Temporal Regression of Avatars with Dynamic Cameras in 3D Environments
Yu Sun, Qian Bao, Wu Liu, Tao Mei, Michael J. Black
CVPR, 2023  
project page / arXiv / code / dataset / video / bib

With a holistic 5D representation, TRACE tracks the subjects present in the first frame through time and recovers their 3D trajectories in global coordinates. It does so in one shot.

Putting People in their Place: Monocular Regression of 3D People in Depth
Yu Sun, Wu Liu, Qian Bao, Yili Fu, Tao Mei, Michael J. Black
CVPR, 2022  
project page / arXiv / code / dataset / video / bib

BEV adopts an imaginary Bird's-Eye-View representation to explicitly reason about depth relationships between people.

Monocular, One-stage, Regression of Multiple 3D People
Yu Sun, Qian Bao, Wu Liu, Yili Fu, Michael J. Black, Tao Mei
ICCV, 2021  
arXiv / code / video / bib

ROMP is the first one-stage method for monocular regression of multiple 3D people. It can run at over 20 FPS on a 1070Ti GPU.

Learning Monocular Mesh Recovery of Multiple Body Parts Via Synthesis
Yu Sun, Tianyu Huang, Qian Bao, Wu Liu, Wenpeng Gao, Yili Fu
ICASSP, 2022  
paper / bib

Human Mesh Recovery from Monocular Images via a Skeleton-disentangled Representation
Yu Sun, Yun Ye, Wu Liu, Wenpeng Gao, Yili Fu, Tao Mei
ICCV, 2019  
arXiv / code / video results / bib

DSD-SATN attempts to disentangle the pose-related features from the others (e.g. shape, camera) and learn temporal dynamics via sorting shuffled frames.

Collaborative Research
WOC: A Handy Webcam-based 3D Online Chatroom
Chuanhang Yan*, Yu Sun*, Qian Bao, Jinhui Pang, Wu Liu, Tao Mei
* equal contribution.
MM demo, 2022  
arXiv / bib

WOC captures the 3D motion of users with a single camera and drives their individual 3D virtual avatars in real time in an online chatroom.

Recent Advances of Monocular 2D and 3D Human Pose Estimation: A Deep Learning Perspective
Wu Liu, Qian Bao, Yu Sun, Tao Mei
ACM Computing Surveys (CSUR), 2022  
arXiv / bib

A survey of deep-learning-based monocular 2D and 3D human pose estimation.

