Yu Sun

I am a Machine Learning Scientist at Meshcapade. Previously, I was a doctoral student at Harbin Institute of Technology (HIT) and an intern at the JDAI CV lab in China.

My research focuses on monocular 3D human motion estimation. At Meshcapade, I work with many talented colleagues on the ML team, which is led by Naureen Mahmood, Talha Zaman, Michael J. Black, and Nicolas Heron.

At HIT, I was supervised by Wenpeng Gao and Yili Fu. At JDAI, I was supervised by Qian Bao, Wu Liu, Yun Ye, Xiaolei Lv, and Tao Mei. It has been an honor to work with Michael J. Black in exploring this field.

Email  /  Google Scholar  /  Twitter  /  Github

TRACE: 5D Temporal Regression of Avatars with Dynamic Cameras in 3D Environments
Yu Sun, Qian Bao, Wu Liu, Tao Mei, Michael J. Black
CVPR, 2023  
project page / arXiv / code / dataset / video / bib

With a holistic 5D representation, TRACE tracks the subjects present in the first frame through time and recovers their 3D trajectories in global coordinates, all in one shot.

Putting People in their Place: Monocular Regression of 3D People in Depth
Yu Sun, Wu Liu, Qian Bao, Yili Fu, Tao Mei, Michael J. Black
CVPR, 2022  
project page / arXiv / code / dataset / video / bib

BEV adopts an imaginary Bird's-Eye-View representation to explicitly reason about depth relationships between people.

Monocular, One-stage, Regression of Multiple 3D People
Yu Sun, Qian Bao, Wu Liu, Yili Fu, Michael J. Black, Tao Mei
ICCV, 2021  
arXiv / code / video / bib

ROMP is the first one-stage method for monocular regression of multiple 3D people. It can run at over 20 FPS on a 1070Ti GPU.

Learning Monocular Regression of 3D People in Crowds via Scene-Aware Blending and De-Occlusion
Yu Sun, Lubing Xu, Qian Bao, Wu Liu, Wenpeng Gao, Yili Fu
T-MM, 2023  
paper / bib

Learning Monocular Mesh Recovery of Multiple Body Parts Via Synthesis
Yu Sun, Tianyu Huang, Qian Bao, Wu Liu, Wenpeng Gao, Yili Fu
ICASSP, 2022  
paper / bib

Human Mesh Recovery from Monocular Images via a Skeleton-disentangled Representation
Yu Sun, Yun Ye, Wu Liu, Wenpeng Gao, Yili Fu, Tao Mei
ICCV, 2019  
arXiv / code / video results / bib

DSD-SATN attempts to disentangle pose-related features from the others (e.g. shape, camera) and to learn temporal dynamics by sorting shuffled frames.

Collaborative Research
PromptHMR: Promptable Human Mesh Recovery
Yufu Wang, Yu Sun, Priyanka Patel, Kostas Daniilidis, Michael J. Black, Muhammed Kocabas
CVPR, 2025  
project page / arXiv / code / video

PromptHMR uses multi-modal prompts (e.g. 2D detections, segmentation, and text) to improve human pose and shape (HPS) estimation.

TokenHMR: Advancing Human Mesh Recovery with a Tokenized Pose Representation
Sai Kumar Dwivedi*, Yu Sun*, Priyanka Patel, Yao Feng, Michael J. Black
CVPR, 2024  
project page / arXiv / code / video

TokenHMR adopts a discrete-token-based representation for SMPL-based regression.

ChatPose: Chatting about 3D Human Pose
Yao Feng, Jing Lin, Sai Kumar Dwivedi, Yu Sun, Priyanka Patel, Michael J. Black
CVPR, 2024  
project page / arXiv / code

ChatPose adopts LLMs to reason about SMPL-based regression from images or textual descriptions.

WOC: A Handy Webcam-based 3D Online Chatroom
Chuanhang Yan*, Yu Sun*, Qian Bao, Jinhui Pang, Wu Liu, Tao Mei
* equal contribution.
MM demo, 2022  
arXiv / bib

WOC captures the 3D motion of users with a single camera and drives their individual 3D virtual avatars in real time in an online chatroom.

Recent Advances of Monocular 2D and 3D Human Pose Estimation: A Deep Learning Perspective
Wu Liu, Qian Bao, Yu Sun, Tao Mei
ACM Computing Surveys (CSUR), 2022  
arXiv / bib

A survey of monocular 2D and 3D human pose estimation from a deep learning perspective.

