Yu Sun
I am a Machine Learning Scientist at Meshcapade. Previously, I was a doctoral student at Harbin Institute of Technology (HIT) and an intern at JDAI CV lab in China.
My research focuses on monocular 3D human motion estimation.
At Meshcapade, I work with many talented colleagues on the ML team, which is led by
Naureen Mahmood,
Talha Zaman,
Michael J. Black,
and Nicolas Heron.
At HIT, I was supervised by Wenpeng Gao
and Yili Fu.
At JDAI, I was supervised by
Qian Bao,
Wu Liu,
Yun Ye,
Xiaolei Lv,
and Tao Mei.
It has been an honor to explore this field with
Michael J. Black.
Email  / 
Google Scholar  / 
Twitter  / 
Github
TRACE: 5D Temporal Regression of Avatars with Dynamic Cameras in 3D Environments
Yu Sun,
Qian Bao,
Wu Liu,
Tao Mei,
Michael J. Black
CVPR, 2023  
project page
/
arXiv
/
code
/
dataset
/
video
/
bib
With a holistic 5D representation, TRACE tracks the subjects present in the first frame through time and recovers their 3D trajectories in global coordinates, all in one shot.
Putting People in their Place: Monocular Regression of 3D People in Depth
Yu Sun,
Wu Liu,
Qian Bao,
Yili Fu,
Tao Mei,
Michael J. Black
CVPR, 2022  
project page
/
arXiv
/
code
/
dataset
/
video
/
bib
BEV adopts an imaginary Bird's-Eye-View representation to explicitly reason about depth relationships between people.
Monocular, One-stage, Regression of Multiple 3D People
Yu Sun,
Qian Bao,
Wu Liu,
Yili Fu,
Michael J. Black,
Tao Mei
ICCV, 2021  
arXiv
/
code
/
video
/
bib
ROMP is the first one-stage method for monocular regression of multiple 3D people. It can run at over 20 FPS on a 1070Ti GPU.
Learning Monocular Regression of 3D People in Crowds via Scene-Aware Blending and De-Occlusion
Yu Sun,
Lubing Xu,
Qian Bao,
Wu Liu,
Wenpeng Gao,
Yili Fu
T-MM, 2023  
paper
/
bib
Learning Monocular Mesh Recovery of Multiple Body Parts Via Synthesis
Yu Sun,
Tianyu Huang,
Qian Bao,
Wu Liu,
Wenpeng Gao,
Yili Fu
ICASSP, 2022  
paper
/
bib
Human Mesh Recovery from Monocular Images via a Skeleton-disentangled Representation
Yu Sun,
Yun Ye,
Wu Liu,
Wenpeng Gao,
Yili Fu,
Tao Mei
ICCV, 2019  
arXiv
/
code
/
video results
/
bib
DSD-SATN disentangles pose-related features from the others (e.g., shape and camera) and learns temporal dynamics by sorting shuffled frames.
PromptHMR: Promptable Human Mesh Recovery
Yufu Wang,
Yu Sun,
Priyanka Patel,
Kostas Daniilidis,
Michael J. Black,
Muhammed Kocabas
CVPR, 2025  
project page
/
arXiv
/
code
/
video
PromptHMR uses multi-modal prompts (e.g., 2D detection, segmentation, and text) to improve human pose and shape (HPS) estimation.
TokenHMR: Advancing Human Mesh Recovery with a Tokenized Pose Representation
Sai Kumar Dwivedi*,
Yu Sun*,
Priyanka Patel,
Yao Feng,
Michael J. Black
CVPR, 2024  
project page
/
arXiv
/
code
/
video
TokenHMR adopts a discrete-token-based representation for SMPL-based regression.
ChatPose: Chatting about 3D Human Pose
Yao Feng,
Jing Lin,
Sai Kumar Dwivedi,
Yu Sun,
Priyanka Patel,
Michael J. Black
CVPR, 2024  
project page
/
arXiv
/
code
ChatPose adopts LLMs to reason about SMPL-based regression from images or textual descriptions.
WOC: A Handy Webcam-based 3D Online Chatroom
Chuanhang Yan*,
Yu Sun*,
Qian Bao,
Jinhui Pang,
Wu Liu,
Tao Mei
* equal contribution.
MM demo, 2022  
arXiv
/
bib
WOC captures the 3D motion of users with a single camera and drives their individual 3D virtual avatars in real-time in an online chatroom.
Recent Advances of Monocular 2D and 3D Human Pose Estimation: A Deep Learning Perspective
Wu Liu,
Qian Bao,
Yu Sun,
Tao Mei
ACM Computing Surveys (CSUR), 2022  
arXiv
/
bib
A survey of monocular 2D and 3D human pose estimation from a deep learning perspective.