Chen Mengting, Wang Xinggang, Liu Wenyu, et al. Dense Depth Interpolation for 3D Human Pose Estimation[J]. Journal of Zhengzhou University (Engineering Science), 2021, 42(03): 26. [doi:10.13705/j.issn.1671-6833.2021.03.005]





Dense Depth Interpolation for 3D Human Pose Estimation
陈梦婷 王兴刚 刘文予
Chen Mengting; Wang Xinggang; Liu Wenyu;
School of Electronic Information and Communications, Huazhong University of Science and Technology
3D vision; human pose estimation; dense depth interpolation; cross-domain generalization
3D human pose estimation is a challenging task in computer vision. Because annotation is difficult, only sparse key-point data from a limited number of scenes are available, which makes 3D prediction hard. In this paper, the human body is treated as a flexible structure in which each individual limb can be viewed as a rigid body. Given the depths of the two end points of a limb, the depths along the whole limb can be estimated by dense interpolation. This paper therefore proposes a method that uses the dense depth interpolation feature map as intermediate supervision. This provides a denser and more structured target than direct regression of sparse key-points. With only a simple network structure, the MPJPE (mean per-joint position error) on Human3.6M reaches 50.9 mm. Cross-domain experiments on the MPI-INF-3DHP dataset further demonstrate the generalization ability of the proposed method.
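The interpolation idea in the abstract can be illustrated with a minimal sketch. It assumes plain linear interpolation between the two joint depths and a fixed limb half-width in pixels; the abstract does not state the exact scheme, so the function name `limb_depth_map` and the parameters `thickness` and `background` are hypothetical and are not taken from the paper.

```python
import numpy as np


def limb_depth_map(p0, p1, z0, z1, height, width, thickness=1.0, background=0.0):
    """Render a dense depth map for one limb treated as a rigid segment.

    p0, p1 : (x, y) pixel coordinates of the two joints at the limb ends.
    z0, z1 : depths of those two joints.
    Pixels within `thickness` of the 2D segment receive a depth linearly
    interpolated between z0 and z1; all other pixels get `background`.
    (Linear interpolation and the thickness rule are assumptions.)
    """
    p0 = np.asarray(p0, dtype=float)
    p1 = np.asarray(p1, dtype=float)

    # Pixel grid as (x, y) pairs, shape (height, width, 2).
    ys, xs = np.mgrid[0:height, 0:width]
    pix = np.stack([xs, ys], axis=-1).astype(float)

    seg = p1 - p0
    seg_len_sq = float(seg @ seg)
    if seg_len_sq == 0.0:
        # Degenerate limb: both joints project to the same pixel.
        t = np.zeros((height, width))
    else:
        # Position of each pixel's projection along the segment, in [0, 1].
        t = np.clip(((pix - p0) @ seg) / seg_len_sq, 0.0, 1.0)

    # Distance from each pixel to its closest point on the segment.
    closest = p0 + t[..., None] * seg
    dist = np.linalg.norm(pix - closest, axis=-1)

    # Linear interpolation of depth along the limb.
    depth = (1.0 - t) * z0 + t * z1
    return np.where(dist <= thickness, depth, background)


# Example: a horizontal limb from (0, 0) to (4, 0) with end depths 1.0 and 3.0.
depth_map = limb_depth_map((0, 0), (4, 0), 1.0, 3.0, height=3, width=5, thickness=0.5)
```

In this sketch, maps for several limbs could be combined into one dense target, which is denser than the handful of key-point depths it is built from. How the paper merges limbs or handles overlaps is not specified in the abstract.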



Last Update: 2021-06-24