[1]刘天亮,冯希龙,顾雁秋,等.一种由粗至精的RGB-D室内场景语义分割方法[J].东南大学学报(自然科学版),2016,46(4):681-687.[doi:10.3969/j.issn.1001-0505.2016.04.002]
 Liu Tianliang,Feng Xilong,Gu Yanqiu,et al.Coarse-to-Fine semantic parsing method for RGB-D indoor scenes[J].Journal of Southeast University (Natural Science Edition),2016,46(4):681-687.[doi:10.3969/j.issn.1001-0505.2016.04.002]

一种由粗至精的RGB-D室内场景语义分割方法

《东南大学学报(自然科学版)》[ISSN:1001-0505/CN:32-1178/N]

Volume:
46
Issue:
No. 4, 2016
Pages:
681-687
Column:
Computer Science and Engineering
Publication date:
2016-07-20

文章信息/Info

Title:
Coarse-to-Fine semantic parsing method for RGB-D indoor scenes
作者:
刘天亮1, 冯希龙1, 顾雁秋1, 戴修斌1, 罗杰波2
1南京邮电大学江苏省图像处理与图像通信重点实验室, 南京 210003; 2罗彻斯特大学计算机科学系, 美国罗彻斯特 14627
Author(s):
Liu Tianliang1, Feng Xilong1, Gu Yanqiu1, Dai Xiubin1, Luo Jiebo2
1Jiangsu Provincial Key Laboratory of Image Processing and Image Communication, Nanjing University of Posts and Telecommunications, Nanjing 210003, China
2Department of Computer Science, University of Rochester, Rochester 14627, USA
关键词:
RGB-D室内场景 语义分割 SLIC过分割 稠密CRFs 递归式反馈
Keywords:
RGB-D indoor scene; semantic parsing; simple linear iterative clustering (SLIC) over-segmentation; dense conditional random fields (CRFs); recursive feedback
CLC number:
TP391
DOI:
10.3969/j.issn.1001-0505.2016.04.002
摘要:
为了标注室内场景中可见物体,提出一种基于RGB-D数据由粗至精的室内场景语义分割方法.首先,利用分层显著度导引的简单线性迭代聚类过分割和鲁棒多模态区域特征,构建面向语义类别的超像素区域池,基于随机决策森林分类器判决各个超像素区域的语义类别,实现粗粒度区域级语义标签推断.然后,为了改善粗粒度级的语义标签,利用几何深度导引和内部反馈机制改进像素级稠密全连接条件随机场模型,以求精细粒度像素级语义标注.最后,在粗、细粒度语义标注之间引入全局递归式反馈,渐进式迭代更新室内场景的语义类别标签.2个公开的RGB-D室内场景数据集上的实验结果表明,与其他方法相比,所提出的语义分割方法无论在主观还是客观评估上,均具有较好的效果.
Abstract:
A coarse-to-fine semantic segmentation method based on RGB-D information was proposed to label the visually meaningful components in indoor scenes. First, to complete coarse-grained region-level semantic label inference, superpixel region pools for the semantic categories were constructed using hierarchical saliency-guided simple linear iterative clustering (SLIC) over-segmentation and robust multi-modal regional features, and the semantic category of each superpixel region was judged by a random decision forest classifier. Then, to improve the coarse-grained semantic labels, a depth-guided pixel-wise fully-connected conditional random field model with an internal recursive feedback was presented to refine the fine-grained pixel-level semantic labels. Finally, a progressive global recursive feedback mechanism between the coarse-grained and fine-grained semantic labels was introduced to iteratively update the semantic tags of the predefined superpixel regions in the given scenes. Experimental results on two public RGB-D indoor scene datasets show that the presented method achieves comparable performance to other state-of-the-art methods in both subjective and objective evaluations.
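The three-stage pipeline the abstract describes (region-level inference over superpixels, pixel-level refinement, and a global feedback loop between the two) can be sketched on a toy scene. This is a minimal illustration under stated substitutions, not the paper's implementation: grid blocks stand in for SLIC superpixels, a threshold on mean intensity/depth features stands in for the random decision forest region classifier, and an iterated neighbour majority vote stands in for dense-CRF mean-field inference.

```python
import numpy as np

# Toy 32x32 "RGB-D" scene: left half is a far, bright wall (label 1),
# right half a near, dark table (label 0); intensity carries sensor noise.
H, W, BLOCK = 32, 32, 8
rng = np.random.default_rng(0)
intensity = np.where(np.arange(W) < W // 2, 0.8, 0.2) * np.ones((H, 1))
depth = np.where(np.arange(W) < W // 2, 3.0, 1.0) * np.ones((H, 1))
intensity = intensity + rng.normal(0.0, 0.25, (H, W))

def coarse_stage(intensity, depth):
    """Region-level inference: one label per grid 'superpixel' block,
    decided from its mean intensity/depth feature (classifier stand-in)."""
    labels = np.zeros((H, W), dtype=int)
    for i in range(0, H, BLOCK):
        for j in range(0, W, BLOCK):
            blk = (slice(i, i + BLOCK), slice(j, j + BLOCK))
            feat = intensity[blk].mean() - 0.1 * depth[blk].mean()
            labels[blk] = int(feat > 0.2)
    return labels

def fine_stage(labels, sweeps=3):
    """Pixel-level refinement: iterated 4-neighbour majority vote
    (stand-in for mean-field inference in a dense CRF)."""
    lab = labels.copy()
    for _ in range(sweeps):
        p = np.pad(lab, 1, mode="edge")
        votes = p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:] + lab
        lab = (votes >= 3).astype(int)
    return lab

def region_feedback(labels):
    """Feed refined pixel labels back to the regions (per-block majority)."""
    out = np.zeros_like(labels)
    for i in range(0, H, BLOCK):
        for j in range(0, W, BLOCK):
            blk = (slice(i, i + BLOCK), slice(j, j + BLOCK))
            out[blk] = int(labels[blk].mean() > 0.5)
    return out

# Global recursive feedback: alternate fine refinement and region update.
labels = coarse_stage(intensity, depth)
for _ in range(2):
    labels = region_feedback(fine_stage(labels))

truth = (np.arange(W) < W // 2).astype(int) * np.ones((H, 1), dtype=int)
accuracy = float((labels == truth).mean())
```

The feedback loop mirrors the paper's structure only at the coarsest level: pixel-wise refinement corrects region-boundary errors, and the per-region majority update plays the role of re-inferring labels for the superpixel region pool.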

参考文献/References:

[1] Krähenbühl P, Koltun V. Efficient inference in fully connected CRFs with Gaussian edge potentials [C]//25th Annual Conference on Neural Information Processing Systems. Granada, Spain, 2011:109-117.
[2] Ren X, Bo L, Fox D. RGB-(D)scene labeling: Features and algorithms [C]//IEEE Conference on Computer Vision and Pattern Recognition. Providence, RI, USA, 2012:2759-2766.
[3] Silberman N, Hoiem D, Kohli P, et al. Indoor segmentation and support inference from RGBD images [C]//12th European Conference on Computer Vision. Firenze, Italy, 2012:746-760. DOI:10.1007/978-3-642-33715-4_54.
[4] Silberman N, Fergus R. Indoor scene segmentation using a structured light sensor [C]//IEEE International Conference on Computer Vision Workshops. Barcelona, Spain, 2011:601-608.
[5] Stasse O, Dupitier S, Yokoi K. 3D object recognition using spin-images for a humanoid stereoscopic vision system [C]//IEEE/RSJ International Conference on Intelligent Robots and Systems. Beijing, China, 2006:2955-2960. DOI:10.1109/iros.2006.282151.
[6] Lowe D G. Distinctive image features from scale-invariant keypoints [J]. International Journal of Computer Vision, 2004, 60(2):91-110. DOI:10.1023/b:visi.0000029664.99615.94.
[7] Couprie C, Farabet C, Najman L, et al. Indoor semantic segmentation using depth information [C]//International Conference on Learning Representation. Scottsdale, AZ, USA, 2013:1-8.
[8] Achanta R, Shaji A, Smith K, et al. SLIC superpixels compared to state-of-the-art superpixel methods [J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2012, 34(11):2274-2282. DOI:10.1109/tpami.2012.120.
[9] Yan Q, Xu L, Shi J, et al. Hierarchical saliency detection [C]//IEEE Conference on Computer Vision and Pattern Recognition. Portland, OR, USA, 2013:1155-1162.
[10] Rusu R B, Cousins S. 3D is here: Point cloud library [C]//Proceedings IEEE International Conference on Robotics and Automation. Shanghai, China, 2011:1-4.
[11] Stückler J, Waldvogel B, Schulz H, et al. Dense real-time mapping of object-class semantics from RGB-D video [J]. Journal of Real-Time Image Processing, 2013, 10(4):599-609. DOI:10.1007/s11554-013-0379-5.
[12] Xiao J X, Owens A, Torralba A. SUN3D: A database of big spaces reconstructed using SfM and object labels [C]//14th IEEE International Conference on Computer Vision. Sydney, Australia, 2013:1625-1632. DOI:10.1109/iccv.2013.458.
[13] Waldvogel B. Accelerating random forests on CPUs and GPUs for object-class image segmentation [D]. Bonn, Germany: Bonn University, 2013.
[14] Gupta S, Arbelaez P, Malik J. Perceptual organization and recognition of indoor scenes from RGB-D images [C]//IEEE Conference on Computer Vision and Pattern Recognition. Portland, Oregon, 2013:564-571. DOI:10.1109/cvpr.2013.79.

备注/Memo

Received: 2015-12-07.
Biography: Liu Tianliang (b. 1980), male, Ph.D., associate professor, liutl@njupt.edu.cn.
Foundation items: The National Natural Science Foundation of China (Nos. 31200747, 61001152, 61071091, 61071166, 61172118), the Natural Science Foundation of Jiangsu Province (Nos. BK2010523, BK2012437), the Scientific Research Foundation of Nanjing University of Posts and Telecommunications (Nos. NY210069, NY214037), the China Scholarship Council, and the Ministry of Education Demonstration Base for the Internet Application Innovation Open Platform (Meteorological Cloud Platform and Applications) (No. KJRP1407).
Citation: Liu Tianliang, Feng Xilong, Gu Yanqiu, et al. Coarse-to-Fine semantic parsing method for RGB-D indoor scenes [J]. Journal of Southeast University (Natural Science Edition), 2016, 46(4): 681-687. DOI:10.3969/j.issn.1001-0505.2016.04.002.
更新日期/Last Update: 2016-07-20