[1]王健弘,张旭,章品正,等.基于时空信息和非负成分表示的动作识别[J].东南大学学报(自然科学版),2016,46(4):675-680.[doi:10.3969/j.issn.1001-0505.2016.04.001]
 Wang Jianhong,Zhang Xu,Zhang Pinzheng,et al.Action recognition based on spatio-temporal information and nonnegative component representation[J].Journal of Southeast University (Natural Science Edition),2016,46(4):675-680.[doi:10.3969/j.issn.1001-0505.2016.04.001]

Action recognition based on spatio-temporal information and nonnegative component representation

Journal of Southeast University (Natural Science Edition) [ISSN: 1001-0505 / CN: 32-1178/N]

Volume:
46
Issue:
2016, No. 4
Pages:
675-680
Section:
Computer Science and Engineering
Publication date:
2016-07-20

Article Info

Title:
Action recognition based on spatio-temporal information and nonnegative component representation
Author(s):
Wang Jianhong(王健弘), Zhang Xu(张旭), Zhang Pinzheng(章品正), Jiang Longyu(姜龙玉), Luo Limin(罗立民)
Laboratory of Image Science and Technology, Southeast University, Nanjing 210096, China
Keywords:
action recognition; nonnegative component representation; spatio-temporal Fisher vector; bag of visual words
CLC number:
TP391.4
DOI:
10.3969/j.issn.1001-0505.2016.04.001
Abstract:
To make full use of spatio-temporal information and the relationships among different visual words, a novel spatio-temporal nonnegative component representation (ST-NCR) method is proposed for action recognition. First, based on the bag-of-visual-words (BoVW) representation, the locations of the local features belonging to each visual word are modeled with a Gaussian mixture model, and a spatio-temporal Fisher vector (STFV) is calculated to describe the location distribution of the local features. Then, nonnegative matrix factorization (NMF) is employed to learn the action components and encode the action video samples. To incorporate the spatio-temporal cues into the final representation, graph-regularized NMF (GNMF) is adopted, with the STFV used as part of the graph regularization. The proposed method is extensively evaluated on three public datasets. Experimental results demonstrate that, compared with the BoVW representation and a nonnegative component representation without spatio-temporal information, the proposed method achieves higher action recognition accuracy.
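The encoding step described above can be illustrated with a minimal NumPy sketch of graph-regularized NMF using the standard multiplicative updates of Cai et al. [10]. This is not the authors' implementation: the function name `gnmf`, the RBF affinity built from stand-in STFV descriptors, and all parameter values are illustrative assumptions. The BoVW histograms are assumed to be stacked as columns of X, and W is an affinity graph between video samples (in the paper, partly derived from STFV similarity).

```python
import numpy as np

def gnmf(X, W, k, lam=0.1, iters=300, seed=0):
    """Graph-regularized NMF sketch: minimize ||X - U V^T||_F^2 + lam * tr(V^T L V),
    where L = D - W is the graph Laplacian (Cai et al., 2011).
    X: (m, n) nonnegative data, one BoVW histogram per column.
    W: (n, n) nonnegative affinity between the n video samples.
    k: number of learned action components."""
    rng = np.random.default_rng(seed)
    m, n = X.shape
    U = rng.random((m, k)) + 1e-3   # basis: "action components"
    V = rng.random((n, k)) + 1e-3   # encodings, one row per video
    D = np.diag(W.sum(axis=1))      # degree matrix of the graph
    eps = 1e-10                     # guard against division by zero
    for _ in range(iters):
        # Multiplicative updates keep U and V nonnegative by construction.
        U *= (X @ V) / (U @ (V.T @ V) + eps)
        V *= (X.T @ U + lam * (W @ V)) / (V @ (U.T @ U) + lam * (D @ V) + eps)
    return U, V

# Toy usage with synthetic data. S stands in for per-video STFV descriptors;
# the RBF affinity below is only one plausible way to turn STFV similarity
# into the regularization graph.
rng = np.random.default_rng(1)
X = rng.random((20, 12))                              # 12 videos, 20 visual words
S = rng.random((12, 5))                               # stand-in STFV descriptors
d2 = ((S[:, None, :] - S[None, :, :]) ** 2).sum(-1)   # pairwise squared distances
W = np.exp(-d2)                                       # RBF affinity graph
U, V = gnmf(X, W, k=4, lam=0.05)
```

The rows of V are the final nonnegative component representations of the videos, which would then be fed to a classifier.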

References:

[1] Turaga P, Chellappa R, Subrahmanian V S, et al. Machine recognition of human activities: A survey[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2008, 18(11): 1473-1488. DOI:10.1109/tcsvt.2008.2005594.
[2] Laptev I, Marszalek M, Schmid C, et al. Learning realistic human actions from movies[C]//IEEE Conference on Computer Vision and Pattern Recognition. Anchorage, USA, 2008: 1-8. DOI:10.1109/cvpr.2008.4587756.
[3] Klaser A, Marszalek M, Schmid C. A spatio-temporal descriptor based on 3D-gradients[C]//19th British Machine Vision Conference. Leeds, UK, 2008: 995-1004. DOI:10.5244/c.22.99.
[4] Solmaz B, Assari S M, Shah M. Classifying web videos using a global video descriptor[J]. Machine Vision and Applications, 2012, 24(7): 1473-1485. DOI:10.1007/s00138-012-0449-x.
[5] Wang H, Kläser A, Schmid C, et al. Dense trajectories and motion boundary descriptors for action recognition[J]. International Journal of Computer Vision, 2013, 103(1): 60-79. DOI: 10.1007/s11263-012-0594-8.
[6] Liu L Q, Wang L, Liu X W. In defense of soft-assignment coding[C]//IEEE International Conference on Computer Vision. Barcelona, Spain, 2011: 2486-2493.
[7] Yang J, Yu K, Gong Y, et al. Linear spatial pyramid matching using sparse coding for image classification[C]//IEEE Conference on Computer Vision and Pattern Recognition. Miami, USA, 2009: 1794-1801.
[8] Wang J, Yang J, Yu K, et al. Locality-constrained linear coding for image classification[C]//IEEE Conference on Computer Vision and Pattern Recognition. San Francisco, CA, USA, 2010: 3360-3367. DOI:10.1109/cvpr.2010.5540018.
[9] Lee D D, Seung H S. Learning the parts of objects by non-negative matrix factorization[J]. Nature, 1999, 401(6755): 788-791. DOI:10.1038/44565.
[10] Cai D, He X, Han J, et al. Graph regularized nonnegative matrix factorization for data representation[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2011, 33(8): 1548-1560. DOI:10.1109/TPAMI.2010.231.
[11] Schuldt C, Laptev I, Caputo B. Recognizing human actions: A local SVM approach[C]//Proceedings of the 17th International Conference on Pattern Recognition. Cambridge, UK, 2004: 32-36. DOI:10.1109/icpr.2004.1334462.
[12] Liu J, Luo J, Shah M. Recognizing realistic actions from videos “in the wild”[C]//IEEE Conference on Computer Vision and Pattern Recognition. Miami, USA, 2009: 1996-2003.
[13] Kuehne H, Jhuang H, Garrote E, et al. HMDB: A large video database for human motion recognition[C]//IEEE International Conference on Computer Vision. Barcelona, Spain, 2011: 2556-2563. DOI:10.1109/iccv.2011.6126543.
[14] Wang H, Kläser A, Schmid C, et al. Dense trajectories and motion boundary descriptors for action recognition[J]. International Journal of Computer Vision, 2013, 103(1): 60-79. DOI:10.1007/s11263-012-0594-8.
[15] Sadanand S, Corso J J. Action bank: A high-level representation of activity in video[C]//IEEE Conference on Computer Vision and Pattern Recognition. Providence, USA, 2012: 1234-1241.
[16] Le Q V, Zou W Y, Yeung S Y, et al. Learning hierarchical invariant spatio-temporal features for action recognition with independent subspace analysis[C]//IEEE Conference on Computer Vision and Pattern Recognition. Providence, USA, 2011: 3361-3368.
[17] Wang H, Yuan C, Hu W, et al. Action recognition using nonnegative action component representation and sparse basis selection[J]. IEEE Transactions on Image Processing, 2014, 23(2): 570-581. DOI:10.1109/tip.2013.2292550.
[18] Yang X, Tian Y. Action recognition using super sparse coding vector with spatio-temporal awareness[C]//13th European Conference on Computer Vision. Zurich, Switzerland, 2014: 727-741. DOI:10.1007/978-3-319-10605-2_47.

Similar Articles:

[1]胡斐,罗立民,刘佳,等.基于时空兴趣点和主题模型的动作识别[J].东南大学学报(自然科学版),2011,41(5):962.[doi:10.3969/j.issn.1001-0505.2011.05.013]
 Hu Fei,Luo Limin,Liu Jia,et al.Action recognition based on space-time interest points and topic model[J].Journal of Southeast University (Natural Science Edition),2011,41(4):962.[doi:10.3969/j.issn.1001-0505.2011.05.013]
[2]李亚玮,金立左,孙长银,等.基于光流约束自编码器的动作识别[J].东南大学学报(自然科学版),2017,47(4):691.[doi:10.3969/j.issn.1001-0505.2017.04.011]
 Li Yawei,Jin Lizuo,Sun Changyin,et al.Action recognition based on optical flow constrained auto-encoder[J].Journal of Southeast University (Natural Science Edition),2017,47(4):691.[doi:10.3969/j.issn.1001-0505.2017.04.011]

Memo:
Received: 2016-02-24.
Biographies: Wang Jianhong (1984—), male, Ph.D. candidate; Luo Limin (corresponding author), male, Ph.D., professor, doctoral supervisor, luo.list@seu.edu.cn.
Foundation items: The National Natural Science Foundation of China for Young Scientists (No. 61401085), the Scientific Research Foundation for the Returned Overseas Chinese Scholars of the Ministry of Education (2015).
Citation: Wang Jianhong, Zhang Xu, Zhang Pinzheng, et al. Action recognition based on spatio-temporal information and nonnegative component representation[J]. Journal of Southeast University (Natural Science Edition), 2016, 46(4): 675-680. DOI: 10.3969/j.issn.1001-0505.2016.04.001.
Last Update: 2016-07-20