参考文献/References:
[1] Lecun Y, Bottou L, Bengio Y, et al. Gradient-based learning applied to document recognition[J]. Proceedings of the IEEE, 1998, 86(11): 2278-2324.
[2] Simonyan K, Zisserman A. Very deep convolutional networks for large-scale image recognition [EB/OL].(2015-04-10)[2018-01-23]. https://arxiv.org/abs/1409.1556.
[3] He K, Zhang X, Ren S, et al. Deep residual learning for image recognition [C]// Computer Vision and Pattern Recognition. Las Vegas, Nevada, USA, 2016: 770-778.
[4] Huang G, Liu Z, Weinberger K Q, et al. Densely connected convolutional networks[C]//Computer Vision and Pattern Recognition. Ha-waii, USA, 2017:2261-2269.
[5] Srivastava R K, Greff K, Schmidhuber J,et al.Training very deep networks [C]// Neural Information Processing Systems. Montreal, Canada, 2015: 2377-2385.
[6] Springenberg J T, Dosovitskiy A, Brox T, et al. Striving for simplicity: The all convolutional net[EB/OL].(2015-04-13)[2018-01-23]. https://arxiv.org/abs/1412.6806.
[7] Fei W, Mengqing J, Chen Q, et al. Residual attention network for image classification [EB/OL].(2017)[2018-01-10]. https://arxiv.org/abs/1704.06904.
[8] Hariharan B, Arbeláez P, Girshick R, et al. Hypercolumns for object segmentation and fine-grained localization [C]// Computer Vision and Pattern Recognition. Boston, USA, 2015: 447-456.
[9] Sermanet P, Kavukcuoglu K, Chintala S, et al. Pedestrian Detection with Unsupervised Multi-stage Feature Learning [C]// Computer Vision and Pattern Recognition. Oregon, USA, 2013: 3626-3633.
[10] Krizhevsky, A. Learning multiple layers of features from tiny images [EB/OL].(2009)[2017-12-09]. https://www.researchgate.net/publication/265748773_Learning_Multiple_Layers_of_Features_from_Tiny_Images.
[11] Huang G, Sun Y, Liu Z, et al. Deep networks with stochastic depth [C]// European Conference on Computer Vision. Amsterdam, Netherlands, 2016: 646-661.
[12] Lin M, Chen Q, Yan S, et al. Network in network [EB/OL].(2014-03-04)[2017-11-30]. https://arxiv.org/abs/1312.4400.
[13] Romero A, Ballas N, Kahou S E, et al. FitNets: Hints for thin deep nets [EB/OL].(2015)[2018-01-23]. https://arxiv.org/abs/ 1412.6550.
[14] Lee C, Xie S, Gallagher P W, et al. Deeply-supervised nets [C]// International Conference on Artificial Intelligence and Statistics. Reykjavik, Iceland, 2014: 562-570.
[15] Netzer Y, Wang T, Coates A, et al. Reading digits in natural images with unsupervised feature learning[EB/OL].(2011)[2017-12-13].http://ai.sta-nford.edu/~twangcat/papers/nips2011_housenum-bers.pdf.
[16] Sutskever I, Martens J, Dahl G, et al. On the importance of initialization and momentum in deep learning [C]// International Conference on Machine Learning. Atlanta, USA, 2013: 1139-1147.