[1] Cheng Guang, Chen Yuxiang. Identification method of encrypted traffic based on support vector machine[J]. Journal of Southeast University (Natural Science Edition), 2017, 47(4): 655-659. [doi:10.3969/j.issn.1001-0505.2017.04.005]

Identification method of encrypted traffic based on support vector machine

Journal of Southeast University (Natural Science Edition) [ISSN: 1001-0505 / CN: 32-1178/N]

Volume:
47
Issue:
No. 4, 2017
Pages:
655-659
Section:
Computer Science and Engineering
Publication date:
2017-07-20

Article Info

Title:
Identification method of encrypted traffic based on support vector machine
Author(s):
Cheng Guang, Chen Yuxiang
School of Computer Science and Engineering, Southeast University, Nanjing 211189, China
Key Laboratory of Computer Network and Information Integration of Ministry of Education, Southeast University, Nanjing 211189, China
Keywords:
encrypted traffic identification; relative entropy; Monte Carlo simulation; support vector machine
CLC number:
TP393.4
DOI:
10.3969/j.issn.1001-0505.2017.04.005
Abstract:
Existing encrypted traffic identification methods have difficulty distinguishing encrypted traffic from non-encrypted compressed file traffic. An analysis of encrypted traffic, txt traffic, doc traffic, jpg traffic, and compressed file traffic on the Internet shows that methods based on information entropy can effectively separate low-entropy flows from high-entropy flows. However, such methods cannot identify non-encrypted compressed file traffic, in which each byte is random but the flow as a whole is only pseudo-random. Therefore, the relative entropy feature vector {h0,h1,h2,h3} is used to distinguish low-entropy flows from high-entropy flows, and the error perror of the π value estimated by Monte Carlo simulation is used to distinguish locally random traffic from globally random traffic. Finally, a support vector machine (SVM)-based method, SVM-ID, is proposed for identifying encrypted and non-encrypted traffic; it takes the feature subspace φSVM={h0,h1,h2,h3,perror} as input. In comparison experiments against the relative entropy method, SVM-ID not only identifies encrypted traffic well but also distinguishes encrypted traffic from non-encrypted compressed file traffic.
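
The two kinds of feature named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and the abstract does not say how the four components h0 to h3 are derived or how perror is constructed. The sketch assumes a single relative-entropy value, taken as the Kullback-Leibler divergence of a payload's byte histogram from the uniform distribution, and a π estimate built from successive 6-byte groups read as 24-bit (x, y) coordinates. The function names `relative_entropy_to_uniform`, `pi_error`, and `features` are hypothetical.

```python
import math
import os
from collections import Counter


def relative_entropy_to_uniform(data: bytes) -> float:
    """KL divergence (bits) of the byte histogram from the uniform
    distribution over 256 values. Assumption: a stand-in for one
    relative-entropy feature; the paper's h0..h3 are not defined here."""
    if not data:
        raise ValueError("data must be non-empty")
    n = len(data)
    kl = 0.0
    for count in Counter(data).values():
        p = count / n
        kl += p * math.log2(p * 256)  # p * log2(p / (1/256))
    return kl


def pi_error(data: bytes) -> float:
    """Relative error of a Monte Carlo estimate of pi built from the bytes.

    Assumption: each group of 6 bytes gives one point (x, y) in the unit
    square with 24-bit coordinates; the share of points inside the quarter
    circle approximates pi / 4.
    """
    points = len(data) // 6
    if points == 0:
        raise ValueError("need at least 6 bytes")
    scale = float(1 << 24)
    inside = 0
    for i in range(points):
        chunk = data[6 * i: 6 * i + 6]
        x = int.from_bytes(chunk[:3], "big") / scale
        y = int.from_bytes(chunk[3:], "big") / scale
        if x * x + y * y <= 1.0:
            inside += 1
    estimate = 4.0 * inside / points
    return abs(estimate - math.pi) / math.pi


def features(data: bytes) -> tuple:
    """Return (relative entropy, pi error) for one payload."""
    return relative_entropy_to_uniform(data), pi_error(data)


# Illustrative inputs only: OS randomness stands in for encrypted payload,
# repeated ASCII text stands in for low-entropy traffic.
random_like = os.urandom(60000)
text_like = b"the quick brown fox jumps over the lazy dog " * 1500
print("random-like:", features(random_like))
print("text-like:  ", features(text_like))
```

A feature tuple of this kind, extended to the five components of φSVM, is the sort of input the abstract describes passing to the SVM classifier. The classifier itself is omitted here because the abstract gives no training details.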

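Memo

Received: 2016-12-04.
Author biography: Cheng Guang (born 1973), male, Ph.D., professor, doctoral supervisor, gcheng@njnet.edu.cn.
Funding: National High Technology Research and Development Program of China (863 Program) (2015AA015603), National Natural Science Foundation of China (61602114), ZTE research fund, and the Collaborative Innovation Center of Novel Software Technology and Industrialization.
Cite this article: Cheng Guang, Chen Yuxiang. Identification method of encrypted traffic based on support vector machine[J]. Journal of Southeast University (Natural Science Edition), 2017, 47(4): 655-659. DOI:10.3969/j.issn.1001-0505.2017.04.005.
Last Update: 2017-07-20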