[1] Jiao Lei, Liu Xiaojun, Liu Tingyu, et al. Attribute extraction for rule discovery of production scheduling[J]. Journal of Southeast University (Natural Science Edition), 2016, 46(3): 464-469. [doi:10.3969/j.issn.1001-0505.2016.03.002]

Attribute extraction for rule discovery of production scheduling

Journal of Southeast University (Natural Science Edition) [ISSN: 1001-0505 / CN: 32-1178/N]

Volume: 46
Issue: No. 3, 2016
Pages: 464-469
Section: Computer Science and Engineering
Publication date: 2016-05-20

Article Info

Title:
Attribute extraction for rule discovery of production scheduling
Author(s):
Jiao Lei (1, 2), Liu Xiaojun (1, 2), Liu Tingyu (3), Ni Zhonghua (1, 2)
(1) School of Mechanical Engineering, Southeast University, Nanjing 211189, China
(2) Jiangsu Key Laboratory for Design and Manufacture of Micro-Nano Biomedical Instruments, Southeast University, Nanjing 211189, China
(3) School of Mechanical Engineering, Nanjing University of Science and Technology, Nanjing 210094, China
Keywords:
data mining; attribute extraction; fuzzy mathematics; fuzzy entropy
CLC number:
TP391
DOI:
10.3969/j.issn.1001-0505.2016.03.002
Abstract:
To meet the need for attribute reduction of data sets in production scheduling rule discovery, a key attribute extraction technique is proposed. First, the characteristics of production data are analyzed, and the attributes of the production data are divided into several sets according to their importance and correlation. Then, an importance objective function is built from the fuzzy entropy and the clustering accuracy and used to find the important attributes. Finally, correlation analysis is used to find the attributes related to each important attribute, and the related attributes are merged into important composite attributes to further strengthen the effect of attribute extraction. To verify the technique, the data subsets it produces are compared with subsets obtained by random selection, and the compatibility and rule extraction accuracy of the subsets are analyzed. The results show that the data subsets formed after attribute extraction have a lower degree of incompatibility and concentrate the scheduling rule knowledge of the original data set, which significantly improves the accuracy and efficiency of a variety of production scheduling rule mining algorithms. The technique is therefore well suited to key attribute extraction in the data preprocessing stage of production scheduling rule mining.
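The two quantities the abstract relies on, fuzzy entropy and the incompatibility of a data subset, can be illustrated with a minimal Python sketch. This record does not give the paper's formulas, so both definitions below are assumptions. Fuzzy entropy uses the De Luca-Termini form normalized to [0, 1]. Incompatibility is approximated by an inconsistency rate: the share of records that disagree with the majority rule among records with identical attribute values. The function names, the attribute values, and the dispatching rule labels (SPT, EDD) are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter, defaultdict


def fuzzy_entropy(memberships):
    """Normalized De Luca-Termini fuzzy entropy of membership degrees.

    Assumed definition, not taken from the paper. Returns 0.0 for crisp
    memberships (all 0 or 1) and 1.0 when every membership equals 0.5.
    """
    if not memberships:
        raise ValueError("memberships must not be empty")
    total = 0.0
    for mu in memberships:
        # Terms with mu == 0 or mu == 1 contribute nothing (limit of x*log x).
        if 0.0 < mu < 1.0:
            total -= mu * math.log(mu) + (1.0 - mu) * math.log(1.0 - mu)
    return total / (len(memberships) * math.log(2))


def inconsistency_rate(rows, labels):
    """Share of records whose label differs from the majority label
    among records with the same attribute values.

    Used here as a stand-in for the 'incompatibility' of a data subset.
    """
    if not rows:
        raise ValueError("rows must not be empty")
    groups = defaultdict(Counter)
    for row, label in zip(rows, labels):
        groups[tuple(row)][label] += 1
    clashes = sum(sum(c.values()) - max(c.values()) for c in groups.values())
    return clashes / len(rows)


# Hypothetical subset: two retained attributes and the rule chosen per record.
rows = [("high", "short"), ("high", "short"), ("low", "long"), ("high", "short")]
labels = ["SPT", "SPT", "EDD", "EDD"]

print(fuzzy_entropy([0.5, 0.5]))          # maximally fuzzy attribute
print(fuzzy_entropy([0.0, 1.0]))          # crisp attribute
print(inconsistency_rate(rows, labels))   # 1 clashing record out of 4 -> 0.25
```

Under these assumptions, an attribute subset with a lower inconsistency rate keeps more of the information needed to tell dispatching rules apart, which is the sense in which the abstract reports lower incompatibility for the extracted subsets.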



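Memo:
Received: 2015-10-28.
Author biographies: Jiao Lei (born 1983), male, PhD candidate; Liu Xiaojun (corresponding author), male, PhD, associate professor, liuxiaojun@seu.edu.cn.
Funding: National Natural Science Foundation of China (No. 51405081); Jiangsu Province Science and Technology Achievements Transformation Program (No. BA2014114); Suzhou Science and Technology Development Program (No. SYG201221).
Cite this article: Jiao Lei, Liu Xiaojun, Liu Tingyu, et al. Attribute extraction for rule discovery of production scheduling[J]. Journal of Southeast University (Natural Science Edition), 2016, 46(3): 464-469. DOI:10.3969/j.issn.1001-0505.2016.03.002.
Last Update: 2016-05-20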