[1] Chen Ling, Chen Yuanzhong, Chen Gencai. Operation sequence mining based OLAP query recommendation method[J]. Journal of Southeast University (Natural Science Edition), 2011, 41(3): 498-504. [doi:10.3969/j.issn.1001-0505.2011.03.013]

Operation sequence mining based OLAP query recommendation method
Journal of Southeast University (Natural Science Edition) [ISSN: 1001-0505 / CN: 32-1178/N]

Volume:
41
Issue:
No. 3, 2011
Pages:
498-504
Section:
Automation
Publication date:
2011-05-20

Article Info

Title:
Operation sequence mining based OLAP query recommendation method
Author(s):
Chen Ling, Chen Yuanzhong, Chen Gencai
(College of Computer Science and Technology, Zhejiang University, Hangzhou 310027, China)
Keywords:
online analytical processing (OLAP); data mining; query recommendation
Classification number:
TP181
DOI:
10.3969/j.issn.1001-0505.2011.03.013
Abstract:
To address the low user efficiency caused by the complexity of online analytical processing (OLAP) operations, an OLAP query recommendation method based on operation sequence mining is proposed. First, query sequences in the form of integer sequences are extracted from logged multidimensional expressions (MDX) query statements. Then, the PrefixSpan algorithm is used to mine frequent sequential patterns from the query sequences, and a probability matrix is built from the mined patterns and their sub-patterns. Finally, the user's next query operation is predicted by searching for candidate patterns that match the user's current query operation or query sequence, and the predictions are recommended in ranked order of probability. The method is evaluated on a dataset of query analysis logs recorded by seven professional OLAP analysts. The results show that, with user-specific models, the average accuracy of the top five recommendations is 92.20%, and that of the first recommendation is 77.06%.
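The pipeline summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the integer operation codes, the function names `prefixspan` and `recommend`, the `min_support` threshold, and the back-off to shorter contexts are all assumptions made for illustration. In particular, the abstract does not say how the probability matrix is built, so the sketch estimates the probability of a next operation as support(context + operation) / support(context) over the mined patterns.

```python
from collections import defaultdict


def prefixspan(sequences, min_support):
    """Mine frequent sequential patterns from sequences of integer codes.

    Returns a dict mapping each frequent pattern (a tuple of ints) to its
    support, i.e. the number of input sequences containing the pattern as a
    (possibly non-contiguous) subsequence.
    """
    patterns = {}

    def grow(prefix, projected):
        # projected holds one (sequence index, start offset) pair per
        # sequence that still contains the current prefix.
        counts = defaultdict(set)
        for sid, start in projected:
            for item in sequences[sid][start:]:
                counts[item].add(sid)
        for item, sids in counts.items():
            if len(sids) < min_support:
                continue
            pattern = prefix + (item,)
            patterns[pattern] = len(sids)
            # Project each supporting sequence past the first occurrence
            # of the item, then extend the pattern recursively.
            new_projected = []
            for sid, start in projected:
                seq = sequences[sid]
                for pos in range(start, len(seq)):
                    if seq[pos] == item:
                        new_projected.append((sid, pos + 1))
                        break
            grow(pattern, new_projected)

    grow((), [(sid, 0) for sid in range(len(sequences))])
    return patterns


def recommend(patterns, recent_ops, top_k=5):
    """Rank candidate next operations for the user's most recent operations.

    Assumed scoring: support(context + op) / support(context). The longest
    suffix of recent_ops that is a mined pattern with at least one
    one-step extension is used as the context.
    """
    for length in range(len(recent_ops), 0, -1):
        context = tuple(recent_ops[-length:])
        if context not in patterns:
            continue
        scores = {
            pattern[-1]: support / patterns[context]
            for pattern, support in patterns.items()
            if len(pattern) == length + 1 and pattern[:length] == context
        }
        if scores:
            ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
            return ranked[:top_k]
    return []


# Hypothetical query log: each integer stands for one encoded OLAP operation.
sequences = [[1, 2, 3], [1, 2, 4], [1, 2, 3], [2, 3]]
patterns = prefixspan(sequences, min_support=2)
print(recommend(patterns, [1, 2]))
```

In this toy log, operation 4 appears in only one sequence, so it falls below the support threshold and is never recommended. A real system would also need the encoding step that maps MDX statements to integer codes, which the abstract mentions but does not specify.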

Memo:
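Biography: Chen Ling (born 1977), male, Ph.D., associate professor, lingchen@cs.zju.edu.cn.
Foundation items: National Natural Science Foundation of China (No. 60703040), Priority Theme Project of the Science and Technology Plan of Zhejiang Province (No. 2007C13019), Natural Science Foundation of Zhejiang Province (No. Y107178).
Cite as: Chen Ling, Chen Yuanzhong, Chen Gencai. Operation sequence mining based OLAP query recommendation method[J]. Journal of Southeast University (Natural Science Edition), 2011, 41(3): 498-504. [doi:10.3969/j.issn.1001-0505.2011.03.013]
Last Update: 2011-05-20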