[1] Song Aibo, Hu Kongfa, Dong Yisheng. Research on Weblog mining [J]. Journal of Southeast University (Natural Science Edition), 2002, 32(1): 15-18. [doi:10.3969/j.issn.1001-0505.2002.01.004] (in Chinese)

Research on Weblog mining
Journal of Southeast University (Natural Science Edition) [ISSN: 1001-0505 / CN: 32-1178/N]

Volume:
32
Issue:
No. 1, 2002
Pages:
15-18
Section:
Automation
Publication date:
2002-01-20

Article information / Info

Title:
Research on Weblog mining
Author(s):
Song Aibo (宋爱波), Hu Kongfa (胡孔法), Dong Yisheng (董逸生)
Department of Computer Science and Engineering, Southeast University, Nanjing 210096, China
Keywords:
Weblog; data mining; fuzzy clustering; recommendation system; adaptive Web site
CLC number:
TP18
DOI:
10.3969/j.issn.1001-0505.2002.01.004
Abstract:
Mining Weblog files can reveal similar customer groups, related Web pages, and frequent access paths. Based on a survey of current Web mining research, this paper presents a novel algorithm, MBP. To speed up the search, MBP uses the frequent itemsets found during association rule mining, and it finds all frequent access paths that satisfy the threshold constraint. Experiments show that MBP is very effective. Because Web browsing and log files are inherently fuzzy and uncertain, fuzzy clustering of Web pages is also discussed. Finally, the use of the discovered knowledge in recommendation systems and adaptive Web sites is discussed, and corresponding algorithms are given.
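The abstract does not spell out the steps of MBP, so the following is only a minimal sketch of the general idea it describes: find the frequent pages first, then use them to restrict the search for frequent access paths that meet a support threshold. The function name `frequent_paths`, the session format (a list of page identifiers per visit), and the level-wise candidate generation are assumptions made for illustration, not the authors' algorithm.

```python
from collections import Counter


def frequent_paths(sessions, min_support):
    """Return {path: support} for contiguous page paths that occur in at
    least `min_support` sessions. Illustrative sketch, not the paper's MBP."""
    # Frequent single pages, counted once per session (frequent 1-itemsets).
    page_counts = Counter(page for s in sessions for page in set(s))
    frequent_pages = {p for p, c in page_counts.items() if c >= min_support}

    result = {}
    current = {(p,) for p in frequent_pages}  # candidate paths of length 1
    length = 1
    while current:
        counts = Counter()
        for s in sessions:
            seen = set()
            for i in range(len(s) - length + 1):
                window = tuple(s[i:i + length])
                if window in current:
                    seen.add(window)
            counts.update(seen)  # each path counts once per session
        frequent = {p: c for p, c in counts.items() if c >= min_support}
        result.update(frequent)
        # Extend each frequent path by one frequent page, keeping a candidate
        # only if its suffix of the current length is also frequent.
        current = {
            p + (q,)
            for p in frequent
            for q in frequent_pages
            if (p[1:] + (q,)) in frequent
        }
        length += 1
    return result


sessions = [
    ["A", "B", "C"],
    ["A", "B", "D"],
    ["B", "C"],
    ["A", "B", "C", "D"],
]
paths = frequent_paths(sessions, min_support=2)
```

With these made-up sessions and a threshold of 2, the path A, B, C is kept because it appears in two sessions, while B, D is dropped because it appears in only one. Pruning candidates by their frequent sub-paths keeps the number of paths counted at each level small, which is the kind of speed-up the abstract attributes to reusing frequent itemsets.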

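Memo:
Foundation item: Supported by the National Natural Science Foundation of China (79970092).
Biographies: Song Aibo (born 1970), male, Ph.D. candidate; Dong Yisheng (corresponding author), male, professor, doctoral supervisor.
Last Update: 2002-01-20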