[1] Yan Leiming, Sun Zhihui, Zhang Baili. Fast biclustering algorithm for local correlated objects in time series data[J]. Journal of Southeast University (Natural Science Edition), 2007, 37(5): 793-797. [doi:10.3969/j.issn.1001-0505.2007.05.011]

Fast biclustering algorithm for local correlated objects in time series data

Journal of Southeast University (Natural Science Edition) [ISSN: 1001-0505 / CN: 32-1178/N]

Volume:
37
Issue:
No. 5, 2007
Pages:
793-797
Section:
Computer Science and Engineering
Publication date:
2007-09-20

Article Info

Title:
Fast biclustering algorithm for local correlated objects in time series data
Author(s):
Yan Leiming, Sun Zhihui, Zhang Baili
School of Computer Science and Engineering, Southeast University, Nanjing 210096, China
Keywords:
biclustering; time series; suffix tree; local correlation
Classification number:
TP311
DOI:
10.3969/j.issn.1001-0505.2007.05.011
Abstract:
To cluster local correlation patterns in high-dimensional time series data, an spCluster model based on correlated sub-patterns is established, and its relationship to the mean squared residue score is discussed. Building on this model, a deterministic biclustering algorithm for time series data, sp-TSC (sub-pattern time-series clustering), is proposed. The algorithm first uses the spCluster model to convert locally correlated data objects into symbol sequences, then inserts these sequences into a generalized suffix tree. The properties of the suffix tree avoid enumerating every combination of locally correlated sub-patterns and shrink the search space, so all maximal δ-spClusters can be found in time linear in the size of the data matrix, without greedy search. Theoretical analysis and experiments on a real gene expression dataset indicate that the algorithm is efficient and feasible, and that it is more efficient than common biclustering algorithms.
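The two stages named in the abstract (symbolize each row, then index the symbol strings so that shared sub-patterns fall out of the index) can be illustrated with a small sketch. This is not the sp-TSC algorithm or the spCluster model, neither of which is specified on this page. It assumes a simple up/down/no-change symbolization with a hypothetical threshold `delta`, and it swaps in a naive suffix trie, which takes quadratic time per row, for the linear-time generalized suffix tree the paper uses. All function and parameter names are illustrative.

```python
def symbolize(series, delta):
    """Encode each consecutive change as 'U' (up), 'D' (down) or 'N' (no change).

    Assumption: a change counts as up/down only if it exceeds `delta`.
    """
    symbols = []
    for prev, cur in zip(series, series[1:]):
        diff = cur - prev
        symbols.append("U" if diff > delta else "D" if diff < -delta else "N")
    return "".join(symbols)


class Node:
    def __init__(self):
        self.children = {}
        self.rows = set()  # ids of rows whose symbol string contains this path


def build_suffix_trie(strings):
    """Insert every suffix of every string (naive stand-in for a suffix tree)."""
    root = Node()
    for row_id, s in enumerate(strings):
        for start in range(len(s)):
            node = root
            for ch in s[start:]:
                node = node.children.setdefault(ch, Node())
                node.rows.add(row_id)
    return root


def shared_patterns(root, min_rows=2, min_len=2):
    """Return {pattern: row ids} for patterns shared by at least `min_rows` rows.

    A pattern is reported only if extending it to the right by one symbol
    would lose at least one row (right-maximal); left-maximality is not checked.
    """
    found = {}
    stack = [("", root)]
    while stack:
        prefix, node = stack.pop()
        for ch, child in node.children.items():
            stack.append((prefix + ch, child))
        if len(prefix) < min_len or len(node.rows) < min_rows:
            continue
        if all(child.rows != node.rows for child in node.children.values()):
            found[prefix] = sorted(node.rows)
    return found


if __name__ == "__main__":
    data = [
        [1, 2, 3, 2, 1],
        [5, 6, 7, 6, 9],
        [4, 4, 4, 4, 4],
    ]
    strings = [symbolize(row, delta=0.5) for row in data]
    print(strings)
    print(shared_patterns(build_suffix_trie(strings), min_rows=2, min_len=3))
```

In this toy input the first two rows rise twice and then fall, so they share a sub-pattern even though their absolute values and final steps differ. Each trie path is visited once during the search, which is the sense in which an index over suffixes avoids enumerating combinations of sub-patterns explicitly.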

Memo:
Biographies: Yan Leiming (b. 1973), male, doctoral candidate; Sun Zhihui (corresponding author), male, professor, doctoral supervisor, sunzh@seu.edu.cn.
Last Update: 2007-09-20