[1] Wang Feng, Li Xiaoping, Wang Qian. Formal concept analysis based schema matching[J]. Journal of Southeast University (Natural Science Edition), 2009, 39(1): 34-39. [doi:10.3969/j.issn.1001-0505.2009.01.007]

Formal concept analysis based schema matching
Journal of Southeast University (Natural Science Edition) [ISSN: 1001-0505 / CN: 32-1178/N]

Volume:
39
Issue:
No. 1, 2009
Pages:
34-39
Section:
Computer Science and Engineering
Publication date:
2009-01-20

Article Info

Title:
Formal concept analysis based schema matching
Author(s):
Wang Feng, Li Xiaoping, Wang Qian
School of Computer Science and Engineering, Southeast University, Nanjing 210096, China
Key Laboratory of Computer Network and Information Integration of Ministry of Education, Southeast University, Nanjing 210096, China
Keywords:
schema matching; formal concept analysis; naive Bayes classifier; similarity measure
CLC number:
TP391
DOI:
10.3969/j.issn.1001-0505.2009.01.007
Abstract:
A schema matching approach based on formal concept analysis (FCA), named FCABSM, is proposed. The approach consists of three steps. First, a name classifier and a description classifier, both built on the naive Bayes text classification algorithm, classify the element names and element descriptions of the target schema and of the schema to be matched, which provides the initial evidence for matching elements between the schemas. Second, FCA is applied to integrate the classification results with element type information and constraint information in order to improve matching precision. This step creates a formal context for the information to be integrated, derives the concepts contained in that context, establishes the partial order between the concepts, and constructs the concept lattice. Finally, with the concept lattice from the second step as the basis for computation, a structure-based similarity measure is introduced to compute the final matches. Experimental results show that the average performance of FCA-based matching is better than that of direct matching without FCA integration.
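The second step of the abstract (formal context, concepts, partial order) can be illustrated with a minimal Python sketch. It is not the authors' FCABSM implementation: the element names and the attribute labels in `CONTEXT` are hypothetical, and the brute-force enumeration is chosen only for readability on a tiny example.

```python
from itertools import combinations

# Hypothetical formal context: schema elements (objects) mapped to the
# evidence they carry (attributes). None of these names come from the paper.
CONTEXT = {
    "src.cust_name": {"class:name", "type:string"},
    "tgt.customerName": {"class:name", "type:string", "constraint:not_null"},
    "src.cust_id": {"class:identifier", "type:integer", "constraint:not_null"},
    "tgt.customerId": {"class:identifier", "type:integer", "constraint:not_null"},
}


def common_attributes(objects, context):
    """Derivation A -> A': the attributes shared by every object in A."""
    result = set().union(*context.values())  # start from all attributes
    for obj in objects:
        result &= context[obj]
    return frozenset(result)


def objects_having(attributes, context):
    """Derivation B -> B': the objects that carry every attribute in B."""
    return frozenset(o for o, attrs in context.items() if attributes <= attrs)


def formal_concepts(context):
    """Enumerate all (extent, intent) pairs by closing every object subset.

    Exponential in the number of objects, so only suitable for tiny examples.
    """
    concepts = set()
    objs = list(context)
    for size in range(len(objs) + 1):
        for subset in combinations(objs, size):
            intent = common_attributes(subset, context)
            extent = objects_having(intent, context)
            concepts.add((extent, intent))
    return concepts


def is_subconcept(c1, c2):
    """Lattice order: c1 <= c2 iff the extent of c1 is a subset of that of c2."""
    return c1[0] <= c2[0]


if __name__ == "__main__":
    for extent, intent in sorted(formal_concepts(CONTEXT), key=lambda c: len(c[0])):
        print(sorted(extent), "|", sorted(intent))
```

In this toy context, the two identifier elements end up in the extent of the same concept, which is the kind of grouping a later similarity measure over the lattice could exploit. How FCABSM actually scores such groupings is not stated in the abstract.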

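Memo:
Author biographies: Wang Feng (born 1984), male, master's student; Wang Qian (corresponding author), female, professor, doctoral supervisor, qianwang6491@263.net.
Funding: National Natural Science Foundation of China (60504029, 60672092, 60873236); National High Technology Research and Development Program of China (863 Program) (2008AA04Z103).
Citation format: Wang Feng, Li Xiaoping, Wang Qian. Formal concept analysis based schema matching[J]. Journal of Southeast University: Natural Science Edition, 2009, 39(1): 34-39.
Last Update: 2009-01-20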