[1] Chen Yewang, Li Wen, Peng Xin, et al. Improved semantic annotation method for documents based on ontology[J]. Journal of Southeast University (Natural Science Edition), 2009, 39(6): 1109-1113. [doi:10.3969/j.issn.1001-0505.2009.06.005]

Improved semantic annotation method for documents based on ontology

Journal of Southeast University (Natural Science Edition) [ISSN:1001-0505/CN:32-1178/N]

Volume:
39
Issue:
No. 6, 2009
Pages:
1109-1113
Section:
Computer Science and Engineering
Publication date:
2009-11-20

Article Info

Title:
Improved semantic annotation method for documents based on ontology
Author(s):
Chen Yewang, Li Wen, Peng Xin, Zhao Wenyun
School of Computer Science, Fudan University, Shanghai 200433, China
Keywords:
ontology; semantic context; semantic annotation
Classification number:
TP301
DOI:
10.3969/j.issn.1001-0505.2009.06.005
Abstract:
Based on the semantic context of domain ontology knowledge and the structure of resource documents, an improved semantic annotation method for documents is proposed. The method analyzes and calculates the term-frequency correlation between labels and documents and the co-occurrence of the semantic context within a local window, and uses them to annotate various kinds of document resources. First, the plain-text content of a document is extracted and decomposed into sets of clauses, sentences and paragraphs. Then, for each domain knowledge item, its semantic context information is looked up in the ontology knowledge base. Finally, the correlation between this information and the decomposed document content is calculated according to seven correlation rules, and a combined calculation over the whole document base and knowledge base yields the final correlation between the knowledge item and the document. Experimental results show that, based on a domain ontology, the method can effectively and automatically annotate the many kinds of document resources that exist on the Internet as web pages and in similar forms.
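The pipeline outlined in the abstract (decompose the text, look up the context terms of a knowledge item, then score the label against each unit) can be illustrated with a minimal Python sketch. This is not the authors' algorithm: the abstract does not state the seven correlation rules, so the scoring formula, the level weights in `WEIGHTS`, the splitting regexes, and all function names below are assumptions chosen only to show the general shape of the computation.

```python
import re
from collections import Counter

# Assumed weights: smaller units count as tighter "local windows".
# These values are illustrative and do not come from the paper.
WEIGHTS = {"clause": 3.0, "sentence": 2.0, "paragraph": 1.0}


def decompose(text):
    """Split plain text into paragraph, sentence and clause sets."""
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    sentences = [
        s.strip()
        for p in paragraphs
        for s in re.split(r"(?<=[.!?])\s+", p)
        if s.strip()
    ]
    clauses = [
        c.strip()
        for s in sentences
        for c in re.split(r"[,;:]", s)
        if c.strip()
    ]
    return {"clause": clauses, "sentence": sentences, "paragraph": paragraphs}


def tokens(unit):
    """Lowercase alphanumeric tokens of a text unit."""
    return re.findall(r"[a-z0-9]+", unit.lower())


def unit_score(unit, label, context):
    """Label term frequency in the unit, boosted by co-occurring context terms."""
    words = tokens(unit)
    if not words:
        return 0.0
    counts = Counter(words)
    label_tf = counts[label.lower()] / len(words)
    if label_tf == 0:
        return 0.0
    context_terms = {t.lower() for t in context}
    if not context_terms:
        return label_tf
    shared = sum(1 for t in context_terms if counts[t] > 0)
    return label_tf * (1 + shared / len(context_terms))


def relevance(text, label, context):
    """Weighted average of unit scores over the three decomposition levels."""
    total = 0.0
    for level, units in decompose(text).items():
        if units:
            mean = sum(unit_score(u, label, context) for u in units) / len(units)
            total += WEIGHTS[level] * mean
    return total


doc = (
    "An ontology defines concepts and relations. "
    "Semantic annotation links text to the ontology, using its context.\n\n"
    "The weather was sunny."
)
# "concepts" and "relations" stand in for context terms that would be
# looked up in an ontology knowledge base (hypothetical values).
print(relevance(doc, "ontology", ["concepts", "relations"]))
print(relevance(doc, "banana", ["concepts", "relations"]))
```

In this sketch a label that appears together with its context terms inside the same clause or sentence scores higher than one that appears alone, which is the intuition behind combining term frequency with local co-occurrence; a label that never occurs scores zero.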



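Memo:
Author biography: Chen Yewang (born 1978), male, PhD candidate; Zhao Wenyun (corresponding author), male, professor, doctoral supervisor, wyzhao@fudan.edu.cn.
Funding: National High Technology Research and Development Program of China (863 Program) (No. 2007AA01Z179).
Citation format: Chen Yewang, Li Wen, Peng Xin, et al. Improved semantic annotation method for documents based on ontology[J]. Journal of Southeast University (Natural Science Edition), 2009, 39(6): 1109-1113. [doi:10.3969/j.issn.1001-0505.2009.06.005]
Last Update: 2009-11-20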