[1] Gong Lejun, Wei Youbing, Xie Jianming, et al. Text mining approach for relationships between genes and diseases[J]. Journal of Southeast University (Natural Science Edition), 2010, 40(3): 486-490. [doi:10.3969/j.issn.1001-0505.2010.03.010]

Text mining approach for relationships between genes and diseases

Journal of Southeast University (Natural Science Edition) [ISSN: 1001-0505 / CN: 32-1178/N]

Volume:
40
Issue:
No. 3, 2010
Pages:
486-490
Section:
Computer Science and Engineering
Publication date:
2010-05-20

Article Info

Title:
Text mining approach for relationships between genes and diseases
Author(s):
Gong Lejun (龚乐君) 1,2; Wei Youbing (韦有兵) 1; Xie Jianming (谢建明) 1; Yuan Zhidong (袁志栋) 1; Sun Xiao (孙啸) 1
1 State Key Laboratory of Bioelectronics, Southeast University, Nanjing 210096, China
2 School of Computer Engineering, Huaiyin Institute of Technology, Huaian 223003, China
Keywords:
biomedicine; text mining; relation extraction; entity recognition
CLC number:
TP391
DOI:
10.3969/j.issn.1001-0505.2010.03.010
Abstract:
A text mining approach is designed to automatically extract gene-disease and gene-gene relationships by combining pattern matching, biomedical ontologies, and co-occurrence techniques, and a system capable of processing large-scale text data is developed. The system extracts gene entities related to diseases, mines gene-disease and gene-gene relationships, and measures the relevance between gene and disease entities. It also provides network visualization tools for analyzing these relationships. Experimental results on the test datasets show an F-score of 83.0% for extracting gene-disease relationships and 78.5% for extracting gene-gene relationships. The system has been successfully applied to research on breast cancer and related genes.
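The abstract names co-occurrence as one ingredient of the approach but gives no implementation details. As an illustration only, and not the authors' system, the following minimal sketch counts sentence-level co-occurrences of gene and disease terms using dictionary lookup. The term lists, the function names, and the naive sentence splitter are all assumptions made for this example. The `f_score` helper assumes the reported F-score is the balanced F1 measure, the harmonic mean of precision and recall.

```python
import re
from collections import Counter
from itertools import product

# Hypothetical term dictionaries for illustration; a real system would
# draw them from curated nomenclature and ontology resources.
GENES = {"BRCA1", "BRCA2", "TP53"}
DISEASES = {"breast cancer", "ovarian cancer"}


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def find_terms(sentence, terms, ignore_case):
    """Return the dictionary terms that occur as whole words in the sentence."""
    flags = re.IGNORECASE if ignore_case else 0
    found = set()
    for term in terms:
        if re.search(r"\b" + re.escape(term) + r"\b", sentence, flags):
            found.add(term)
    return found


def cooccurrence_counts(text):
    """Count how many sentences mention each (gene, disease) pair together."""
    counts = Counter()
    for sentence in split_sentences(text):
        genes = find_terms(sentence, GENES, ignore_case=False)
        diseases = find_terms(sentence, DISEASES, ignore_case=True)
        for pair in product(sorted(genes), sorted(diseases)):
            counts[pair] += 1
    return counts


def f_score(precision, recall):
    """Balanced F1: harmonic mean of precision and recall (assumed measure)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    sample = (
        "BRCA1 and BRCA2 mutations raise the risk of breast cancer. "
        "TP53 is also mutated in breast cancer. "
        "BRCA1 is linked to ovarian cancer."
    )
    for (gene, disease), n in sorted(cooccurrence_counts(sample).items()):
        print(gene, disease, n)
```

Raw co-occurrence counts like these are only a starting signal; the abstract indicates that the described system additionally relies on pattern matching and biomedical ontologies, which this sketch does not attempt to reproduce.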



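Memo:
Biographies: Gong Lejun (born 1978), female, Ph.D. candidate; Sun Xiao (corresponding author), male, Ph.D., professor, doctoral supervisor, xsun@seu.edu.cn.
Foundation item: National Natural Science Foundation of China (No. 60771024).
Citation: Gong Lejun, Wei Youbing, Xie Jianming, et al. Text mining approach for relationships between genes and diseases[J]. Journal of Southeast University (Natural Science Edition), 2010, 40(3): 486-490. [doi:10.3969/j.issn.1001-0505.2010.03.010]
Last Update: 2010-05-20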