DICP OpenIR
Subject: Analytical Chemistry
Feature Grouping Technique Based on Biclustering for the Analysis of LC-MS Metabolomic Data
Lin XH(林晓惠); Ruan Q(阮强); Zhou LN(周丽娜); Yin PY(尹沛源); Xu GW(许国旺)
Conference proceedings: Proceeding of HPLC 2011
Conference name: 37th International Symposium on High Performance Liquid Phase Separations and Related Techniques
Conference date: 2011-10-8
2011
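Conference location: Dalian
Pages: 492-0
Publisher: To be added
Place of publication: To be added
Presentation type: Poster
Department: 1808
Organizer: Chromatography Committee of the Chinese Chemical Society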
English abstract: Metabolomics has shown promising applications in many fields, such as disease diagnosis and drug development. Liquid chromatography-mass spectrometry (LC-MS) is one of its main analytical techniques. HPLC-MS metabolomic data are usually high-dimensional. Among the many features, some are related to each other and carry similar information about the problem. Grouping the features correctly helps in understanding the problem under study and in building a more efficient classification model. In this work, an LC-MS dataset was obtained containing serum specimens from 30 normal controls, 30 hepatitis patients (H), 30 cirrhosis patients (C) and 30 liver cancer patients (T). To distinguish the different kinds of liver disease, we proposed an ensemble classification method based on feature grouping by the biclustering [1] technique (EC-BicFG). For each base classifier, the feature subspace is generated according to the group ranking. Naive Bayes (NB) and 5-nearest-neighbor (5NN) were each adopted as the base classifier. In addition to discriminating between controls and patients, we also conducted experiments to distinguish among the three liver diseases and between each pair of liver diseases. The corresponding accuracy rates are listed in Table 1. They show that our method outperforms EGSG [2], which is also an ensemble classification algorithm based on feature grouping.

Table 1 The LOOCV classification accuracy rates

                     NB (%)                       5NN (%)
                     EGSG          EC-BicFG       EGSG          EC-BicFG
H vs. C              80.50(2.61)   87.67(1.61)    71.99(5.49)   84.50(1.77)
H vs. T              76.50(5.41)   86.33(2.58)    60.00(5.21)   77.67(1.41)
C vs. T              76.50(4.93)   81.00(2.63)    54.00(3.78)   83.83(1.24)
H vs. C vs. T        62.33(5.84)   73.11(0.47)    54.33(3.41)   72.11(1.77)
Control vs. Model    95.42(1.85)   96.83(1.61)    89.75(2.69)   97.42(1.07)

REFERENCE
[1] Yizong Cheng, George M. Church. Biclustering of expression data. In Proceedings of the 8th International Conference on Intelligent Systems for Molecular Biology (ISMB) (2000) 93–103.
[2] Huawen Liu, Lei Liu, Huijie Zhang. Ensemble gene selection by grouping for microarray data classification. Journal of Biomedical Informatics 43 (2010) 81–87.
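The abstract does not say how EC-BicFG scores or ranks feature groups, so the following is only a minimal sketch of the coherence measure used by the biclustering method cited as [1] (the Cheng and Church mean squared residue), not the authors' implementation. The function name `mean_squared_residue` and the toy matrix are assumptions made up for illustration; a residue near zero means the selected features vary together across the selected samples, which is the sense in which they "contain similar information".

```python
def mean_squared_residue(matrix, rows, cols):
    """Cheng-Church mean squared residue of the submatrix matrix[rows][cols]."""
    n_r, n_c = len(rows), len(cols)
    # Row means, column means, and overall mean of the selected submatrix.
    row_mean = {i: sum(matrix[i][j] for j in cols) / n_c for i in rows}
    col_mean = {j: sum(matrix[i][j] for i in rows) / n_r for j in cols}
    all_mean = sum(row_mean[i] for i in rows) / n_r
    total = 0.0
    for i in rows:
        for j in cols:
            # Residue: what is left after removing additive row and column effects.
            residue = matrix[i][j] - row_mean[i] - col_mean[j] + all_mean
            total += residue ** 2
    return total / (n_r * n_c)


# Toy intensity matrix (made-up numbers): rows = samples, columns = features.
toy = [
    [1.0, 2.0, 9.0],
    [2.0, 3.0, 1.0],
    [4.0, 5.0, 7.0],
]

# Features 0 and 1 differ by a constant shift, so they form a coherent group.
coherent = mean_squared_residue(toy, rows=[0, 1, 2], cols=[0, 1])
# Adding the unrelated feature 2 raises the residue.
whole = mean_squared_residue(toy, rows=[0, 1, 2], cols=[0, 1, 2])
print(coherent, whole)
```

In a grouping pipeline, a low-residue set of columns would be treated as one feature group; how such groups are then ranked and sampled into subspaces for the NB or 5NN base classifiers is specific to EC-BicFG and is not described in this record.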
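Document type: Conference paper
Item identifier: http://cas-ir.dicp.ac.cn/handle/321008/116075
Collection: Dalian Institute of Chemical Physics, Chinese Academy of Sciences
Corresponding author: Xu GW(许国旺)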
Recommended citation
GB/T 7714
Lin XH, Ruan Q, Zhou LN, et al. Feature Grouping Technique Based on Biclustering for the Analysis of LC-MS Metabolomic Data[C]. To be added: To be added, 2011: 492-0.