DICP OpenIR
Subject Area: Analytical Chemistry
Feature Grouping Technique Based on Biclustering for the Analysis of LC-MS Metabolomic Data
Lin XH(林晓惠); Ruan Q(阮强); Zhou LN(周丽娜); Yin PY(尹沛源); Xu GW(许国旺)
Source Publication: Proceeding of HPLC 2011
Conference Name: 37th International Symposium on High Performance Liquid Phase Separations and Related Techniques
Conference Date: 2011-10-8
2011
Conference Place: Dalian
Pages: 492-0
Publisher: To be added
Publication Place: To be added
Cooperation Status: Poster
Department: 1808
Funding Organization: Chromatography Committee of the Chinese Chemical Society
Abstract: Metabolomics has shown promise in many fields, such as disease diagnosis and drug development. Liquid chromatography-mass spectrometry (LC-MS) is one of its main analytical techniques. LC-MS metabolomic data are usually high-dimensional. Among the many features, some are related to each other and carry similar information about the problem. Grouping the features correctly helps in understanding the problem studied and in building a more efficient classification model. In this work, an LC-MS dataset was obtained from serum specimens of 30 normal controls, 30 hepatitis patients (H), 30 cirrhosis patients (C) and 30 liver cancer patients (T). To distinguish the different kinds of liver disease, we propose an ensemble classification method based on feature grouping by the biclustering technique [1] (EC-BicFG). For each base classifier, the feature subspace is generated according to the group ranking. Naive Bayes (NB) and 5-nearest-neighbor (5NN) are each adopted as the base classifier. In addition to discriminating between controls and patients, we also conducted experiments to distinguish among the three liver diseases and between each pair of them. The corresponding accuracy rates are listed in Table 1. They show that our method outperforms EGSG [2], which is also an ensemble classification algorithm based on feature grouping.

Table 1. The LOOCV classification accuracy rates (%)

Comparison          NB: EGSG     NB: EC-BicFG  5NN: EGSG    5NN: EC-BicFG
H vs. C             80.50(2.61)  87.67(1.61)   71.99(5.49)  84.50(1.77)
H vs. T             76.50(5.41)  86.33(2.58)   60.00(5.21)  77.67(1.41)
C vs. T             76.50(4.93)  81.00(2.63)   54.00(3.78)  83.83(1.24)
H vs. C vs. T       62.33(5.84)  73.11(0.47)   54.33(3.41)  72.11(1.77)
Control vs. Model   95.42(1.85)  96.83(1.61)   89.75(2.69)  97.42(1.07)

REFERENCE
[1] Yizong Cheng, George M. Church. Biclustering of expression data. In Proceedings of the 8th International Conference on Intelligent Systems for Molecular Biology (ISMB) (2000) 93–103.
[2] Huawen Liu, Lei Liu, Huijie Zhang. Ensemble gene selection by grouping for microarray data classification. Journal of Biomedical Informatics 43 (2010) 81–87.
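The evaluation scheme described in the abstract (an ensemble of base classifiers, each trained on a feature subspace drawn from feature groups, scored by leave-one-out cross-validation) can be illustrated with a minimal sketch. This is not the authors' EC-BicFG implementation. The feature groups are assumed to be given as input rather than computed by biclustering, each ensemble member draws one random feature per group (the abstract says only that subspaces follow "the group ranking", without details), and every function name, parameter and the toy data are hypothetical.

```python
import random
from collections import Counter


def knn_predict(train_X, train_y, x, k=5):
    """Return the majority label among the k nearest training rows."""
    order = sorted(
        range(len(train_X)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_X[i], x)),
    )
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


def ensemble_predict(train_X, train_y, x, groups, n_members=11, k=5, seed=0):
    """Majority vote of k-NN members, each using one feature per group.

    Assumption: `groups` is a list of lists of column indices produced by
    some earlier grouping step; drawing one random feature per group is an
    illustrative stand-in for the ranking-based selection in the abstract.
    """
    rng = random.Random(seed)
    votes = Counter()
    for _ in range(n_members):
        cols = [rng.choice(group) for group in groups]
        sub_train = [[row[c] for c in cols] for row in train_X]
        sub_x = [x[c] for c in cols]
        votes[knn_predict(sub_train, train_y, sub_x, k)] += 1
    return votes.most_common(1)[0][0]


def loocv_accuracy(X, y, groups, **kwargs):
    """Leave-one-out cross-validation: hold out each sample once."""
    hits = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        if ensemble_predict(train_X, train_y, X[i], groups, **kwargs) == y[i]:
            hits += 1
    return hits / len(X)


def make_toy_data(n_per_class=8):
    """Synthetic, well-separated two-class data (not the serum dataset)."""
    X, y = [], []
    for label, base in (("H", 0.0), ("C", 10.0)):
        for j in range(n_per_class):
            X.append([base + 0.1 * j + 0.01 * f for f in range(6)])
            y.append(label)
    return X, y


if __name__ == "__main__":
    X, y = make_toy_data()
    groups = [[0, 1], [2, 3], [4, 5]]  # hypothetical feature groups
    print(loocv_accuracy(X, y, groups))
```

Swapping `knn_predict` for a Naive Bayes classifier would give the other base-classifier configuration reported in Table 1. The toy data only exercise the code path and say nothing about the accuracies in the table.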
Document Type: Conference paper
Identifier: http://cas-ir.dicp.ac.cn/handle/321008/116075
Collection: Dalian Institute of Chemical Physics, Chinese Academy of Sciences
Corresponding Author: Xu GW(许国旺)
Recommended Citation
GB/T 7714
Lin XH, Ruan Q, Zhou LN, et al. Feature Grouping Technique Based on Biclustering for the Analysis of LC-MS Metabolomic Data[C]. To be added: To be added, 2011: 492-0.
Files in This Item:
There are no files associated with this item.
Items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.