DICP OpenIR
Subject: Analytical Chemistry
A Simple Two-Level Bayesian Network and its Application in the HPLC-MS Data
PingMa; Lin XH(林晓惠); Yin PY(尹沛源); Xu GW(许国旺)
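Proceedings: Proceeding of HPLC 2011
Conference name: 37th International Symposium on High Performance Liquid Phase Separations and Related Techniques
Conference date: 2011-10-8
2011
Conference location: Dalian
Pages: 497-0
Publisher: to be added
Place of publication: to be added
Presentation type: Poster
Department: 1808
Organizer: Chromatography Committee of the Chinese Chemical Society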
Abstract: A Bayesian network (BN) [1] is a simple and efficient classifier with applications in many fields. Its structure learning is difficult, however, especially for high-dimensional data such as liquid chromatography-mass spectrometry (LC-MS) data, because the number of possible structures grows exponentially with the number of variables. To process LC-MS data effectively, we applied an efficient BN structure learning method that constructs a two-level BN (BN-TwoL). BN-TwoL sets the class label C as the root of the network. Variables that are conditionally independent of each other lie in different sub-trees of the root. The sub-trees are generated one after another. First, the variable most strongly related to the class label in the current candidate feature set AF is selected as a child node of the root. Then, in line with the conditional independence assumption, the ion features that are strongly conditionally dependent on it given the class label are chosen from AF as its descendants. Once an ion feature is added to the network, it is removed from AF. The procedure is repeated to construct the remaining sub-trees until AF is empty. Two metabonomics LC-MS datasets on liver disease were analyzed with BN-TwoL. One contains samples from rats (20 hepatitis (H), 20 hepatocellular carcinoma (C), and 20 liver cirrhosis (T) samples); the other contains samples from humans (26 H, 28 C, and 45 T). Ten-fold cross-validation was run 100 times. The average accuracies for distinguishing the three liver diseases in the two datasets were 84.05±2.73% and 88.79±1.52%, respectively. In addition, for pairwise discrimination between the liver diseases, BN-TwoL outperformed naïve Bayes, BN-K2 (which uses the K2 search technique to find an optimal Bayesian network structure in the space of possible structures), and the BN proposed in [2] in most cases. The method can also identify groups of ion variables related to the problem, which is meaningful for metabonomics studies.
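The greedy sub-tree construction described in the abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: the abstract does not name the dependence measure, so the sketch assumes empirical mutual information for "related to the class label" and conditional mutual information for "conditionally dependent given the class label". The function names, the `threshold` parameter, the flat attachment of descendants under each child, and the toy data are all hypothetical. Features are assumed to be already discretized.

```python
import math
from collections import Counter

def mutual_info(x, y):
    """Empirical mutual information I(X;Y) in nats for two discrete sequences."""
    n = len(x)
    pxy, px, py = Counter(zip(x, y)), Counter(x), Counter(y)
    return sum((c / n) * math.log(c * n / (px[a] * py[b]))
               for (a, b), c in pxy.items())

def cond_mutual_info(x, y, labels):
    """Empirical conditional mutual information I(X;Y|C), averaged over classes."""
    n = len(labels)
    total = 0.0
    for label, count in Counter(labels).items():
        idx = [i for i in range(n) if labels[i] == label]
        total += (count / n) * mutual_info([x[i] for i in idx],
                                           [y[i] for i in idx])
    return total

def build_two_level_bn(features, labels, threshold=0.1):
    """Greedy sub-tree construction (assumed scoring; see lead-in).

    features: dict mapping feature name -> list of discrete values.
    Returns a list of (child_of_root, descendants) pairs.
    """
    af = set(features)  # candidate feature set AF
    subtrees = []
    while af:
        # Pick the candidate most related to the class label (ties: name order).
        child = max(sorted(af), key=lambda f: mutual_info(features[f], labels))
        af.remove(child)
        # Candidates dependent on the child given the class become descendants.
        desc = [f for f in sorted(af)
                if cond_mutual_info(features[child], features[f], labels) > threshold]
        af.difference_update(desc)  # each feature enters the network once
        subtrees.append((child, desc))
    return subtrees

# Toy data (hypothetical): 8 samples per class.
labels = [0] * 8 + [1] * 8
features = {
    "a": [0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0],  # tracks the label
    "b": [0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0],  # copy of "a"
    "c": [0, 0, 0, 1, 1, 1, 0, 1, 0, 0, 0, 1, 1, 1, 0, 1],  # independent of both
}
subtrees = build_two_level_bn(features, labels)
print(subtrees)  # [('a', ['b']), ('c', [])]
```

In this toy run, "b" joins the sub-tree of "a" because the two are dependent within each class, while "c" is independent of "a" given the class and therefore starts its own sub-tree. The loop terminates when AF is empty, as in the abstract.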
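Document type: Conference paper
Item identifier: http://cas-ir.dicp.ac.cn/handle/321008/116077
Collection: Dalian Institute of Chemical Physics, Chinese Academy of Sciences
Corresponding author: Xu GW(许国旺)
Recommended citation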
GB/T 7714
PingMa, Lin XH, Yin PY, et al. A Simple Two-Level Bayesian Network and its Application in the HPLC-MS Data[C]. To be added: to be added, 2011: 497-0.