Institutional Repository of the Dalian Institute of Chemical Physics, Chinese Academy of Sciences
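Subject: Analytical chemistry
Title: A Simple Two-Level Bayesian Network and its Application in the HPLC-MS Data
Authors: Ping Ma; Lin XH (林晓惠); Yin PY (尹沛源); Xu GW (许国旺)
Proceedings: Proceeding of HPLC 2011
Conference name: 37th International Symposium on High Performance Liquid Phase Separations and Related Techniques
Conference date: 2011-10-8
Publication date: 2011
Conference location: Dalian
Corresponding author: Xu GW (许国旺)
Publisher: to be added
Place of publication: to be added
Presentation type: Poster
Department: 1808
Organizer: Chromatography Committee of the Chinese Chemical Society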
Abstract: The Bayesian network (BN) [1] is a simple and efficient classifier with wide application in many fields. However, learning its structure is difficult, especially for high-dimensional data such as liquid chromatography-mass spectrometry (LC-MS) data, because the number of possible structures grows exponentially with the number of variables. To process LC-MS data effectively, we applied an efficient BN structure learning method that constructs a two-level BN (BN-TwoL). BN-TwoL sets the class label C as the root of the network. Variables that are conditionally independent of each other lie in different sub-trees of the root. The sub-trees are generated one after another. First, the variable most strongly related to the class label in the current candidate feature set AF is selected as a child node of the root. Then, following the conditional independence assumption, the ion features in AF that are strongly conditionally dependent on it given the class label are chosen as its descendants. Once an ion feature is added to the network, it is removed from AF. The procedure is repeated to construct the remaining sub-trees until AF is empty. Two metabonomics LC-MS datasets on liver disease were analyzed with BN-TwoL. One comprises rat samples (20 hepatitis (H), 20 hepatocellular carcinoma (C) and 20 liver cirrhosis (T) samples); the other comprises human samples (26 H, 28 C and 45 T). Ten-fold cross-validation was run 100 times. The average accuracy rates for distinguishing the three liver diseases were 84.05 ± 2.73% and 88.79 ± 1.52% for the two datasets, respectively. In addition, for pairwise discrimination between the liver diseases, BN-TwoL outperformed the naïve Bayes classifier, BN-K2 (which uses the K2 search technique to look for an optimal BN structure in the space of possible structures) and the BN proposed in [2] in most cases.
The method can also identify groups of ion variables related to the problem, which is meaningful for metabonomics studies.
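The sub-tree construction described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how "related" or "conditionally related" is measured, so the sketch assumes empirical mutual information I(X; C) and conditional mutual information I(X; Y | C) on discretized features, plus a hypothetical cut-off `threshold`. The function names and the toy data are likewise illustrative only.

```python
from collections import Counter
from math import log


def mutual_info(x, c):
    """Empirical mutual information I(X; C) in nats for two discrete sequences."""
    n = len(x)
    px, pc, pxc = Counter(x), Counter(c), Counter(zip(x, c))
    return sum((k / n) * log(k * n / (px[a] * pc[b])) for (a, b), k in pxc.items())


def cond_mutual_info(x, y, c):
    """Empirical conditional mutual information I(X; Y | C)."""
    n = len(c)
    total = 0.0
    for label, count in Counter(c).items():
        idx = [i for i in range(n) if c[i] == label]
        # Weight the within-class mutual information by the class frequency.
        total += (count / n) * mutual_info([x[i] for i in idx], [y[i] for i in idx])
    return total


def build_two_level(features, labels, threshold=0.1):
    """Greedy sub-tree construction in the spirit of the abstract.

    features: dict mapping feature name -> list of discrete values.
    labels:   list of class labels (the root C).
    threshold: assumed cut-off on I(head; feature | C); not from the source.
    Returns a dict mapping each child of the root to its list of descendants.
    """
    candidates = set(features)  # the candidate feature set AF
    subtrees = {}
    while candidates:
        # Pick the candidate most related to the class label as a child of the root.
        head = max(sorted(candidates), key=lambda f: mutual_info(features[f], labels))
        candidates.remove(head)
        # Attach the candidates that depend strongly on the head given the class.
        members = [f for f in sorted(candidates)
                   if cond_mutual_info(features[head], features[f], labels) > threshold]
        candidates.difference_update(members)  # selected features leave AF
        subtrees[head] = members
    return subtrees


# Toy data (made up for illustration): "d" tracks the class, "b" duplicates "a".
toy_labels = [0, 0, 0, 0, 1, 1, 1, 1]
toy_features = {
    "a": [0, 0, 0, 1, 1, 1, 1, 0],
    "b": [0, 0, 0, 1, 1, 1, 1, 0],
    "d": [0, 0, 0, 0, 1, 1, 1, 1],
}
toy_structure = build_two_level(toy_features, toy_labels)
```

In the toy data, "d" becomes its own sub-tree because it is the most class-related feature and nothing depends on it once the class is known, while "b" is attached under "a". Real LC-MS intensities are continuous, so a discretization step (or a different dependence measure) would be needed before a sketch like this could be applied.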
Content type: Conference paper
URI: http://cas-ir.dicp.ac.cn/handle/321008/116077
Appears in Collections: Dalian Institute of Chemical Physics, Chinese Academy of Sciences: Conference Papers

Recommended Citation:
Ping Ma, Lin XH, Yin PY, et al. A Simple Two-Level Bayesian Network and its Application in the HPLC-MS Data [C]. In: 37th International Symposium on High Performance Liquid Phase Separations and Related Techniques. Dalian. 2011-10-8.