Subject:  Analytical Chemistry

Title:  A Simple Two-Level Bayesian Network and its Application in the HPLC-MS Data
Authors:  PingMa; Lin XH (林晓惠); Yin PY (尹沛源); Xu GW (许国旺)

Conference proceedings:  Proceeding of HPLC 2011

Conference name:  37th International Symposium on High Performance Liquid Phase Separations and Related Techniques

Conference date:  2011-10-8

Publication date:  2011

Conference location:  Dalian

Corresponding author:  Xu GW (许国旺)

Publisher:  To be added

Place of publication:  To be added

Presentation type:  Poster

Department:  1808

Organizer:  Chromatography Committee of the Chinese Chemical Society (中国化学会色谱专业委员会)

Abstract:  A Bayesian network (BN) [1] is a simple and efficient classifier with wide
application in many fields. Its structure learning is difficult, however, especially for
high-dimensional data such as liquid chromatography-mass spectrometry (LC-MS) data,
because the number of possible structures grows exponentially with the number of
variables.
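As a rough illustration of that growth, the number of labelled directed acyclic graphs on n nodes can be computed with Robinson's recurrence. This is a standard combinatorial result rather than something stated in the abstract, and the function name `count_dags` is our own choice. For unrestricted structures the count in fact grows faster than exponentially.

```python
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def count_dags(n: int) -> int:
    """Number of labelled DAGs on n nodes, via Robinson's recurrence."""
    if n == 0:
        return 1
    # Inclusion-exclusion over the k nodes that have no incoming edge.
    return sum(
        (-1) ** (k + 1) * comb(n, k) * 2 ** (k * (n - k)) * count_dags(n - k)
        for k in range(1, n + 1)
    )


if __name__ == "__main__":
    for n in range(1, 8):
        print(n, count_dags(n))
```

Five variables already admit 29281 candidate structures, so exhaustive search is out of reach long before the thousands of ion features typical of LC-MS data.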
To process LC-MS data effectively, we applied an efficient BN structure learning
method that constructs a two-level BN (BNTwoL). BNTwoL sets the class label C
as the root of the network. Variables that are conditionally independent of each
other lie in different subtrees of the root. The subtrees of the root are generated
sequentially. First, the variable most strongly related to the class label in the
current candidate feature set AF is selected as a child node of the root. Then, in line
with the conditional independence assumption, the ion features that are strongly
dependent on it given the class label are chosen from AF as its descendants. Once an
ion feature has been added to the network, it is removed from AF. The procedure is
repeated to construct the remaining subtrees until AF is empty.
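The greedy procedure above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not say how relatedness is measured, so the sketch assumes discrete features, uses empirical mutual information for relevance to the class and conditional mutual information for dependence given the class, and introduces a hypothetical cut-off `cmi_threshold`. It also records only which features fall under each child of the root, not any finer structure inside a subtree.

```python
from collections import Counter
from math import log


def mutual_info(x, y):
    """Empirical mutual information I(X;Y) in nats for two discrete sequences."""
    n = len(x)
    pxy, px, py = Counter(zip(x, y)), Counter(x), Counter(y)
    return sum(c / n * log(c * n / (px[a] * py[b])) for (a, b), c in pxy.items())


def cond_mutual_info(x, y, c):
    """Empirical conditional mutual information I(X;Y|C) in nats."""
    n = len(c)
    total = 0.0
    for label, count in Counter(c).items():
        idx = [i for i, v in enumerate(c) if v == label]
        total += count / n * mutual_info([x[i] for i in idx], [y[i] for i in idx])
    return total


def build_two_level(features, labels, cmi_threshold=0.1):
    """Greedy sketch: returns {child_of_root: [features in its subtree]}."""
    af = set(features)  # candidate feature set AF
    subtrees = {}
    while af:
        # The remaining feature most related to the class becomes a child of the root.
        head = max(sorted(af), key=lambda f: mutual_info(features[f], labels))
        af.remove(head)
        # Features strongly dependent on it given the class join its subtree.
        members = [
            f for f in sorted(af)
            if cond_mutual_info(features[f], features[head], labels) > cmi_threshold
        ]
        af.difference_update(members)
        subtrees[head] = members
    return subtrees


# Synthetic toy data (not from the paper): "b" tracks "a" within each class.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
features = {
    "a": [0, 0, 0, 1, 1, 1, 1, 0],
    "b": [0, 0, 0, 1, 0, 0, 0, 1],
    "c": [0, 1, 0, 0, 0, 1, 0, 0],
}
tree = build_two_level(features, labels, cmi_threshold=0.2)
print(tree)
```

In the toy data, "b" carries no information about the class on its own but is fully determined by "a" once the class is known, so it lands in the subtree of "a", while "c" starts a subtree of its own.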
Two metabonomics LC-MS datasets on liver disease were analyzed with BNTwoL.
One comprises samples from rats (20 hepatitis (H) samples, 20 hepatocellular
carcinoma (C) samples and 20 liver cirrhosis (T) samples); the other comprises
samples from humans (26 H, 28 C and 45 T). Ten-fold cross-validation was run 100
times. The average accuracy rates for distinguishing the three liver diseases in the two
datasets were 84.05±2.73% and 88.79±1.52%, respectively. In addition, when
distinguishing between each pair of liver diseases, BNTwoL outperforms naïve Bayes,
BNK2 (which uses the K2 search technique to find an optimal Bayesian network
structure in the space of possible structures) and the BN proposed in [2] in
most cases. The method can also identify groups of ion variables related to
the problem, which is meaningful for metabonomics studies.
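The evaluation protocol (10-fold cross-validation repeated many times, reported as mean ± spread) can be sketched as below. Several details are assumptions, since the abstract does not give them: folds are formed by plain shuffling without stratification, the spread is the standard deviation across repeats, and the toy `nearest_mean` classifier on synthetic one-dimensional data merely stands in for BNTwoL. All function names are hypothetical.

```python
import random
from statistics import mean, stdev


def repeated_kfold_accuracy(fit_predict, xs, ys, k=10, repeats=100, seed=0):
    """Mean and standard deviation of accuracy over `repeats` shuffled k-fold runs."""
    rng = random.Random(seed)
    n = len(ys)
    run_accuracies = []
    for _ in range(repeats):
        order = list(range(n))
        rng.shuffle(order)
        folds = [order[i::k] for i in range(k)]  # k disjoint folds covering all samples
        correct = 0
        for fold in folds:
            held_out = set(fold)
            train = [i for i in order if i not in held_out]
            preds = fit_predict(
                [xs[i] for i in train], [ys[i] for i in train], [xs[i] for i in fold]
            )
            correct += sum(p == ys[i] for p, i in zip(preds, fold))
        run_accuracies.append(correct / n)
    return mean(run_accuracies), stdev(run_accuracies)


def nearest_mean(train_x, train_y, test_x):
    """Toy 1-D classifier used only to exercise the loop."""
    centers = {
        c: mean(x for x, y in zip(train_x, train_y) if y == c) for c in set(train_y)
    }
    return [min(centers, key=lambda c: abs(x - centers[c])) for x in test_x]


# Synthetic, well-separated data; the labels only echo the abstract's naming.
xs = [0.1 * i for i in range(30)] + [10 + 0.1 * i for i in range(30)]
ys = ["H"] * 30 + ["C"] * 30
avg, sd = repeated_kfold_accuracy(nearest_mean, xs, ys, k=10, repeats=5)
print(f"{avg:.2%} ± {sd:.2%}")
```

With small, unbalanced classes such as those in the human dataset, a stratified split would be a reasonable variation so that every fold contains all three classes.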
Content type:  Conference paper

URI:  http://casir.dicp.ac.cn/handle/321008/116077

Appears in Collections:  Dalian Institute of Chemical Physics, Chinese Academy of Sciences (中国科学院大连化学物理研究所)_Conference Papers

There are no files associated with this item.

Recommended Citation:
PingMa, Lin XH, Yin PY, et al. A Simple Two-Level Bayesian Network and its Application in the HPLC-MS Data[C]. In: 37th International Symposium on High Performance Liquid Phase Separations and Related Techniques. Dalian. 2011-10-8.


