Knowledge Commons of Dalian Institute of Chemical Physics, CAS

Subject | Analytical Chemistry |

A Simple Two-Level Bayesian Network and its Application in the HPLC-MS Data | |

PingMa; Lin XH(林晓惠); Yin PY(尹沛源); Xu GW(许国旺) | |

Conference proceedings | Proceeding of HPLC 2011 |

Conference name | 37th International Symposium on High Performance Liquid Phase Separations and Related Techniques |

Conference date | 2011-10-8 |

2011 | |

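Conference location | Dalian |

Pages | 497-0 |

Publisher | To be added |

Place of publication | To be added |

Collaboration type | Poster |

Department | 1808 |

Organizer | Chromatography Committee of the Chinese Chemical Society |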

Abstract | A Bayesian network (BN) [1] is a simple and efficient classifier with wide application in many fields. However, learning its structure is difficult, especially for high-dimensional data such as liquid chromatography-mass spectrometry (LC-MS) data, because the number of possible structures grows exponentially with the number of variables. To process LC-MS data effectively, we applied an efficient BN structure learning method that constructs a two-level BN (BN-TwoL). BN-TwoL sets the class label C as the root of the network. Variables that are conditionally independent of each other lie in different sub-trees of the root. The sub-trees of the root are generated one after another. First, the variable most strongly related to the class label in the current candidate feature set AF is selected as a child node of the root. According to the conditional independence assumption, the ion features that are strongly conditionally dependent on it, given the class label, are chosen from AF to be its descendants. Once an ion feature is added to the network, it is removed from AF. The procedure is repeated to construct the remaining sub-trees until AF is empty. Two metabonomics LC-MS datasets on liver disease were analyzed with BN-TwoL. One contains samples from rats (20 hepatitis (H) samples, 20 hepatocellular carcinoma (C) samples and 20 liver cirrhosis (T) samples); the other contains samples from humans (26 H, 28 C and 45 T). Ten-fold cross-validation was run 100 times. The average accuracy rates for distinguishing the three liver diseases in the two datasets were 84.05±2.73% and 88.79±1.52%, respectively. In addition, when distinguishing between each pair of liver diseases, BN-TwoL outperforms the naïve Bayes classifier, BN-K2 (which uses the K2 search technique to find an optimal Bayesian network structure in the space of possible structures) and the BN proposed in [2] in most cases. The method can also find groups of ion variables related to the problem, which is meaningful for metabonomics studies. |
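The greedy construction described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not name the relevance or dependence measures, so this sketch assumes empirical mutual information for "most related to the class label" and conditional mutual information with a cutoff for "conditionally dependent given the class label". The function names, the `threshold` parameter, and the requirement that features are already discretized are all assumptions made for the example.

```python
from collections import Counter
from math import log


def mutual_info(x, y):
    """Empirical mutual information I(X;Y), in nats, of two discrete sequences."""
    n = len(x)
    joint, px, py = Counter(zip(x, y)), Counter(x), Counter(y)
    return sum((c / n) * log(c * n / (px[a] * py[b]))
               for (a, b), c in joint.items())


def cond_mutual_info(x, y, labels):
    """Empirical conditional mutual information I(X;Y|C): class-weighted average."""
    n = len(labels)
    total = 0.0
    for label, count in Counter(labels).items():
        idx = [i for i in range(n) if labels[i] == label]
        total += (count / n) * mutual_info([x[i] for i in idx],
                                           [y[i] for i in idx])
    return total


def build_two_level(features, labels, threshold=0.1):
    """Greedy sub-tree construction (assumed measures; see lead-in).

    features: dict mapping feature name -> list of discrete values.
    Returns a dict mapping each child of the root to its descendants.
    """
    candidates = set(features)          # the candidate feature set AF
    subtrees = {}
    while candidates:
        # Child of the root: the candidate most related to the class label.
        head = max(sorted(candidates),
                   key=lambda f: mutual_info(features[f], labels))
        candidates.remove(head)
        # Descendants: candidates dependent on the head given the class label.
        members = [f for f in sorted(candidates)
                   if cond_mutual_info(features[head], features[f], labels) > threshold]
        candidates.difference_update(members)   # selected features leave AF
        subtrees[head] = members
    return subtrees
```

Calling `build_two_level` with a dictionary of discretized ion features and the list of class labels returns the sub-trees in the order they were built. The cutoff value controls how many features are grouped under each child of the root and would need tuning on real data.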

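Document type | Conference paper |

Item identifier | http://cas-ir.dicp.ac.cn/handle/321008/116077 |

Collection | Dalian Institute of Chemical Physics, Chinese Academy of Sciences |

Corresponding author | Xu GW(许国旺) |

Recommended citation (GB/T 7714) | PingMa,Lin XH,Yin PY,et al. A Simple Two-Level Bayesian Network and its Application in the HPLC-MS Data[C]. To be added: To be added,2011:497-0. |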
