Learning a Markov Logic network for supervised gene regulatory network inference





BMC Bioinformatics, 14:273

Networks analysis

Abstract

Background: Gene regulatory network inference remains a challenging problem in systems biology despite the numerous approaches that have been proposed. When substantial knowledge of a gene regulatory network is already available, supervised network inference is appropriate. Such a method builds a binary classifier that assigns a class (Regulation or No regulation) to an ordered pair of genes. Once learnt, the pairwise classifier can be used to predict new regulations. In this work, we explore the framework of Markov Logic Networks (MLNs), which combine features of probabilistic graphical models with the expressivity of first-order logic rules.
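
For orientation, the standard MLN formulation can be written as below. It is not spelled out in this abstract, so the notation is an assumption: each first-order formula $F_i$ carries a real-valued weight $w_i$, and $n_i(x)$ counts the true groundings of $F_i$ in a possible world $x$.

```latex
P(X = x) = \frac{1}{Z} \exp\Big( \sum_{i} w_i \, n_i(x) \Big),
\qquad
Z = \sum_{x'} \exp\Big( \sum_{i} w_i \, n_i(x') \Big)
```

Under this reading, a rule that concludes on regulates(g1, g2) with a large positive weight makes worlds in which it holds more probable, without making violations impossible.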

Results: We propose to learn a Markov Logic network, i.e. a set of weighted rules that conclude on the predicate "regulates", starting from a known gene regulatory network involved in the proliferation/differentiation switch of keratinocyte cells, a set of experimental transcriptomic data and various descriptions of genes, all encoded in first-order logic. As the training data are unbalanced, we use asymmetric bagging to learn a set of MLNs. The prediction of a new regulation is then obtained by averaging the predictions of the individual MLNs. As a side contribution, we propose three in silico tests to assess the performance of any pairwise classifier in various network inference tasks on real datasets. The first test measures the average performance on a balanced edge prediction problem; the second deals with the ability of the classifier, once enhanced by asymmetric bagging, to update a given network. Finally, our main result concerns a third test that measures the ability of the method to predict regulations with a new set of genes. As expected, MLN, when provided with only numerical discretized gene expression data, does not perform as well as a pairwise SVM in terms of AUPR. However, when a more complete description of gene properties is provided by heterogeneous sources, MLN achieves the same performance as a black-box model such as a pairwise SVM while providing relevant insights into the predictions.
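
The asymmetric bagging step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the representation of gene pairs as numeric feature vectors, and the nearest-centroid base learner (used here in place of an MLN) are all assumptions made to keep the example self-contained.

```python
import random
from statistics import mean


def asymmetric_bags(positives, negatives, n_bags, seed=0):
    """Yield balanced training sets: every positive example plus an
    equal-sized random subsample of the (much larger) negative set."""
    rng = random.Random(seed)
    for _ in range(n_bags):
        sampled_negatives = rng.sample(negatives, len(positives))
        yield positives, sampled_negatives


def train_centroid_scorer(positives, negatives):
    """Hypothetical stand-in base learner (NOT an MLN): scores a vector
    by its relative closeness to the positive-class centroid."""
    def centroid(rows):
        return [mean(column) for column in zip(*rows)]

    pos_c, neg_c = centroid(positives), centroid(negatives)

    def score(x):
        d_pos = sum((a - b) ** 2 for a, b in zip(x, pos_c))
        d_neg = sum((a - b) ** 2 for a, b in zip(x, neg_c))
        total = d_pos + d_neg
        # Score in [0, 1]; 0.5 when the two centroids coincide with x.
        return d_neg / total if total > 0 else 0.5

    return score


def bagged_score(x, positives, negatives, n_bags=10, seed=0):
    """Train one base learner per balanced bag and average their scores."""
    models = [
        train_centroid_scorer(pos, neg)
        for pos, neg in asymmetric_bags(positives, negatives, n_bags, seed)
    ]
    return mean(model(x) for model in models)


# Toy data: 2 known regulations (positives) versus 6 non-regulations.
positives = [[1.0, 1.0], [0.9, 1.1]]
negatives = [[0.0, 0.0], [0.1, -0.1], [-0.2, 0.1],
             [0.0, 0.2], [0.1, 0.1], [-0.1, -0.1]]

likely_regulation = bagged_score([1.0, 1.0], positives, negatives)
unlikely_regulation = bagged_score([0.0, 0.0], positives, negatives)
```

Keeping all positives in every bag while resampling only the negatives is what makes the bagging asymmetric: each base learner sees a balanced problem, and averaging over bags recovers information from the full negative set.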

Conclusions: The numerical studies show that MLN achieves very good predictive performance while opening the door to some interpretability of the decisions. Besides its ability to suggest new regulations, such an approach makes it possible to cross-validate experimental data against existing knowledge.

Electronic supplementary material: The online version of this article (doi:10.1186/1471-2105-14-273) contains supplementary material, which is available to authorized users.




Authors: Céline Brouard, Christel Vrain, Julie Dubois, David Castel, Marie-Anne Debily, Florence d’Alché-Buc

Source: https://link.springer.com/






