GeneNetWeaver: in silico benchmark generation and performance profiling of network inference methods

Unraveling and modeling gene regulatory networks is of primary importance for improving our understanding of biological organisms, and numerous methods have been developed for reverse engineering these networks from expression data. However, both the absolute and the comparative performance of these methods remain poorly understood. The aim of this project is to provide benchmarks and tools for rigorous testing of methods for gene network inference.

Our framework is available as open-source, user-friendly software called GeneNetWeaver (GNW). GNW is the first tool that provides methods for both in silico benchmark generation and performance profiling of network inference algorithms. GNW was developed to make it easy to generate detailed models of gene regulatory networks. One of the main advantages of in silico networks is that perturbation experiments can be simulated quickly and easily to produce expression data, unlike in vivo experiments, which are usually expensive and time-consuming. Moreover, both the quantity and the quality of the generated expression data can be controlled (e.g. by varying the amount of molecular and/or measurement noise). Finally, inference methods use the expression data to reconstruct (or reverse engineer) the underlying in silico networks, and their performance is then evaluated quantitatively by comparing the predicted networks with the target networks (which are unknown in in vivo experiments).
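To make the idea of simulated perturbation experiments concrete, here is a minimal Python sketch. It is not GNW's code or GNW's kinetic model: the three-gene cascade, the Hill parameters, the mRNA-only equation dx/dt = production - x, and the names `REGULATORS`, `steady_state` and `measure` are all assumptions made for illustration. It compares a wild-type steady state with a knockout steady state and adds Gaussian measurement noise of adjustable strength.

```python
import random

# Hypothetical three-gene cascade (not a GNW network):
# G1 activates G2, and G2 represses G3.
REGULATORS = {
    "G1": [],
    "G2": [("G1", +1)],
    "G3": [("G2", -1)],
}


def hill(x, k=0.5, n=2):
    """Hill activation function; k and n are arbitrary illustrative values."""
    return x**n / (k**n + x**n)


def production(gene, state):
    """Relative transcription rate of `gene`, in [0, 1], given regulator levels."""
    rate = 1.0
    for regulator, sign in REGULATORS[gene]:
        h = hill(state[regulator])
        rate *= h if sign > 0 else 1.0 - h
    return rate


def steady_state(knockout=None, dt=0.01, steps=5000):
    """Integrate dx/dt = production - x with forward Euler for steps * dt time units."""
    state = {g: 0.0 if g == knockout else 0.5 for g in REGULATORS}
    for _ in range(steps):
        state = {
            g: 0.0 if g == knockout
            else state[g] + dt * (production(g, state) - state[g])
            for g in REGULATORS
        }
    return state


def measure(state, noise_sd, rng):
    """Add Gaussian measurement noise to each expression level and clip at zero."""
    return {g: max(0.0, x + rng.gauss(0.0, noise_sd)) for g, x in state.items()}


rng = random.Random(0)  # fixed seed so the noisy readout is reproducible
wild_type = steady_state()
g1_knockout = steady_state(knockout="G1")
print("wild type:  ", measure(wild_type, noise_sd=0.05, rng=rng))
print("G1 knockout:", measure(g1_knockout, noise_sd=0.05, rng=rng))
```

In this toy model, knocking out G1 switches off G2 and releases the repression of G3. An inference method would have to recover exactly this kind of signature from the noisy data; raising `noise_sd` makes that harder.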

Moreover, we have used GNW to organize three editions of the DREAM challenge, an annual community-wide network inference challenge. In this context, GNW was used to identify systematic errors of network inference algorithms, thus providing useful insights into how to improve their performance.

GNW was developed as part of my PhD thesis: T. Schaffter, From genes to organisms: Bioinformatics System Models and Software, 2014.

Performance evaluation of network inference methods

  1. In silico gene networks are obtained by extracting subnetwork structures from known transcriptional networks (E. coli, S. cerevisiae, etc.) before being endowed with detailed dynamical models of gene regulation accounting for both transcription and translation, independent and synergistic interactions, as well as molecular and measurement noise.
  2. In silico gene networks are simulated to produce steady-state and time-series expression data for a variety of experiments such as wild-type, knockout, knockdown, and multifactorial perturbation experiments.
  3. Inference methods are asked to predict structures of in silico benchmark networks from gene expression data.
  4. From network prediction files, GNW performs a network motif analysis, which often reveals systematic prediction errors and thereby indicates potential ways to improve network reconstruction. It also automatically generates comprehensive reports including standard metrics such as precision-recall and receiver operating characteristic (ROC) curves.
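The evaluation in the last step can be illustrated with a short Python sketch. It is not GNW's implementation: the function names, the toy gold standard and the toy ranking are hypothetical, and the input is assumed to be a complete ranking of all candidate edges that contains every gold-standard edge. The area under the precision-recall curve is approximated here by average precision, which may differ from the interpolation used in GNW reports.

```python
def curve_points(ranked_edges, gold_edges):
    """Return (precision, recall, false-positive rate) after each prediction.

    ranked_edges: every candidate edge, most confident first (complete ranking).
    gold_edges:   set of edges present in the target network.
    """
    n_pos = len(gold_edges)
    n_neg = len(ranked_edges) - n_pos
    tp = fp = 0
    points = []
    for edge in ranked_edges:
        if edge in gold_edges:
            tp += 1
        else:
            fp += 1
        points.append((tp / (tp + fp), tp / n_pos, fp / n_neg))
    return points


def auroc(points):
    """Area under the ROC curve (recall vs. false-positive rate), trapezoidal rule."""
    area = 0.0
    prev_fpr = prev_tpr = 0.0
    for _, tpr, fpr in points:
        area += (fpr - prev_fpr) * (tpr + prev_tpr) / 2.0
        prev_fpr, prev_tpr = fpr, tpr
    return area


def average_precision(points):
    """Step-wise approximation of the area under the precision-recall curve."""
    area = 0.0
    prev_recall = 0.0
    for precision, recall, _ in points:
        area += (recall - prev_recall) * precision
        prev_recall = recall
    return area


# Toy target network with two true edges among six possible directed edges.
gold = {("G1", "G2"), ("G2", "G3")}
# Hypothetical prediction: all candidate edges, most confident first.
ranking = [
    ("G1", "G2"), ("G1", "G3"), ("G2", "G3"),
    ("G3", "G1"), ("G2", "G1"), ("G3", "G2"),
]

points = curve_points(ranking, gold)
print("AUROC:", auroc(points))
print("average precision:", average_precision(points))
```

In this toy ranking the true edges sit at ranks 1 and 3, so one false edge (G1 to G3) outranks a true one. Mistakes like this, a direct edge predicted in place of an indirect cascade, are the sort of systematic error a motif analysis is designed to expose.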

Network inference challenges

The DREAM project is a Dialogue for Reverse Engineering Assessments and Methods. The main objective is to catalyze the interaction between experiment and theory in the area of cellular network inference and quantitative model building in systems biology.

GNW has been used to organize the international DREAM3, DREAM4 and DREAM5 competitions. A total of 91 teams submitted about 900 network predictions to evaluate the performance of their methods on GNW-generated benchmarks.

Other inference challenges:

  • The causality workbench: The purpose of this project is to provide an environment to test machine learning and causal discovery algorithms.

A community effort to assess biological network inference

The map shown below illustrates the international effort that the community has put into assessing biological network inference. Hundreds of researchers have evaluated over 0'000 gene network predictions using GNW to obtain valuable insights into how they can improve the performance of inference methods.

The tools available in GNW for evaluating the accuracy of network structure predictions can also be applied to profile the performance of inference methods for neural, social, and technological networks.

Copyright © 2017 Thomas Schaffter. Website: Thomas Schaffter