Towards unsupervised and systematic segmentation of biological organisms

High-throughput microscopy imaging applications represent an important research field that will one day provide tools for automatic quantification of living organisms at a multiscale systems level (molecular, cellular, and tissue level). In the context of the WingX project, a initiative, we developed methods and computational tools for unsupervised segmentation of the Drosophila wing.

Our first contribution is a novel image processing algorithm that automatically infers a model of the morphological structure of the developing Drosophila wing (also called the wing pouch) from confocal fluorescence images. In addition to providing valuable information on the morphology of the wing, the parametric structure model is used for systematic spatial quantification of gene expression. To complete the morphological and expression data sets, we implemented an unsupervised 3D cell nuclei detection method. As an example, we applied this method to count the nuclei in wild type and pentagone-deficient wings to study the effect of the mutation on the growth rate of the wing. Finally, we developed a method for integrating morphological and expression data sets collected from many individuals to generate a reliable and robust quantitative description of the biological system of interest.

WingJ was developed as part of my PhD thesis: T. Schaffter, From genes to organisms: Bioinformatics System Models and Software, 2014.

Development of the Drosophila wing

The adult appendages of Drosophila, such as the wings, legs, antennae and halteres (the balancing organs of the fly), are derived from imaginal discs (Morata, 2001). These imaginal discs form in the embryo as a small cluster of cells (Martín et al., 2009). During the growth phase, the imaginal discs are mainly composed of a single-layered sheet of columnar cells, which is contiguous to another layer of cells called the peripodial membrane. There are two discrete stages that metamorphose the discs during larval development. During growth, patterning of the discs depends on secreted molecules called morphogens. Examples of morphogens include Decapentaplegic (Dpp) and Wingless (Wg), which are both required for proper development of the wing imaginal discs (Lawrence et al., 1996; Affolter et al., 2007). The number of cells increases approximately 1000-fold in four days (during the larval period) to reach nearly 50,000 cells at a stage of development called late third instar (Martín et al., 2009).
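To put the growth figures above in perspective, the following back-of-the-envelope sketch converts the quoted 1000-fold increase over four days into an average doubling time. It assumes constant exponential growth, which is a simplification made here for illustration and is not a claim from the text; the function name is hypothetical.

```python
import math


def doubling_time_hours(fold_increase, duration_hours):
    """Mean doubling time, ASSUMING constant exponential growth (simplification)."""
    n_doublings = math.log2(fold_increase)
    return duration_hours / n_doublings


# Figures quoted in the text: ~1000-fold increase in four days, ~50,000 final cells.
fold = 1000
hours = 4 * 24
n_doublings = math.log2(fold)
t_double = doubling_time_hours(fold, hours)
initial_cells = 50_000 / fold  # cell number implied at the start of the period

print(f"{n_doublings:.2f} doublings, one every {t_double:.1f} h on average")
print(f"implied starting population: about {initial_cells:.0f} cells")
```

Under this assumption the disc would go through roughly ten rounds of division, one about every 9.6 hours, starting from about 50 cells.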

The wing disc illustrated below includes a region called the wing pouch. In the adult wing, the wing pouch gives rise to the wing blade while the part surrounding it (called the hinge) forms a flexible link attaching the wing blade to the body wall of the fly. The wing pouch is divided by the anterior/posterior (A/P) and dorsal/ventral (D/V) compartment boundaries into four compartments (García-Bellido et al., 1973). The relation between these four compartments of the pouch and the adult wing is introduced in the attached video. We use the expression of Wingless (Wg) detected with antibodies to visualize the contour of the wing pouch and the D/V boundary. In a similar way, the A/P boundary can be identified via the expression of Patched (Ptc). During eversion, the single-cell layered wing pouch everts to give rise to the double-layered adult wing.

Unsupervised segmentation of the Drosophila wing pouch

The first goal of this project was to develop an unsupervised detection and segmentation method for quantifying the morphological structure of the Drosophila wing pouch. We take as input stacks of fluorescence images (3D images) where the structure of the wing pouch is visualized by staining the expression of two proteins, Wingless (Wg) and Patched (Ptc), using fluorescent markers. We then apply a suite of fully automated image-processing detection modules to identify specific features of the pouch. A parametric model that describes the morphological structure of the pouch (including its orientation in the image space) is reconstructed by integrating the output of the detection modules.

We show below three wing discs imaged at 80, 90 and 100 hours after egg laying (AEL). The first panel shows the maximum intensity projection of a stack of confocal images where the expression of Wg-Ptc is stained using fluorescent markers. The second panel shows the parametric structure model that our method has automatically identified, which can be locally edited by moving the control points '+'. Finally, the third panel reports the validated structure model after automatic orientation inference. Here DA corresponds to the dorsal-anterior compartment, for instance.
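The maximum intensity projection mentioned above can be sketched in a few lines. This is a generic illustration of the operation on a stack stored as nested lists, not the code used in WingJ; the function name and the toy data are made up for the example.

```python
def max_intensity_projection(stack):
    """Collapse a z-stack (list of equally sized 2D slices) into one 2D image.

    For each pixel position, keep the maximum intensity found over all slices.
    """
    if not stack:
        raise ValueError("stack must contain at least one slice")
    height, width = len(stack[0]), len(stack[0][0])
    return [
        [max(z_slice[y][x] for z_slice in stack) for x in range(width)]
        for y in range(height)
    ]


# Toy stack: three 2x2 slices.
stack = [
    [[0, 5], [2, 1]],
    [[3, 1], [2, 9]],
    [[1, 4], [7, 0]],
]
projection = max_intensity_projection(stack)
print(projection)
```

The projection keeps the brightest signal along z at every pixel, which is why a thin stained structure such as the Wg stripe remains visible in a single 2D image.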

To give users full control over the segmentation process, the quantification of the morphological structure of the wing pouch can be performed step by step. In this mode, the output of each detection module can be supervised and edited if required. This mode is particularly useful for understanding why a set of images could not be successfully processed. It also makes it possible to apply a given detection module repeatedly with different parameter values.

Unsupervised detection of the Drosophila embryo

Because the Drosophila embryo is a widely studied model system, we also implemented a method for the automatic detection and segmentation of the embryo structure. This method reuses some of the image-processing detection modules that we developed previously to identify the structure of the Drosophila wing pouch.

Here, the detection modules are: 1) detection of the contour of the embryo (using active contours), 2) detection of the dorsal-ventral and anterior-posterior axes, 3) automatic inference of the embryo orientation and 4) optional manual editing of the inferred structure model. We show below two examples of embryo structure quantification.

Systematic spatial expression quantification

Today, many studies focus on one spatial dimension before increasing the complexity of their models. Examples include the scaling of signaling activity with the size of growing tissues (Hamaratoglu et al., 2011) and the reverse engineering of developmental gene regulatory networks (Jaeger et al., 2011; Perkins et al., 2006). To measure gene and protein expression in one spatial dimension, we use the previously inferred structure models of the individual systems to generate systematic trajectories. The resulting data sets are called expression profiles.

Only two parameters are required to generate a trajectory: the reference boundary (A/P or D/V) and a translation offset. The last case below shows the trajectory obtained when translating the D/V boundary along 45% of the trajectory of the C-D axis (C is the center of the model, D is the intersection of the A/P boundary with the dorsal part of the outer boundary of the model). Note that setting the offset slider to -45% would have translated the D/V boundary towards the ventral part of the structure.
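The role of the two parameters can be illustrated with a deliberately simplified sketch. It assumes that the C-D axis is a straight segment and that the reference boundary is shifted rigidly along it; the actual trajectories follow the inferred structure model, so this only shows how the sign and size of the offset act. The function name and coordinates are hypothetical.

```python
def translate_boundary(boundary, center, dorsal_point, offset):
    """Shift a boundary polyline by `offset` times the vector from C to D.

    `offset` is a fraction: 0.45 moves the boundary 45% of the way from the
    center C towards D, and -0.45 moves it the same distance the other way.
    ASSUMPTION: straight C-D axis and rigid translation (simplification).
    """
    dx = offset * (dorsal_point[0] - center[0])
    dy = offset * (dorsal_point[1] - center[1])
    return [(x + dx, y + dy) for x, y in boundary]


# Toy model in image coordinates (y grows downwards, so dorsal/top has smaller y).
dv_boundary = [(-100.0, 0.0), (0.0, 0.0), (100.0, 0.0)]
center = (0.0, 0.0)          # C
dorsal_point = (0.0, -80.0)  # D

dorsal_trajectory = translate_boundary(dv_boundary, center, dorsal_point, 0.45)
ventral_trajectory = translate_boundary(dv_boundary, center, dorsal_point, -0.45)
print(dorsal_trajectory)
print(ventral_trajectory)
```

With these toy values, an offset of 45% moves the trajectory 36 pixels towards the dorsal side, and -45% moves it 36 pixels towards the ventral side.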

However, we wanted to support the quantification of gene and protein expression everywhere inside the inferred structure model. While generating expression profiles is relatively straightforward, the principal challenge was to move from a 1D to a 2D discretization of the system. To overcome this, we defined a coordinate system derived from the morphological structure model identified previously by our unsupervised segmentation method. Grid representations of this coordinate system are shown below for Drosophila wings imaged at different times during development. Here, we used this representation to obtain a quantitative description of the effect of the pentagone mutation on the morphology of the developing wing.

The advantage of this coordinate system is that it allows expression levels to be measured systematically in many individual systems. Here, we used this method to measure the relative expression of five gene products at different time steps during wing development. Expression is measured at the intersection points of the grids, whose resolution can be freely adjusted, before generating for each individual system an intermediate representation called a circular expression map.
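Measuring expression at grid intersection points generally means reading image intensities at non-integer pixel positions. The sketch below does this with bilinear interpolation, which is an assumption made for illustration; the text does not state which interpolation scheme is used. The grid points are taken as given, since their construction depends on the structure model, and all names are hypothetical.

```python
import math


def bilinear_sample(image, x, y):
    """Interpolated intensity at position (x = column, y = row) of a 2D image."""
    height, width = len(image), len(image[0])
    # Clamp the position so that the 2x2 neighbourhood stays inside the image.
    x = min(max(x, 0.0), width - 1.0)
    y = min(max(y, 0.0), height - 1.0)
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1, y1 = min(x0 + 1, width - 1), min(y0 + 1, height - 1)
    fx, fy = x - x0, y - y0
    top = (1 - fx) * image[y0][x0] + fx * image[y0][x1]
    bottom = (1 - fx) * image[y1][x0] + fx * image[y1][x1]
    return (1 - fy) * top + fy * bottom


def sample_on_grid(image, grid_points):
    """Measure expression at every intersection point of a grid.

    `grid_points` is a 2D list of (x, y) image positions, one per intersection.
    """
    return [[bilinear_sample(image, x, y) for (x, y) in row] for row in grid_points]


image = [[0.0, 10.0], [20.0, 30.0]]
grid_points = [[(0.0, 0.0), (1.0, 0.0)], [(0.5, 0.5), (1.0, 1.0)]]
expression = sample_on_grid(image, grid_points)
print(expression)
```

Because the output has one value per grid intersection rather than per pixel, two wings of different shapes sampled on grids of the same resolution yield arrays of the same size.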

Thanks to this representation, spatial gene expression data can be integrated even though they have been collected from individual systems with intrinsically different shapes. The expression maps of Pmad-GFP, Pmad-AB, Brk-AB, Sal-AB and Omb-AB shown below each correspond to an average expression map computed from five to ten wings.
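Once every wing is expressed in the same map representation, averaging reduces to an element-wise mean. The following sketch shows that step on toy data, assuming that all maps have already been resampled to the same size; the function name is hypothetical and the values are not measurements.

```python
def mean_expression_map(maps):
    """Element-wise mean of several equally sized 2D expression maps."""
    if not maps:
        raise ValueError("at least one expression map is required")
    rows, cols = len(maps[0]), len(maps[0][0])
    for m in maps:
        if len(m) != rows or any(len(row) != cols for row in m):
            raise ValueError("all expression maps must have the same size")
    n = len(maps)
    return [
        [sum(m[r][c] for m in maps) / n for c in range(cols)]
        for r in range(rows)
    ]


# Three toy maps standing in for three individual wings.
wing_maps = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[2.0, 2.0], [3.0, 8.0]],
    [[3.0, 2.0], [6.0, 6.0]],
]
average_map = mean_expression_map(wing_maps)
print(average_map)
```

The size check matters in practice: averaging is only meaningful when position (r, c) refers to the same location of the coordinate system in every individual.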

In addition to providing a tool that can be directly used to study the scaling of activity gradients in growing tissues, this representation can be used to visualize and compare expression maps obtained under different experimental conditions. Examples include systems imaged at different time steps during development or systems with different genetic backgrounds.

Reliable and robust quantitative description

Another contribution of this project is the integration of phenotypic and expression data collected from many individual systems. The goal is to obtain a single and reliable multiscale model from many systems generated under the same experimental conditions. We envision that such quantitative descriptions will be very useful in scientific research, in particular for reverse engineering predictive multiscale models of biological organisms.

We provide the following interface to easily visualize quantitative descriptions of the developing Drosophila wing that we have generated using the systematic method described above. The data shown can be used to gain insight into the domains of expression of different gene products in wild type and pentagone-deficient wings at different time steps during development. The gene products are assigned to the red, green and blue image channels, respectively, to produce RGB images.
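Assigning three gene products to the red, green and blue channels can be sketched as follows. The per-channel min-max scaling to the range 0 to 255 is an assumption made for this example; the text does not say how intensities are scaled. Function names and values are hypothetical.

```python
def to_uint8(expression_map):
    """Rescale a 2D map linearly to integers in 0..255 (ASSUMED normalization)."""
    lo = min(min(row) for row in expression_map)
    hi = max(max(row) for row in expression_map)
    span = hi - lo
    if span == 0:
        # A flat map carries no contrast; render it as black.
        return [[0 for _ in row] for row in expression_map]
    return [[round(255 * (v - lo) / span) for v in row] for row in expression_map]


def compose_rgb(red_map, green_map, blue_map):
    """Combine three equally sized expression maps into one image of (R, G, B) tuples."""
    r, g, b = to_uint8(red_map), to_uint8(green_map), to_uint8(blue_map)
    return [
        [(r[y][x], g[y][x], b[y][x]) for x in range(len(r[0]))]
        for y in range(len(r))
    ]


red = [[0.0, 1.0]]
green = [[2.0, 4.0]]
blue = [[5.0, 5.0]]
rgb = compose_rgb(red, green, blue)
print(rgb)
```

Regions where two gene products overlap then appear in the mixed color, for example yellow where the red and green channels are both high.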

For each experimental condition, the shape of the wing has been generated from the individual structure models of fifteen to thirty wings. The dashed line represents the standard deviation of the structure models. Moreover, the domain of expression of each gene product corresponds to a mean expression map computed from five to ten individual wings. Note that once the individual systems have been quantified, our software generates the integrated phenotypic and expression models automatically.
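One simple way to obtain a mean shape and its variability from several structure models is sketched below. It assumes that each contour has been reduced to radial distances from the model center, sampled at the same angles in the canonical orientation. This representation is chosen here for illustration only and may differ from how the integrated models are actually computed; the function name and values are hypothetical.

```python
import statistics


def summarize_contours(contours):
    """Point-wise mean and sample standard deviation of several contours.

    `contours` is a list of contours, each a list of radial distances
    sampled at the same angles (ASSUMED representation).
    """
    if len(contours) < 2:
        raise ValueError("at least two contours are needed for a standard deviation")
    n_points = len(contours[0])
    if any(len(c) != n_points for c in contours):
        raise ValueError("all contours must be sampled at the same angles")
    mean = [statistics.mean(c[i] for c in contours) for i in range(n_points)]
    std = [statistics.stdev(c[i] for c in contours) for i in range(n_points)]
    return mean, std


# Three toy contours sampled at two angles each.
contours = [[100.0, 80.0], [110.0, 90.0], [120.0, 70.0]]
mean_contour, std_contour = summarize_contours(contours)
print(mean_contour, std_contour)
```

Plotting the mean contour together with the mean plus and minus one standard deviation gives the kind of envelope that a dashed line can represent.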

Time points available for wild type wings: 80, 90, 100 and 110 h AEL.

Anterior is left and dorsal is top (canonical orientation).

Automatic 3D cell nuclei detection

To provide information at the cellular level, we developed a fully automated algorithm for 3D cell nuclei detection and segmentation in stacks of confocal images where nuclei are labelled with fluorescent markers.

In our experiment, we used TO-PRO-3 (a fluorescent dye) to make the cell nuclei visible in the wing imaginal disc. To accurately detect the nuclei that are included in the wing pouch, the previously inferred structure model is used to define a volume of interest inside the space of the image stack. Because the wing pouch is a relatively flat single-cell layered tissue before wing eversion, its structure is described by a 2D model. Here, this model is extruded along the z-axis of the image space to define the volume of interest.
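Extending a 2D outline along z gives a prism-shaped volume, so testing whether a detected nucleus belongs to it reduces to a 2D point-in-polygon test plus a z-range check. The sketch below uses the standard ray-casting test on a polygonal outline; approximating the pouch contour by a polygon and the choice of z limits are assumptions made for this example, and all names are hypothetical.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) lies inside the closed polygon."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Consider only edges that straddle the horizontal line through y.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def nuclei_in_volume(nuclei, polygon, z_min, z_max):
    """Keep the nuclei centers (x, y, z) that fall inside the extruded outline."""
    return [
        (x, y, z)
        for (x, y, z) in nuclei
        if z_min <= z <= z_max and point_in_polygon(x, y, polygon)
    ]


# Toy outline (a square) and four candidate nuclei centers.
pouch_outline = [(0, 0), (10, 0), (10, 10), (0, 10)]
nuclei = [(5, 5, 3), (15, 5, 3), (5, 5, 30), (1, 9, 0)]
kept = nuclei_in_volume(nuclei, pouch_outline, z_min=0, z_max=20)
print(len(kept), kept)
```

Counting the nuclei in the pouch then amounts to taking the length of the filtered list, which is the kind of number compared between wild type and mutant wings.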

The following video shows the output of the unsupervised algorithm where cell nuclei are labelled with different colors. The individual images used to reconstruct the nuclei in 3D have been exported using our application. The fully-automated 3D cell nuclei detection method is available in the WingJ Matlab toolbox. The 3D rendering of the video has been obtained using Imaris.

Statistical analysis & gene network inference

Finally, we provide researchers with Matlab tools (object-oriented implementation) to help organize and analyze the data sets exported using WingJ.

The first set of tools is dedicated to the analysis of structure or phenotypic data and can be used to quickly perform statistical tests and plot data (Fig. 1a). A second package implements a network inference method for making predictions about the structure of gene regulatory networks using expression maps, that is, 2D and possibly 3D spatial expression data (Fig. 1b).

The last application included in the WingJ Matlab toolbox has already been introduced and enables unsupervised 3D cell nuclei detection.

Systems supported for unsupervised segmentation

You would like to develop an automatic structure detection method for another system or have a suggestion to improve WingJ?

Copyright © 2017 Thomas Schaffter. Website: Thomas Schaffter