A screenshot of the network of predicted functional modules

A key feature of biological organization in all organisms is the tendency of proteins that function in common pathways to physically associate via stable protein-protein interactions (PPIs) to form larger macromolecular assemblies or complexes (sometimes known as molecular machines). These complexes are often linked together by weaker, transient PPIs to form extended networks or neighbourhoods that integrate the pathways mediating the major cellular processes, such as the control of gene expression, the synthesis and degradation of biomolecules, cell propagation and the maintenance of genome integrity. As a consequence, the cell is increasingly viewed as an assembly of interconnected functional modules, the interactome, which integrates and coordinates the cell's biochemical activities, behaviour and responses to external and intrinsic signals.

Given their broad significance, the systematic analysis of PPI networks has become a major experimental focus, particularly since the recent publication of large-scale interaction studies in the important model eukaryotes S. cerevisiae, C. elegans and D. melanogaster. However, high-throughput methods for measuring PPIs that rely on protein over-expression, such as the yeast two-hybrid assay, for the most part suffer from high rates of false discovery. To address this, we have developed rigorous and effective high-throughput methods for the systematic large-scale affinity purification and characterization of endogenous protein complexes, and by inference networks of PPIs, from E. coli. Our preliminary proteomic studies of ~1000 tagged and purified gene products (~1/4 of the genome) have provided insight into the functions of previously uncharacterized proteins and into the overall topology of the PPI networks that link a subset of microbial protein complexes [1]. While the core components of these PPI networks are broadly conserved, our initial analyses have uncovered evidence of significant functional diversification in cross-species projections. This initial study reinforces the utility of combining proteomics and comparative genomics to define the molecular architecture of biochemical systems from an evolutionary perspective.

A view of the predicted functional interaction network

Funded by the Canadian Institutes of Health Research, we are currently extending this work by undertaking a complete genome-scale analysis of PPIs, to identify the entire collection of protein complexes in E. coli and to examine the conservation of interactions across evolution. In addition to experimentally determining sets of PPIs, we are also adopting modern informatics methods to predict linkages between genes and so construct a network of 'functional interactions'. These predicted linkages are being used to define functional modules (e.g. pathways) and to help extend interaction neighbourhoods. Our aim is to integrate these datasets with previously constructed knowledgebases on E. coli [2] and with other types of metadata, such as evolutionary profiles, to construct definitive, reliable and biologically relevant datasets that inform on the molecular basis of biochemical systems within bacteria.
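
As a rough illustration of how predicted linkages between genes might be grouped into functional modules, the following minimal Python sketch builds an undirected network from scored gene pairs and reports its connected components. It is not the method used for this resource: the function names, the score threshold, the placeholder gene names and the use of connected components as a stand-in for module detection are all assumptions made for the example.

```python
from collections import defaultdict


def build_network(pairs, min_score=0.5):
    """Build an undirected adjacency map from (gene_a, gene_b, score) tuples.

    min_score is an assumed confidence cut-off; linkages scoring below it
    are ignored.
    """
    network = defaultdict(set)
    for gene_a, gene_b, score in pairs:
        if gene_a != gene_b and score >= min_score:
            network[gene_a].add(gene_b)
            network[gene_b].add(gene_a)
    return network


def find_modules(network):
    """Return the connected components of the network as sorted gene lists."""
    seen = set()
    modules = []
    for start in sorted(network):
        if start in seen:
            continue
        # Depth-first traversal collecting every gene reachable from start.
        component = set()
        stack = [start]
        while stack:
            gene = stack.pop()
            if gene in component:
                continue
            component.add(gene)
            stack.extend(network[gene] - component)
        seen |= component
        modules.append(sorted(component))
    return modules


# Placeholder gene names and scores, invented purely for illustration.
example_pairs = [
    ("geneA", "geneB", 0.9),
    ("geneB", "geneC", 0.8),
    ("geneD", "geneE", 0.7),
    ("geneC", "geneD", 0.2),  # below the cut-off, so it does not join the modules
]
modules = find_modules(build_network(example_pairs))
```

In this toy input the low-scoring geneC-geneD linkage is discarded, so the genes fall into two separate modules. A real analysis would more likely use a dedicated clustering method and calibrated confidence scores than a single fixed threshold.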

It is our hope that providing this resource to the microbiology, computational and structural biology research communities will help drive further research into the biochemical mechanisms underlying bacterial cell proliferation and homeostasis, further our understanding of the molecular basis of evolutionary adaptation, including colonization of a human host, and identify novel drug targets. We plan to develop and integrate new and existing tools to facilitate browsing of the data, and would therefore welcome any feedback that you may have on the database and website. We will also endeavour to make the datasets freely downloadable as they become available.

[1] Butland G, Peregrin-Alvarez JM, Li J, Yang W, Yang X, Canadien V, Starostine A, Richards D, Beattie B, Krogan N, Davey M, Parkinson J, Greenblatt J, Emili A. (2005) Interaction network containing conserved and essential protein complexes in Escherichia coli. Nature. 433:531-7.
[2] Riley M, Abe T, Arnaud MB, Berlyn MK, Blattner FR, Chaudhuri RR, Glasner JD, Horiuchi T, Keseler IM, Kosuge T, Mori H, Perna NT, Plunkett G 3rd, Rudd KE, Serres MH, Thomas GH, Thomson NR, Wishart D, Wanner BL. (2006) Escherichia coli K-12: a cooperatively developed annotation snapshot--2005. Nucleic Acids Res. 34:1-9.