Exploring networks at the genome scale
Systems biology aims at a holistic understanding of living organisms, while synthetic biology seeks to design and construct new living organisms with targeted functionalities. Genome sequencing and the ‘omics’ technologies have proven a goldmine of information for scientists investigating the entire functionality of organisms. Because the genome encodes all the machinery an organism needs to survive in the different environments it might populate, it is possible to understand an organism’s survival strategies by studying its genetic code. More than 350 bacterial genomes have been sequenced to date. But even though bacteria are relatively simple, every organism is complex enough that modeling is needed to integrate knowledge at all levels, whether involving metabolism, signaling processes, transcription regulation, or a host of other areas. Backed by the knowledge of biological systems accumulated over the last two decades, it is gradually becoming feasible to predict changes and to introduce desirable properties artificially into new, genetically modified organisms. In this article, we focus on genome-scale transcription and metabolic network studies and on the role of modeling, which assists micro-organism engineering and provides a platform for generating hypotheses and testing new proposals. Knowledge and predictions gained from systems biology are also paving the way for the rational design of subunits or the entire genome in organisms, to produce novel functions in line with the new paradigm proposed by synthetic biology.
Transcription network inference, which reverse engineers the transcriptional control of an organism from high-throughput gene expression data, offers valuable insights into the induction and inhibition relationships among genes and their regulating factors. Gene regulation can proceed through many mechanisms, including post-translational modifications of transcription factors, and it can also involve many factors at the metabolic, enzymatic, and extra-cellular signaling levels that are not readily measurable at the genome scale. The usual working hypothesis when making deductions about transcription networks therefore assumes that only the expression levels of transcription factors contribute significantly to gene regulation. There are three common approaches to inferring transcription networks. Logical network inference methods represent genes as Boolean On/Off states connected by binary interactions, and then explore the available topologies. Alternatively, a statistical approach can be used to evaluate the mutual dependence between each gene pair across all measured conditions in order to infer the connectivity of the genes. The third approach, based on differential equations, captures the dynamic evolution of gene transcripts and their regulating factors. It can offer a more detailed analysis of the network, but because of its high computational cost it is limited to small groups of genes. The large number of genes in a typical organism leads to a high ratio of unknown variables to data points; nevertheless, the problems posed by an under-determined system can be alleviated with additional information from verified interactions, for example from DNA sequence binding motifs or data from comparative genomics. More effort is needed to incorporate non-transcription factors into transcription network analysis, and to improve the accuracy of the inferred connections.
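The statistical approach described above can be illustrated with a minimal sketch. It assumes a simple histogram estimate of mutual information and a fixed threshold for calling an edge; the function names, gene names, bin count, threshold, and synthetic data are all hypothetical choices made for illustration, not taken from any published inference tool.

```python
import numpy as np

def mutual_information(x, y, bins=4):
    """Histogram estimate of the mutual information (in nats) between
    two expression profiles measured over the same conditions."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()                 # joint probabilities
    px = pxy.sum(axis=1, keepdims=True)       # marginal of x, shape (bins, 1)
    py = pxy.sum(axis=0, keepdims=True)       # marginal of y, shape (1, bins)
    independent = px @ py                     # product of the marginals
    mask = pxy > 0                            # skip empty cells (0 * log 0 = 0)
    return float(np.sum(pxy[mask] * np.log(pxy[mask] / independent[mask])))

def infer_edges(expression, gene_names, threshold):
    """Return (gene, gene, score) for every pair whose mutual information
    exceeds the threshold. expression has shape (n_genes, n_conditions)."""
    edges = []
    for i in range(len(gene_names)):
        for j in range(i + 1, len(gene_names)):
            score = mutual_information(expression[i], expression[j])
            if score > threshold:
                edges.append((gene_names[i], gene_names[j], score))
    return edges

# Synthetic data (assumption): geneB follows tfA closely, geneC is unrelated.
rng = np.random.default_rng(0)
tf_a = rng.normal(size=200)
gene_b = 2.0 * tf_a + 0.1 * rng.normal(size=200)
gene_c = rng.normal(size=200)
expression = np.vstack([tf_a, gene_b, gene_c])

edges = infer_edges(expression, ["tfA", "geneB", "geneC"], threshold=0.3)
for a, b, score in edges:
    print(f"{a} - {b}: {score:.2f} nats")
```

A pairwise score of this kind says nothing about the direction of regulation or about indirect effects, which is one reason why additional evidence such as binding motifs is valuable.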
A metabolic network is a dynamic system of reactions that responds to the availability of nutrient sources and the surrounding conditions of an organism. To date, determining reaction kinetics at the genome scale remains an overwhelmingly demanding experimental task. For this reason, the pseudo-steady-state assumption, which states that the concentration of each metabolite is constant over time, is generally used to reduce the relationship among metabolites to a linear stoichiometric model, removing the need for the unknown reaction kinetics. Such models are constrained by biological limits such as reaction reversibility, maximum possible enzyme activity, and gene activity (e.g. determined from transcriptomics data). There are generally two approaches to tackling linear stoichiometric genome-scale models: optimisation-based and unbiased analysis. The optimisation-based approach assumes that the metabolic flux distributions at steady state are regulated so as to achieve certain objectives. For example, Flux Balance Analysis calculates the optimal metabolic flux distribution among all feasible solutions for a specific objective such as maximal growth yield or compound yield. It can also assess the viability of mutants by evaluating the feasibility of biomass synthesis. The assumption of perfect optimality in biological systems may not always hold, and Flux Variability Analysis may then be used to calculate sub-optimal solutions that give a range of possible flux distributions. Selecting the mathematical function that best represents the biological objective to optimize (cell growth, biomass or energy production) remains an open problem. In mutant selection, for example, the choice of gene knockouts can be assisted by the OptKnock method, which identifies knockout targets for bioproduction by maximization of both cell growth and the yield of the metabolite of interest.
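As a rough illustration of Flux Balance Analysis as described above, the steady-state condition can be written as S v = 0 with lower and upper bounds on each flux v, and the objective as a linear function of v. The sketch below solves this linear programme with SciPy's `linprog` for a hypothetical four-reaction toy network; the network, the bounds, and the function names are assumptions made for illustration only.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical toy network (not from the article):
#   R0: -> A (uptake, at most 10 units)   R1: A -> B
#   R2: A -> (secretion)                  R3: B -> (biomass)
S = np.array([
    [1.0, -1.0, -1.0,  0.0],   # mass balance of metabolite A
    [0.0,  1.0,  0.0, -1.0],   # mass balance of metabolite B
])
BOUNDS = [(0.0, 10.0), (0.0, 1000.0), (0.0, 1000.0), (0.0, 1000.0)]
BIOMASS = 3  # index of the reaction used as objective

def flux_balance(S, bounds, objective_index):
    """Maximise one flux subject to S v = 0 and the flux bounds."""
    c = np.zeros(S.shape[1])
    c[objective_index] = -1.0          # linprog minimises, so negate
    result = linprog(c, A_eq=S, b_eq=np.zeros(S.shape[0]), bounds=bounds)
    if not result.success:
        raise RuntimeError(result.message)
    return result.x, -result.fun

def knockout(bounds, reaction_index):
    """Return a copy of the bounds with one reaction forced to zero flux."""
    mutant = list(bounds)
    mutant[reaction_index] = (0.0, 0.0)
    return mutant

wild_type_fluxes, wild_type_growth = flux_balance(S, BOUNDS, BIOMASS)
_, mutant_growth = flux_balance(S, knockout(BOUNDS, 1), BIOMASS)

print("wild-type biomass flux:", wild_type_growth)
print("biomass flux after knocking out R1:", mutant_growth)
```

In this toy case the knockout of R1 removes the only route to B, so the maximal biomass flux drops to zero, which is how a non-viable mutant would show up in such an analysis.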
A schematic diagram of gene-knockout selection based on genome-scale metabolic modeling is shown in Figure 1. The changes in flux distribution between the wild-type and mutants can be predicted by the Minimisation of Metabolic Adjustment or Regulatory On-Off Minimisation methods. The first assumes that mutants tend to minimise the changes in their metabolic fluxes relative to the wild-type. The second minimises the number of adjustments in the reaction network. Both methods take into account that cells may not function at the optimal state, and that mutants may evolve slowly before arriving at the new ideal flux distribution. Both are useful, and which one performs better depends on the situation.
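The idea behind Minimisation of Metabolic Adjustment can be sketched as a small quadratic programme: find the mutant flux vector closest (in Euclidean distance) to the wild-type fluxes while still satisfying steady state and the flux bounds. The example below is a minimal sketch under stated assumptions: the toy network, the reference wild-type fluxes `V_WT`, and the function name `moma` are hypothetical, and SciPy's general-purpose SLSQP solver stands in for a dedicated QP solver.

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical toy network (not from the article):
#   R0: -> A (uptake)   R1: A -> B (main route)   R2: A -> C   R3: C -> B
#   R4: A -> (secretion)   R5: B -> (biomass)
S = np.array([
    [1.0, -1.0, -1.0,  0.0, -1.0,  0.0],   # metabolite A
    [0.0,  1.0,  0.0,  1.0,  0.0, -1.0],   # metabolite B
    [0.0,  0.0,  1.0, -1.0,  0.0,  0.0],   # metabolite C
])
UPPER = np.array([10.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0])
# Assumed wild-type flux distribution: everything flows through the main route.
V_WT = np.array([10.0, 10.0, 0.0, 0.0, 0.0, 10.0])

def moma(S, v_wt, upper, knockout):
    """Minimise ||v - v_wt||^2 subject to S v = 0, 0 <= v <= upper,
    with the knocked-out reaction removed from the problem."""
    keep = [j for j in range(S.shape[1]) if j != knockout]
    S_k, ref, ub = S[:, keep], v_wt[keep], upper[keep]
    result = minimize(
        fun=lambda v: np.sum((v - ref) ** 2),
        x0=np.zeros(len(keep)),
        jac=lambda v: 2.0 * (v - ref),
        bounds=list(zip(np.zeros(len(keep)), ub)),
        constraints=[{"type": "eq", "fun": lambda v: S_k @ v,
                      "jac": lambda v: S_k}],
        method="SLSQP",
    )
    if not result.success:
        raise RuntimeError(result.message)
    v_mutant = np.zeros(S.shape[1])   # knocked-out flux stays at zero
    v_mutant[keep] = result.x
    return v_mutant

v_mutant = moma(S, V_WT, UPPER, knockout=1)
print("predicted mutant fluxes:", np.round(v_mutant, 3))
```

In this toy network a growth-maximising mutant could reroute all uptake through the bypass R2 and R3, whereas the distance-minimising solution uses the bypass only partially and predicts a lower biomass flux, which reflects the assumption that a fresh mutant has not yet adapted to its new optimum.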
Topologies of networks
The unbiased analysis approach investigates the topology of the metabolic network to reveal its fundamental properties. For example, Elementary Flux Modes identify all minimal feasible sets of reactions that can maintain steady state. The Extreme Pathways method evaluates the network in a similar way, but treats internal reversible reactions differently by splitting them into two separate reactions, which leads to different results in practice for real metabolic systems. Both methods, however, can become computationally intractable for genome-scale networks. Alternatively, the qualitative relationship among fluxes (how a change in one flux would affect another flux in a large network) can be computed by the Flux Coupling Finder method, which assigns relationships to reactions based on the extent of their mutual influence. Genome-scale metabolic modeling can be used to assist over-production of compounds of industrial or clinical interest, to understand the metabolic properties of certain pathogenic organisms, or to engineer cells to modify their functions.
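The notion of coupled fluxes can be illustrated with a much simplified pairwise test: force one flux to zero and check by linear programming whether another flux can still be non-zero at steady state. This is only a sketch in the spirit of flux coupling, not the published Flux Coupling Finder algorithm; the toy network, reaction names, and the function `forces_zero` are assumptions.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical toy network (not from the article):
#   uptake: -> A   conversion: A -> B   secretion: A ->   biomass: B ->
NAMES = ["uptake", "conversion", "secretion", "biomass"]
S = np.array([
    [1.0, -1.0, -1.0,  0.0],   # metabolite A
    [0.0,  1.0,  0.0, -1.0],   # metabolite B
])
BOUNDS = [(0.0, 10.0), (0.0, 1000.0), (0.0, 1000.0), (0.0, 1000.0)]

def forces_zero(S, bounds, i, j, tol=1e-6):
    """True if setting flux i to zero also forces flux j to zero."""
    constrained = list(bounds)
    constrained[i] = (0.0, 0.0)
    c = np.zeros(S.shape[1])
    c[j] = -1.0                        # maximise flux j
    result = linprog(c, A_eq=S, b_eq=np.zeros(S.shape[0]), bounds=constrained)
    if not result.success:
        raise RuntimeError(result.message)
    return -result.fun <= tol

coupled = [
    (NAMES[i], NAMES[j])
    for i in range(len(NAMES))
    for j in range(len(NAMES))
    if i != j and forces_zero(S, BOUNDS, i, j)
]
for blocked, dependent in coupled:
    print(f"{blocked} = 0 implies {dependent} = 0")
```

Pairs that appear in both directions, such as conversion and biomass here, are the ones that a fuller analysis would go on to classify more precisely.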
Prediction of genetic regulation
Coupling the models for regulatory networks and metabolic networks to predict the effect of genetic regulation on metabolic fluxes is still an open challenge. To extend the capability of genome-scale metabolic models to describe transient dynamics, several approaches have been adopted to compensate for the lack of detailed quantitative reaction kinetics for an entire organism. For example, reactions in a signaling network can be assumed to be fast and thus simplified using a quasi-steady-state assumption, while slow reactions such as biomass synthesis can be approximated in a time-delayed manner. Another suggestion is to use a linear sum of logarithmic terms based on stoichiometric relations to approximate unknown enzyme kinetics. It is important for models to capture dynamic responses in biological systems, since these are known to be highly non-linear. The efficient engineering of cell hosts (‘chassis’) requires synergy between experimentation and a predictive genome-scale model that integrates knowledge of all cellular activities.
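One common way to write a rate law as a linear sum of logarithmic terms is the lin-log form sketched below. The symbols are assumptions introduced here for illustration: v_j is the rate of reaction j, x_i the concentration of metabolite i, the superscript 0 marks a reference steady state, and the elasticity coefficients would, in the approach mentioned above, be estimated from the stoichiometry rather than measured.

```latex
v_j \;\approx\; v_j^{0}\left(1 + \sum_{i} \varepsilon_{ji}\,\ln\frac{x_i}{x_i^{0}}\right)
```

At the reference state every logarithm vanishes and the rate reduces to the steady-state flux v_j^0, so a flux distribution obtained from a constraint-based model can serve as the starting point for the dynamic description.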
Many challenges remain in increasing the predictive capability of large transcription and metabolic network models, among them the quality of gene annotations and the many genes with unknown functions. Modeling and experimentation at the genome scale are destined to grow closer in the future, enabling us to understand the biological world in ever greater detail and to synthesize more accurate biological devices. The ability to infer transcription networks correctly and to predict metabolic responses is especially important for synthetic biology, where synthetic gene regulatory networks are built within cell hosts or novel biocatalytic circuits are designed to reprogramme cellular metabolic patterns. A ‘blueprint’ from systems biology of the interactions among genes, proteins, metabolites, and regulatory factors in a host cell would form the basis for studying artificial manipulation of the cell in synthetic biology. Together, the two fields will continue to expand our knowledge of the properties of living organisms, and open new avenues for biotechnological development.
 Agapakis, C.M., Silver, P.A., Synthetic biology: exploring and exploiting genetic modularity through the design of novel biological networks. Molecular BioSystems (2009), 704-713.
 Carrera, J., Rodrigo, G., Jaramillo, A., Towards the automated engineering of a synthetic genome. Molecular BioSystems (2009), 733-743.
 Bansal, M., Belcastro, V., Ambesi-Impiombato, A., di Bernardo, D., How to infer gene networks from expression profiles. Mol. Syst. Biol. (2007), 78.
 Oh, Y.K., Palsson, B.O., Park, S.M., Schilling, C.H., Mahadevan, R., Genome-scale reconstruction of metabolic network in Bacillus subtilis based on high-throughput phenotyping and gene essentiality data. J. Biol. Chem. (2007), 28791-28799.
 Burgard, A.P., Pharkya, P., Maranas, C.D., Optknock: a bilevel programming framework for identifying gene knockout strategies for microbial strain optimization. Biotechnol. Bioeng. (2003), 647-657.
 Segre, D., Vitkup, D., Church, G.M., Analysis of optimality in natural and perturbed metabolic networks. Proc. Natl. Acad. Sci. U. S. A. (2002), 15112-15117.
 Shlomi, T., Berkman, O., Ruppin, E., Regulatory on/off minimization of metabolic flux changes after genetic perturbations. Proc. Natl. Acad. Sci. U. S. A. (2005), 7695-7700.
 Papin, J.A., Stelling, J., Price, N.D., Klamt, S., Schuster, S., Palsson, B.O., Hierarchical thinking in network biology: the unbiased modularization of biochemical networks. Trends Biotechnol. (2004), 400-405.
 Oberhardt, M.A., Puchalka, J., Fryer, K.E., Martins dos Santos, V.A.P., Papin, J.A., Genome-scale metabolic network analysis of the opportunistic pathogen Pseudomonas aeruginosa PAO1. J. Bacteriol. (2008), 2790-2803.
 Lee, J.M., Gianchandani, E.P., Eddy, J.A., Papin, J.A., Dynamic analysis of integrated signaling, metabolic, and regulatory networks. PLoS Comput Biol (2008), e1000086.
 Smallbone, K., Simeonidis, E., Broomhead, D.S., Kell, D.B., Something from nothing: bridging the gap between constraint-based and kinetic modelling. FEBS J. (2007), 5576-5585.
 Andrianantoandro, E., Basu, S., Karig, D.K., Weiss, R., Synthetic biology: new engineering rules for an emerging discipline. Mol. Syst. Biol. (2006), 2006.0028.
 Cantone, I., Marucci, L., Iorio, F., Ricci, M.A., Belcastro, V., Bansal, M., Santini, S., di Bernardo, M., di Bernardo, D., Cosma, M.P., A yeast synthetic network for in vivo assessment of reverse-engineering and modeling approaches. Cell (2009), 172-181.
 Landrain, T.E., Carrera, J., Kirov, B., Rodrigo, G., Jaramillo, A., Modular model-based design for heterologous bioproduction in bacteria. Curr. Opin. Biotechnol. (2009), 272-279.
Prof. Dr. Dipl-Ing Vitor A.P. Martins dos Santos
Systems and Synthetic Biology Group
Helmholtz Center for Infection Research
Inhoffenstr. 7, 38124 Braunschweig, Germany
Tel./ Fax: +49-531-6181-4008/-4199
from 03/2010: firstname.lastname@example.org