## Abstract

Biological networks are widely reported to be robust to both external and internal perturbations. However, the exact mechanisms and design principles that enable robustness are not yet fully understood. Here we investigate dynamic and structural robustness in biological networks with regard to phenotypic distribution and plasticity. We use two different approaches to simulate these networks: a computationally inexpensive, parameter-independent continuous model, and an ODE-based parameter-agnostic framework (RACIPE), both of which yield similar phenotypic distributions. By perturbing network topology and varying network parameters, we show that multistable biological networks are structurally and dynamically more robust than their randomized counterparts. These features of robustness are governed by an interplay of positive and negative feedback loops embedded in these networks. Using a combination of the numbers of negative and positive feedback loops, weighted by their lengths and signs, we identified a metric that can explain the structural and dynamical robustness of these networks. This metric enabled us to compare networks across multiple sizes, and the network principles thus obtained can be used to identify fragilities in large networks without simulating their dynamics. Our analysis highlights a network topology-based approach to quantify robustness in multistable biological networks.

## Introduction

Robustness is an inherent property of many biological systems and a fundamental feature of evolvability [1, 2]. Robust systems can maintain their functions or traits despite a dynamically changing environment, and thus possess enhanced fitness [3]. Trait robustness is pervasive in biology at many organizational levels, including protein folding, gene expression, metabolic flux, physiological homeostasis, development, organism survival, species persistence, and ecological resilience [4, 5, 6]. At the cellular level, only a limited number of specific internal/environmental changes lead to a change in the otherwise robust cell-fate. Waddington [7] postulated that mechanisms have evolved that stabilize a phenotype against both genetic and environmental perturbation. He called this process “canalization” and contrasted it with developmental “flexibility”, in which a phenotype changes, adaptively, with a change in environment. Robustness plays a central role in the biochemical processes underlying many important biological systems. Consider, for example, the life cycle of the *λ*-phage, a virus that infects bacterial cells. Once the virus enters a host cell, it goes into one of two possible modes: the lytic cycle or the lysogenic cycle. The set of interconnected biochemical processes that determines which of the cycles a phage chooses is now well understood, and it has been shown experimentally [8] that the decision-making is robust against point mutations in the individual genes involved. Thus, understanding the mechanisms underlying robustness is of fundamental importance.

The study of robustness is also an integral part of various engineering and socio-ecological systems [9, 10]. Designing robust machines involves integrating specific interactions (feedbacks, for example) between the individual components, forming intricate networks. Similarly, biological systems such as cells or organisms also have underlying networks, with interactions that have evolved over millions of years and that can impart emergent properties such as robustness to the biological system. Examples of these networks include metabolic networks, protein interaction networks, gene regulatory networks (GRNs), etc. [11, 12, 13, 14]. The complexity of the interactions in these networks leads to emergent properties, which manifest as traits of the biological system. In such networks, robustness can be studied in two ways, classified by the nature of the perturbations made to the network. Structural robustness is the robustness of biological traits to changes in the underlying regulatory network topology, such as edge deletions, caused by strong perturbations such as genetic mutations [15]. Dynamical robustness, on the other hand, is the robustness of traits to changes in the parameters governing the dynamics of the components or nodes of the regulatory networks (production and degradation rates, link strengths, etc.) [16].

Depending upon the system of study and the question of interest, the properties for which robustness is studied might differ. Two such properties of interest in multistable cell-fate decision-making networks are phenotypic distribution and plasticity. Phenotypic distribution refers to the collection of phenotypes (gene expression patterns for a transcriptional gene regulatory network, for example) that the network can exhibit under different conditions or parameters. Plasticity, on the other hand, refers to the ability of the network to adapt to a changing environment by modifying its phenotype. The concepts of phenotypic plasticity and heterogeneity (distribution) are well represented by Waddington’s epigenetic landscape [7], in which valleys denote gene expression patterns (phenotypes) between which the system can switch back and forth under the influence of noise or other external perturbations **(Fig 1a)**.

To understand robustness, a variety of mathematical models have been used. For example, building on the concept of canalization by Waddington, Kauffman introduced a version of this concept to Boolean network modeling of gene regulatory networks by studying canalizing functions [17]. In the past decade, mathematical frameworks for characterizing the robustness of networks have been studied in various settings, such as directed networks, embedded networks and many more [12, 18, 19]. Robustness has been widely investigated in linear control systems [20, 21], but it remains very difficult to measure the robustness of complex non-linear systems [22]. More recently, the connections between network topology and motifs on the one hand and the robust behaviour of biological systems on the other have been investigated. Jeong et al. [23] found significant correlations between the centrality of a protein and the effect of its deletion on the robustness of the emergent phenotypes. Further studies have shown the evolution of robust network motifs in cellular locations that face strong environmental noise [24]. However, such investigations are limited in number as well as in the scope of their generality. Hence, a comprehensive and scalable understanding of the role of network topology in robustness is required.

Here, we demonstrate the structural and dynamic robustness of biological networks, using the gene regulatory networks (GRNs) underlying Epithelial Mesenchymal Plasticity (EMP) **(Fig 1b)** as case studies. Using a parameter-agnostic, ODE-based formalism (RACIPE) and a parameter-independent continuous framework, we investigate the characteristics of robustness and identify unifying trends that underlie each aspect of it. In particular, we find that the numbers of positive and negative feedback loops in the system govern the extent of robustness in phenotypic distribution and plasticity. We also use this framework to predict which links in a network can be targeted and modified to inhibit plasticity, which may provide therapeutic insight in the clinic for processes such as EMP. Our results thus provide an integrated understanding of the design principles that enable robustness across multiple frameworks.

## Results

### Biological networks are dynamically robust

Here, we considered various GRNs underlying EMP **(Fig 1b, S1a)**[25]. Each network has a set of activating (pointed arrows) and inhibiting (hammer-head arrows) edges/links connecting the nodes of the network. Emergent dynamics of these interactions has been shown to give rise to multistability (i.e., coexistence of multiple phenotypes)[25]. These activation/inhibition interactions are modelled using an ODE based formalism, with the strength of these interactions affecting the basal production rates of the target nodes. We perturb these networks and measure the change in plasticity as a fold change, and the change in the phenotypic distribution using an information theoretic metric known as Jensen Shannon Divergence (JSD) [26] (see Supplementary Methods and **Fig 1c**). If a network displays similar dynamic behaviour even upon perturbation of the network topology, it is said to be structurally robust. These perturbations include edge additions, deletions, and nature changes from activation to inhibition and vice-versa **(Fig 1d)**. Similarly, if the behaviour of the network roughly remains the same under variation of the kinetic parameters governing each interaction, such as the production and degradation rates, regulatory interaction parameters between transcription factors and promoters, etc., it is said to be dynamically robust **(Fig 1e)**.

We simulate each network using an ODE-based framework called RACIPE [27] (see Supplementary Methods). For a given GRN, the framework first constructs a set of coupled ordinary differential equations. Then, multiple sets of ODE parameters are generated from pre-determined ranges (the parameter space), which are then used to simulate interactions between nodes of the network. These simulations can be used to identify the robust dynamical features of a particular network. Because RACIPE randomly samples parameters for each simulation, it can be thought of as a parameter-agnostic framework [27]. Using this framework, we can analyze the dynamic robustness of a network by studying the variation in network behavior with changes in the RACIPE parameter space (dynamic perturbation). The behavior of a network is measured using two biologically relevant emergent properties of the network: plasticity and phenotypic distribution. Some of the parameter sets generated by RACIPE might lead to convergence to distinct stable states when initialised from random initial conditions **(Fig 1a)**. The plasticity score of the network is then defined as the fraction of such random parameter sets that can give rise to multiple stable states [25]. The phenotypic distribution is obtained as the frequency distribution of the discretized steady states of the network (see Supplementary Methods for the discretization procedure) over the RACIPE parameter space. The change in network plasticity upon perturbation is measured simply as the fold change in plasticity score. To quantify the change in phenotypic distributions, we computed the JSD between them, which varies from 0 to 1 **(Fig 1c)**. The more two distributions differ, the higher the value of JSD, giving us a quantitative measure of the dissimilarity between them.
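The two readouts described above can be illustrated with a minimal Python sketch. It assumes that each steady state has already been discretized into a string label and that the number of stable states found for each parameter set is available. The function names are hypothetical and are not part of RACIPE. Base-2 logarithms are used so that the JSD lies between 0 and 1.

```python
import math
from collections import Counter


def phenotype_distribution(states):
    """Relative frequency of each discretized steady state (e.g. '0110')."""
    counts = Counter(states)
    total = sum(counts.values())
    return {state: count / total for state, count in counts.items()}


def jsd(p, q):
    """Jensen-Shannon divergence (base 2) between two dict-based distributions."""
    divergence = 0.0
    for state in set(p) | set(q):
        ps, qs = p.get(state, 0.0), q.get(state, 0.0)
        ms = 0.5 * (ps + qs)  # mixture distribution
        if ps > 0:
            divergence += 0.5 * ps * math.log2(ps / ms)
        if qs > 0:
            divergence += 0.5 * qs * math.log2(qs / ms)
    return divergence


def plasticity_score(n_stable_states):
    """Fraction of parameter sets that give rise to more than one stable state."""
    multistable = sum(1 for n in n_stable_states if n > 1)
    return multistable / len(n_stable_states)
```

Under these assumptions, the fold change in plasticity is the plasticity score of the perturbed network divided by that of the unperturbed network, and a value close to 1 indicates robustness.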

Because the network behavior in RACIPE is regulated by both network topology and various kinetic parameters, plasticity and phenotypic distribution can vary across distinct parameter ensembles. To characterize this variation across the parameter space, we simulated these biological networks using RACIPE, varying the maximum ranges of different network parameters by multiplying them by a factor ranging from 0.33 to 3. A multiplication factor > 1 expands the parameter space, whereas a multiplication factor < 1 shrinks it. As the Hill coefficient in these systems contributes to the inherent non-linearity and hence multistability [28], we sought to understand if the Hill coefficient alone could explain most of the observed change. Hence, we varied the default RACIPE parameters in three different ways: varying all parameters (Hill coefficient, production rate, degradation rate and fold change), varying everything but the Hill coefficient, and varying only the Hill coefficient. We then calculated the JSD of the resultant phenotypic distribution from that of the unmodified parameter space and plotted it against the multiplication factor **(Fig 2a, S2a,b)** for all EMP networks. As the multiplication factor moved farther from 1, the JSD increased. The proportional increase was higher when the parameter space shrank than when it expanded. However, the JSD values in all cases were low (maximum value = 0.16), hinting at dynamic robustness in these networks. Interestingly, the average JSD across the multiplication factors obtained from varying all parameters was comparable to that obtained from varying only the Hill coefficient **(Fig 2a, inset)**. Similarly, we plotted the plasticity of the perturbed parameter space against the corresponding multiplication factor **(Fig 2b, S2c,d)**. As the parameter space shrank, the plasticity of the network decreased as well. As with JSD, the average dynamic fold change in plasticity was affected mainly by the variation in the Hill coefficient **(Fig 2b, inset)**, suggesting that the Hill coefficient (non-linearity) is a crucial parameter in determining both of these network properties.

Next, we asked whether this dynamical robustness is specific to these biological networks. To answer this, we generated multiple random (hypothetical) networks having the same number of nodes as the biological networks, but randomly placing activating/inhibiting edges to connect the nodes (see Methods). We then measured the dynamical robustness of these networks in the same manner as the biological networks, i.e., using the measures of average fold change in plasticity (see Supplementary Methods) and JSD upon varying multiplication factor. The distribution of these values, when plotted, revealed that biological networks had much lower JSD, and an average fold change in plasticity much closer to 1 than most of the random networks **(Fig 2c,d, S2e,f)**. Thus, the biological networks are more dynamically robust than their random counterparts.
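One way to build such a random network can be sketched as follows. This is an illustration only: the actual sampling procedure is the one given in Methods, and the choices made here (a fixed number of edges, self-loops allowed, activation and inhibition equally likely) as well as the function name are assumptions.

```python
import random


def random_signed_network(n_nodes, n_edges, seed=None):
    """Return {(source, target): sign}, with sign +1 (activation) or -1 (inhibition).

    Assumptions for illustration: edges are drawn without replacement from
    all ordered node pairs (self-loops included) and signs are equiprobable.
    """
    rng = random.Random(seed)
    candidates = [(i, j) for i in range(n_nodes) for j in range(n_nodes)]
    chosen = rng.sample(candidates, n_edges)
    return {edge: rng.choice((1, -1)) for edge in chosen}


if __name__ == "__main__":
    # A hypothetical 4-node network with 7 signed edges.
    print(random_signed_network(4, 7, seed=0))
```

Each network generated this way would then be simulated and scored in the same manner as the biological network it is compared against.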

Another way to understand the variation due to kinetic parameters is to compare the network behavior in RACIPE simulations to that of a parameter independent framework, such as the Boolean (logical) formalism [29]. The dynamics of Boolean models are governed purely by the topology of the network. The gene activities are discrete (either ON or OFF), which gives us a more qualitative understanding of how the nodes in the network influence each other to give rise to varying network dynamics. Therefore, the Boolean formalism serves as a non-parametric contrast to the RACIPE formalism. One drawback however is that with the Boolean formalism, we cannot define plasticity the way we define it for RACIPE, therefore, robustness can only be studied in phenotypic distribution.
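A topology-only update of this kind can be sketched as below, assuming a majority-type threshold rule in which a node turns ON when its inputs are net activating, turns OFF when they are net inhibiting, and keeps its state on a tie. The asynchronous update order, the ±1 state encoding and all names are assumptions made for illustration; the rule actually used is described in Methods.

```python
import random


def next_value(J, state, i):
    """Threshold update of node i; J[i][j] is the sign of the edge j -> i."""
    field = sum(J[i][j] * state[j] for j in range(len(state)))
    if field > 0:
        return 1
    if field < 0:
        return -1
    return state[i]  # tie: keep the current value


def is_steady(J, state):
    """A state is steady if no single-node update changes it."""
    return all(next_value(J, state, i) == state[i] for i in range(len(state)))


def simulate(J, rng, max_steps=1000):
    """Asynchronous updates from a random initial state; None if not converged."""
    n = len(J)
    state = [rng.choice((-1, 1)) for _ in range(n)]
    for _ in range(max_steps):
        if is_steady(J, state):
            return tuple(state)
        i = rng.randrange(n)
        state[i] = next_value(J, state, i)
    return tuple(state) if is_steady(J, state) else None


if __name__ == "__main__":
    toggle_switch = [[0, -1], [-1, 0]]  # two mutually inhibiting nodes
    print(simulate(toggle_switch, random.Random(0)))
```

Repeating `simulate` over many random initial conditions and tabulating the returned states gives a phenotypic distribution that depends only on the network topology.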

We applied a threshold-based Boolean formalism [29] to simulate these networks (see Methods). While the top states obtained from both formalisms are the same, there is still significant dissimilarity between the steady state distributions obtained in the two frameworks **(Fig 2e, S2g,h)**. We then asked if the dissimilarity between Boolean and RACIPE dynamics can be reduced in any manner. One possibility was to impart a discrete behavior to RACIPE, as RACIPE uses ODEs with regulatory interactions modelled by shifted Hill functions. With increasing Hill coefficient, the shifted Hill function starts to resemble a switching function [30], reminiscent of Boolean functions. Hence, we increased the Hill coefficient gradually and measured the JSD between RACIPE and Boolean phenotypic distributions **(Fig 2f)** for different wildtype networks. We see a small drop in JSD with an initial increase in the Hill coefficient, followed by saturation of the JSD upon further increase, indicating that additional factors control the difference between RACIPE and Boolean phenotypic distributions.
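The switch-like limit can be made concrete with a short sketch of the shifted Hill function. The form used below, a negative Hill function plus a fold-change-weighted positive Hill function, is the one commonly used with RACIPE and is stated here as an assumption; the argument names are ours.

```python
def hill_negative(x, threshold, n):
    """Inhibitory Hill function: close to 1 for x << threshold, close to 0 for x >> threshold."""
    return 1.0 / (1.0 + (x / threshold) ** n)


def shifted_hill(x, threshold, n, fold_change):
    """Shifted Hill function: H^- + fold_change * (1 - H^-).

    Tends to 1 for x << threshold and to fold_change for x >> threshold;
    fold_change > 1 models activation, fold_change < 1 models inhibition.
    """
    h = hill_negative(x, threshold, n)
    return h + fold_change * (1.0 - h)


if __name__ == "__main__":
    # The transition around the threshold sharpens as the Hill coefficient grows.
    for n in (1, 4, 20):
        values = [round(shifted_hill(x, 1.0, n, 0.1), 3) for x in (0.5, 1.0, 2.0)]
        print(n, values)
```

At large Hill coefficients the function is effectively a step between 1 and the fold change, which is the sense in which RACIPE regulation approaches a Boolean-like switch.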

### A continuous, parameter independent formalism for efficient simulation of GRNs

The existing Boolean formalism differs from RACIPE in two aspects: the lack of kinetic parameters and a discrete state space instead of a continuous one. Because modifying RACIPE parameters did not bring the Boolean and RACIPE distributions sufficiently close (**Fig 2f**), we decided to move the Boolean model closer to RACIPE by making the state space of the Boolean formalism continuous **(Fig 3a)**. The details of the implementation of this formalism, termed the Continuous formalism, are given in the Methods section. With this change, the only essential difference left between this formalism and RACIPE is its lack of kinetic parameters, so comparing the two allows us to measure the dynamic robustness of the networks. While this formalism is parameter independent, the steady state phenotypic distributions obtained in this model are much closer to those obtained via RACIPE (lower JSD), relative to the earlier Boolean model, for all biological networks **(Fig 3b, S3b,c)**. Most random (hypothetical) networks showed this increased similarity as well **(Fig S3a)**, suggesting that this framework can be applied to a larger set of networks. Moreover, this model is computationally inexpensive compared to RACIPE, as measured for simulations across varying network sizes **(Fig 3c)**. We also investigated the sensitivity of the results of both frameworks to the number of simulations (see Supplementary Methods). The results obtained in the Continuous model are more consistent than those of RACIPE for the same number of simulations, further highlighting its benefits as an alternative to ODE-based methods **(Fig 3d)**.

We then sought to understand if the JSD due to structural perturbations, i.e., a single edge deletion, insertion, or nature change **(Fig 1d)**, in the Continuous model correlated well with that of the same perturbation in RACIPE. We calculated the JSD of the perturbed network’s phenotypic distribution from that of the wildtype network, using the RACIPE, Boolean and Continuous formalisms. We found that the correlation between RACIPE and Continuous JSD values, as well as the residual obtained via linear regression, is much better than that between RACIPE and Boolean **(Fig 3e-g, S3d-i)** across the biological networks examined, further supporting the similarity of the Continuous model to RACIPE. We then simulated random networks using the Continuous formalism to check if the dynamical robustness properties are captured by this formalism. The JSD between RACIPE and the Continuous formalism for the phenotypic distributions of biological networks was lower than that of most random networks **(Fig 4a, S4a,c)**. Consequently, the Continuous model proved to be a framework that is both computationally efficient and well correlated with RACIPE under perturbations of the network topology.

### Structural robustness of multistable biological networks

We further investigated the robustness of plasticity and phenotypic distribution to structural perturbations, i.e., addition, deletion and nature change of network edges. For robustness in phenotypic distribution, we simulated the random networks and their perturbations using the Continuous formalism. We then calculated the JSD of each perturbed network from its corresponding unperturbed network and took an average of these JSD values to represent the structural robustness of the network. The perturbation JSD for a given network is a distribution, consisting of values obtained from all perturbations of the network (**Fig 4b, inset**). The distribution of the average perturbation JSD for random networks revealed that the biological networks have a lower average perturbation JSD, and therefore higher structural robustness in phenotypic distribution, than most random networks **(Fig 4b, S4b,d)**. However, single edge perturbations may not fully capture the robustness landscape, as it is also possible to modify multiple edges at the same time. Consequently, we looked at multiple random edge perturbations in different biological networks **(Fig S4e,f)**. If the graph had E edges, we performed up to E random edge perturbations to see how the average JSD varies with perturbation size. We observed that the average perturbation JSD for a single edge was about half of the average obtained with E perturbations, indicating that analysing the effects of a single perturbation is sufficient to understand the phenotypic robustness of a network to edge modifications. Hence, the Continuous model is a useful tool to investigate the structural robustness of a network, as the average of the perturbation JSDs thus obtained is a good indicator of the global network robustness in phenotypic distribution.
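The set of single-edge perturbations can be enumerated as in the sketch below. It assumes that a network is stored as a dictionary from (source, target) pairs to signs and that self-loops count as possible additions; these conventions and the function name are ours, not taken from the Methods.

```python
def single_edge_perturbations(edges, n_nodes):
    """List all single-edge deletions, nature (sign) changes and additions.

    edges: {(source, target): sign}, sign = +1 (activation) or -1 (inhibition).
    Returns a list of (label, perturbed_edges) pairs; `edges` is not modified.
    """
    perturbed = []
    for edge, sign in edges.items():
        deleted = {e: s for e, s in edges.items() if e != edge}
        perturbed.append((("delete", edge), deleted))
        flipped = dict(edges)
        flipped[edge] = -sign
        perturbed.append((("flip", edge), flipped))
    for i in range(n_nodes):
        for j in range(n_nodes):
            if (i, j) not in edges:
                for sign in (1, -1):
                    added = dict(edges)
                    added[(i, j)] = sign
                    perturbed.append((("add", (i, j), sign), added))
    return perturbed


if __name__ == "__main__":
    toggle_switch = {(0, 1): -1, (1, 0): -1}
    for label, network in single_edge_perturbations(toggle_switch, 2):
        print(label, network)
```

Simulating every perturbed network, computing its JSD from the unperturbed distribution and averaging those values gives the average perturbation JSD used here as the measure of structural robustness.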

For the analysis of structural robustness in plasticity, we simulated the random networks and the corresponding perturbations using RACIPE, because plasticity is defined only for RACIPE. Similar to JSD, we calculated the average fold change in plasticity for each random network and obtained a distribution of the same. We found that the average fold change in plasticity, obtained from a distribution of the fold change in plasticity upon all topological perturbations (**Fig 4c, inset**) of WT networks is closer to 1 than that of most random networks **(Fig 4c, S4g)**, indicating that the biological networks are structurally more robust than their random network counterparts.

### Feedback loops underlie the robustness in biological networks

After characterizing structural robustness in biological networks, we attempted to understand why biological networks show structural robustness. Positive feedback loops have been reported to play a major role in the stability of biological networks by reinforcing the network dynamics [31, 32, 33]. The reinforcement provided by positive feedback loops can lead to convergence to approximately similar steady states across a wide range of parameters in RACIPE. Moreover, positive feedback loops have been shown to play a crucial role in governing plasticity of the networks as well [25]. Similarly, negative feedback loops can contribute to oscillations [31, 33, 32]. Therefore, we hypothesized that interplay between positive and negative feedback loops can govern the robustness of GRNs.

To understand if the positive and negative feedback loops play a role in imparting structural and dynamical robustness to networks, we decided to use the ensemble of random networks created, because they sample the spectra of robustness well. First, we divided the distributions of average perturbation JSD **(Fig 4b)** and average fold change in plasticity **(Fig 4c)** about their respective medians, and asked if there was a significant difference in the distribution of feedback loops for the networks with high robustness vs those with low robustness. The networks with the lower average perturbation JSD had a significantly higher number of positive feedback loops (PFLs) and a lower number of negative feedback loops (NFLs), as quantified by the difference in the means of the distributions **(Fig 4d)**. Interestingly, the shift in the distribution of negative feedback loops is more distinct when compared to that of positive feedback loops. Similarly the networks with a higher average fold change in plasticity have a smaller number of negative feedback loops and a larger number of positive feedback loops **(Fig 4e)**. Unlike with JSD, the two groups of networks in case of plasticity are better differentiated by positive feedback loops than with negative feedback loops. These results suggest that positive and negative feedback loops might have varying degree of importance in governing different kinds of robustness. A common feature for robustness in both plasticity and JSD seems to be, however, that the networks with higher robustness have a larger number of positive feedback loops and a smaller number of negative feedback loops.

To better understand the relative contribution of positive and negative feedback loops, we went back to biological networks and measured the Spearman correlation between the plasticity of the perturbed networks and the corresponding change in the number of positive and negative feedback loops (**Fig 4f,g**, each dot is a perturbed network). As expected, the number of positive feedback loops has a stronger (∼2-fold in magnitude) correlation with the fold change in plasticity than the number of negative feedback loops. This observation suggests that positive and negative feedback loops have opposite but unequal effects on network plasticity. We then investigated if a weighted combination of positive and negative feedback loops can explain plasticity better. For a given network, we identified the optimal weights for positive and negative feedback loops that would maximize the correlation between the combined loops metric and plasticity. The larger the weight, the higher the contribution of the corresponding loops in explaining plasticity (see Methods). Because PFLs correlate positively with robustness in plasticity and NFLs correlate negatively, we assigned a negative weight to NFLs (robustness should decrease as their number increases) and a positive weight to PFLs (robustness should increase as their number increases). The weight distribution obtained from the ensemble of random networks had a median of around 3 : −1 **(Fig 4h)**, i.e., positive cycles were given more weight in explaining plasticity. As expected, we find that the weighted feedback loops metric (SWFL = weighted sum of PFLs and NFLs, see Methods) correlates better with the plasticity of perturbed networks than the PFLs alone **(Fig S5a)**.
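Counting feedback loops by sign requires only the network topology. The sketch below enumerates simple cycles of a signed directed graph and classifies each as positive or negative by the product of its edge signs. The brute-force enumeration is meant for small networks, and the default weights in `signed_weighted_loops` are placeholders set to the roughly 3 : −1 median reported above; the exact definition of the weighted metric is the one given in Methods, so that function is an assumption.

```python
def feedback_loops(edges):
    """Enumerate simple cycles of a signed digraph given as {(source, target): sign}.

    Returns a list of (nodes_in_cycle, sign), where sign is the product of the
    edge signs: +1 for a positive feedback loop, -1 for a negative one.
    """
    successors = {}
    for (source, target), sign in edges.items():
        successors.setdefault(source, []).append((target, sign))
    loops = []

    def extend(start, node, path, sign):
        for nxt, edge_sign in successors.get(node, []):
            if nxt == start:
                loops.append((tuple(path), sign * edge_sign))
            elif nxt > start and nxt not in path:
                # Only visit nodes larger than the start so each cycle is found once.
                path.append(nxt)
                extend(start, nxt, path, sign * edge_sign)
                path.pop()

    for start in sorted({node for edge in edges for node in edge}):
        extend(start, start, [start], 1)
    return loops


def count_loops(edges):
    """Return (number of positive loops, number of negative loops)."""
    loops = feedback_loops(edges)
    positive = sum(1 for _, sign in loops if sign > 0)
    return positive, len(loops) - positive


def signed_weighted_loops(edges, w_positive=3.0, w_negative=-1.0):
    """Weighted sum of loop counts; the weights here are illustrative assumptions."""
    positive, negative = count_loops(edges)
    return w_positive * positive + w_negative * negative
```

For example, two mutually inhibiting nodes form one positive feedback loop, whereas a three-node ring of inhibitions forms one negative feedback loop.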

Previous literature also suggests that not all PFLs play the same role in regulating the plasticity of the network [34, 35]. Consequently, we decided to penalize each PFL by its length, i.e., to give higher importance to shorter feedback loops (see Methods), with the hypothesis that breaking a longer feedback loop would have a more diluted effect on the system. We then took a sum over these penalized PFLs, creating a new metric (WPFL). To verify that this penalization improves on the results obtained without it, we compared the correlations of the two metrics (Δ*PFL* and Δ*WPFL*) with the plasticity of the network obtained via structural perturbation, for the ensemble of random networks **(Fig 5a)**. The correlation using WPFLs is an improvement on that using PFLs (most dots are above the *x* = *y* line), indicating that it is important to consider the length of the feedback loops as a factor influencing their impact.
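The effect of the length penalty can be seen in a small worked example. The sketch assumes that each loop contributes the reciprocal of its length; this specific weighting is our assumption for illustration, and the form actually used is given in Methods.

```python
def count_pfl(loop_lengths):
    """Unweighted metric: every positive feedback loop counts as 1."""
    return len(loop_lengths)


def weighted_pfl(loop_lengths):
    """Length-penalized metric, assuming a weight of 1/length per loop."""
    return sum(1.0 / length for length in loop_lengths)


if __name__ == "__main__":
    # Hypothetical wild-type network with positive loops of lengths 2, 5 and 5.
    wild_type = [2, 5, 5]
    breaks_short_loop = [5, 5]  # perturbation that removes the length-2 loop
    breaks_long_loop = [2, 5]   # perturbation that removes one length-5 loop
    for name, perturbed in (("short", breaks_short_loop), ("long", breaks_long_loop)):
        delta_pfl = count_pfl(perturbed) - count_pfl(wild_type)
        delta_wpfl = weighted_pfl(perturbed) - weighted_pfl(wild_type)
        print(name, delta_pfl, round(delta_wpfl, 3))
```

Both hypothetical perturbations change the loop count by the same amount, but the length-penalized change is larger in magnitude when the shorter loop is broken, which is the distinction the weighted metric is meant to capture.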

We then decided to check if feedback loops correlate with the other aspects of robustness as well. We found that the average perturbation JSD correlates positively with the NFLs **(Fig 5b)** and negatively with PFLs (**Fig S5b**) for all random networks of size 4, and the JSD between RACIPE and Continuous also correlates positively with NFLs **(Fig 5c)**. These trends reiterate that structural robustness correlates negatively with the NFLs and positively with PFLs. These trends are characteristic of the sign of the feedback loops, as the correlation of structural robustness with total number of feedback loops (PFL+NFL) was weak and not significant (**Fig S5c**).

However, the number of PFLs makes for a poor metric for predicting the robustness of a network. As there are no clear bounds on this number and it has no direct dependence on network size, the correlation between PFLs and robustness does not hold when networks of multiple sizes are compared **(Fig S5d)**. One alternative is hence to consider the fraction of positive feedback loops (FPC) instead, as it is bounded between 0 and 1 irrespective of network size. We also know that the length and sign of a feedback loop are important factors to be considered. Thus, we decided to look at the fraction of weighted positive feedback loops (FWPC), where positive and negative feedback loops are first weighted by length and then weighted by sign so as to maximize the correlation with robustness (see Methods). Note that these weights are calculated separately for each measure of robustness (**Fig 5d**). While the optimal weights are taken as the ones that maximize the absolute value of the correlation, in most cases the decrease in correlation for weights greater than the optimum is negligible (**Fig S6a-d**). The optimum weights for each combination of robustness measure and network size are given in **Table 1**. Upon comparing the correlation coefficients of the average perturbation JSD with the different metrics mentioned above (PFLs, NFLs, FPC, FWPC) for different network sizes (4-10, **Fig 5e**), we find that the correlation with PFLs and NFLs drops as network size increases, especially when networks of different sizes are considered simultaneously (**Fig 5e**, labelled “ALL” on the x-axis). On the other hand, the FWPC consistently has a higher correlation than the other metrics. On plotting the average perturbation JSD for random networks of various sizes against FWPC **(Fig 5f,g)**, we see that the correlation is maintained even across multiple network sizes, giving us a scale-independent understanding of the structural robustness of networks.
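A bounded, size-independent metric of this kind can be sketched as follows. The formula is an assumption made for illustration: each loop is weighted by the reciprocal of its length, negative loops are further scaled by a sign weight, and the metric is the weighted positive share of the total. The definition actually used, and the fitted sign weights, are those given in Methods and Table 1.

```python
def weighted_loop_sum(loop_lengths):
    """Sum of 1/length over a collection of loops (assumed length penalty)."""
    return sum(1.0 / length for length in loop_lengths)


def fwpc(positive_lengths, negative_lengths, negative_weight=1.0):
    """Fraction of weighted positive cycles, under the assumptions stated above.

    Returns a value in [0, 1] for negative_weight > 0, or None when the
    network has no feedback loops at all.
    """
    positive = weighted_loop_sum(positive_lengths)
    negative = negative_weight * weighted_loop_sum(negative_lengths)
    if positive + negative == 0:
        return None
    return positive / (positive + negative)


if __name__ == "__main__":
    # Hypothetical network: positive loops of lengths 2 and 3, one negative loop of length 2.
    print(fwpc([2, 3], [2]))
    print(fwpc([2, 3], [2], negative_weight=3.0))
```

Because the value is a fraction, networks with very different numbers of loops can be placed on the same axis, which is what allows comparison across network sizes without simulation.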

FWPC also correlated positively with structural and dynamic robustness in plasticity, and dynamic robustness in distribution as measured by JSD between Continuous and RACIPE **(Fig 5h, S5e,f)**. Furthermore, we found that the random networks for which the JSD between RACIPE and the Continuous model was higher compared to Boolean were characterised by having a smaller fraction of positive feedback loops (FPC) (**Fig S5g**), supporting the connection between the fraction of positive feedback loops and degree of dynamic robustness. It is, however, interesting to note that while the biological networks we studied had a high dynamic robustness in distribution as measured by the JSD from RACIPE parameter variation as well, this robustness did not correlate strongly with the fraction of weighted positive cycles **(Fig S5h)**.

The correlation coefficients obtained using the metrics above for different robustness criteria, combining networks of all sizes, are shown as a heatmap **(Fig 5i)**. Note the higher correlation when using the FWPC, as opposed to the NFLs, PFLs, or the FPC, in these networks. Based on these results, we conclude that the fraction of weighted positive feedback loops serves as a good metric to measure both structural and dynamic robustness in networks. Because calculating this metric does not require any simulations, it is computationally valuable for determining the robustness of large-scale biological networks.

### Larger regulatory networks follow similar design principles of robustness

To test the scalability of our results, we analysed two large EMP networks: EMT RACIPE (22 nodes, 82 edges) [36] and EMT RACIPE2 (26 nodes, 100 edges) [37] **(Fig 6a,b)**. We first obtained the phenotypic distributions of these networks in all three models: RACIPE, Boolean and Continuous **(Fig S7a-d)**. As with the small networks, the Continuous model showed better agreement with RACIPE than the Boolean model did. We also obtained the distribution of multistability (parameter sets giving rise to n steady states, n = 1, 2, 3, …) in RACIPE for these networks **(Fig 6c,d)**. We observed that these networks have very high plasticity, with only up to 5% of parameter sets displaying monostability, the rest being multistable. We then perturbed these networks structurally (single edge deletions or nature changes) to study their robustness in plasticity, and computed the plasticity scores of each of the perturbed networks as well as of the WT. Both of these networks have a large number of positive feedback loops (> 3000 and > 10000, respectively), and most perturbations caused only a minor change in the plasticity score, thereby demonstrating the structural robustness of these networks. As observed in the smaller networks, the correlation between the plasticity of the perturbed network and the corresponding change in the number of positive feedback loops was positive and significant **(Fig 6e,f)**.

However, in both networks there were two perturbations (**Fig 6e,f;** marked with red arrows) that reduced the number of positive feedback loops equally drastically, yet produced quite distinct fold changes in plasticity. Clearly, the change in the number of positive feedback loops alone does not provide a complete picture of how the plasticity of a network changes upon perturbation. As our results show that negative feedback loops also matter in determining plasticity, we coloured each perturbation on the plot by the number of negative feedback loops. Of the two perturbations, the one with the lower fold change in plasticity had a higher number of negative feedback loops, confirming that negative feedback loops play an important role in regulating the plasticity of a network. We then plotted the fold change in plasticity against the change in weighted feedback loops (weighted according to length and sign), and found that this metric showed a stronger correlation with plasticity than the number of positive feedback loops did **(Fig 6g,h)**. These results suggest that the weighted feedback loops metric can be used to understand the robustness (and fragility) of larger networks as well.

## Discussion

Robustness is a fundamental, ubiquitous feature of biological systems, that enables them to maintain their function in fluctuating environments against both dynamic and structural perturbations. While there is a large body of research that emphasizes the robustness of these systems, numerical simulation of the networks is often employed to understand the degree of variation in network dynamics. Such simulations are often hard to carry out, due to the lack of parametric information and computational expense. Consequently, it is important to identify metrics to mathematically model dynamic and structural robustness across a wide parameter space and topological perturbations. The dynamic properties of biochemical networks can be altered by mutations and/or environmental changes that can alter the interaction strength of edges or the rates of production or degradation of the nodes or even the way the nodes are connected to each other. Decoding what network motifs can protect network dynamics against such perturbations is key to understanding robustness.

Our results provide a theoretical toolset to identify robustness in both phenotypic distribution and plasticity in multistable biological networks using a network-level approach. Taking examples from the EMP literature, we have simulated a number of biological networks along with an ensemble of their random network counterparts in order to identify the design principles that enable these networks to display robustness when subjected to dynamic or structural perturbation. We also pave the way towards understanding how one can perturb the fragile components of these networks to change their degree of multistability (plasticity). Our analysis shows that reducing the number of positive feedback loops and/or increasing the number of negative feedback loops in networks can alter their plasticity across a wide range of parameter sets. A recent study complements our conclusion by showing that disrupting the miR-200/ZEB positive feedback loop, one of the key driving factors of EMP, via CRISPR led to a significant drop in metastasis *in vivo* [38].

EMP is a phenomenon that has recently attracted considerable attention from researchers modelling regulatory networks. While smaller regulatory networks are modelled using continuous ODE-based frameworks [39, 40], larger networks are modelled using discrete/logical approaches, because ODE approaches scale poorly in scenarios where network parameters may not be easily obtained [29]. While either approach may reveal at least the most frequent steady state, a comparative analysis for larger networks becomes computationally quite expensive. Hence, we offer an alternative modelling framework, which is continuous in nature yet parameter independent. This framework is computationally more efficient, offers a more detailed analysis than the Boolean formalism, and can reveal the underlying design principles of a network topology even for larger networks.

The scalability of our results can also be seen by considering the 22 and 26 node EMP networks that were simulated via RACIPE, wherein we were able to identify edge deletions or changes in edge nature that significantly inhibit the plasticity of the EMP network. Our previous work [25] indicated that reducing the number of positive feedback loops can reduce the plasticity of a network, but also showed that some perturbations that decreased the number of positive feedback loops to a comparable extent produced significantly different fold changes in plasticity. We hypothesized that this difference arose because the number of positive feedback loops alone does not account for the effect of negative feedback loops in the system, and were able to obtain a much stronger correlation using a weighted combination of negative and positive feedback loops. As Boolean approaches fail to characterise the plasticity of large-scale networks effectively, and ODE-based methods such as RACIPE are too computationally expensive for large networks, such metrics provide a valuable tool to understand which links must be disrupted to curb plasticity. This could be valuable in therapeutic applications, such as preventing metastasis by managing EMP.

Our analysis is applicable to understanding multiple aspects of multistable gene regulatory networks. For instance, we estimated the dynamic variability in phenotypic distribution and plasticity for all 6 biological networks, and found that they were more dynamically robust than the random networks of the same size that we had generated. Furthermore, our analysis suggested that most of the variation was captured by the change in the Hill coefficient, despite there being 4 parameters in total, perhaps because the non-linearity of the system plays a major role in determining its multistable dynamics [28]. Previous work also suggests that networks embedding nonlinear dynamics allow expression levels to be robust to small perturbations [41], from which we can infer that the Hill coefficient plays a central role in determining the dynamic robustness of biological networks.

In terms of dynamical robustness, the biochemical network involved in the establishment of segment polarity in *Drosophila melanogaster* has been shown to be robust against changes in initial values and rate constants of molecular interactions, enabling stable pattern formation [42, 43]. Similarly, in a rewiring experiment on the GRN of *E. coli*, wherein new regulatory interactions were added to the GRN, 95% of the modified gene interaction networks were tolerated by the bacteria, and some conferred selective advantages in particular environments [44]. Our analysis of the structural robustness of biological networks reveals similar trends. We saw that the biological networks studied showed structural robustness in both plasticity and phenotypic distribution.

On further analysis of the topological properties that lend these networks dynamical and structural robustness, we found that robustness was associated with a larger number of positive feedback loops, and a smaller number of negative feedback loops. Moreover, the relative importance of each feedback loop was found to vary with both length and sign, across the different robustness measures. Multiple studies have modelled the evolution of biological networks, selecting for fitness [45], randomly generating GRNs using Markov Chain Monte Carlo [46] and using preferential attachment [31], finding that robust networks are rich in PFLs, and have relatively lower number of NFLs. Furthermore, the robustness offered due to these motifs can be evolutionarily stable [47] to edge perturbations. Consequently, our results show that the interplay between positive and negative feedback loops in the network enables robustness across multiple criteria.

Our methods only investigate robustness at the gene regulatory level; however, it is important to understand how the interactions between genes, proteins and metabolites contribute to robustness in a multi-layer heterogeneous biological network [12]. Moreover, merely counting the number of feedback loops in the network is insufficient to capture the overall network topology, as these feedback loops can be coupled to varying degrees. Consequently, more sophisticated measures need to be developed that take into account the location and degree of interaction between different feedback loops in order to more effectively understand which loops must be disrupted for maximum effect. Other topological factors such as high interconnectivity, redundancy and degree distribution can also influence robustness [48, 49]. Despite these limitations, our results provide an integrated platform to understand the structural and dynamic robustness of these systems across multiple metrics. Understanding the interplay between negative and positive feedback loops that lends these networks robustness, as well as the mutations that can expose the fragility of these networks, is crucial for drug design, whether the aim is to improve the robustness of biochemical pathways or networks or to exploit their fragilities therapeutically.

## Methods

### Continuous Model

In the new continuous model, the expression values of the nodes in a GRN are continuous real values restricted to the interval [−1, 1]. The nodes are updated synchronously: in each iteration, all nodes are updated using the following update rule:

Where is the expression of node *i* at time *t*, is the activation of node *i* at time *t*, defined as:

*J*_{ji} is the sign of the edge from node *j* to node *i*. Lower and upper bounds of −1 and 1 are enforced on the expression levels during the simulations. A state at time *t* is classified as stable or unstable based on the following condition:
where *D*_{i} is the indegree of node *i*. The condition can be described as follows: if the activation of any node has an opposite sign as the expression level of the node (thereby pushing the node towards the opposite sign), and if the activation is large enough for the push to be of any consequence, then the state is unstable. The lower bound for the activation threshold is of the order of the standard deviation of activation of a node, if input nodes were randomly initialised. The networks are simulated until a stable state is reached. The stable state values are then binarised as 1 for positive stable state, and 0 otherwise.

### Calculating the number of Feedback Loops

The number of feedback loops in a given network was calculated using the NetworkX package [50]. We first enumerate the cycles in the network and then traverse each cycle, multiplying the edge signs (1 for activation and −1 for inhibition). The cycle is classified as a positive feedback loop if the product is positive, and as a negative feedback loop otherwise.
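The procedure described above can be sketched as follows. This is a minimal illustration and not the code from the project repository: the function name `count_feedback_loops`, the edge-list format and the example network are assumptions made here. It relies on `networkx.simple_cycles`, which enumerates the simple cycles (including self-loops) of a directed graph.

```python
import networkx as nx


def count_feedback_loops(edges):
    """Split the simple cycles of a signed directed network by sign.

    edges: iterable of (source, target, sign), with sign +1 for activation
    and -1 for inhibition (hypothetical input format).
    Returns (positive_loops, negative_loops) as lists of node lists.
    """
    graph = nx.DiGraph()
    for source, target, sign in edges:
        graph.add_edge(source, target, sign=sign)

    positive, negative = [], []
    for cycle in nx.simple_cycles(graph):
        product = 1
        # Pair each node with its successor, wrapping around to close the cycle.
        for u, v in zip(cycle, cycle[1:] + cycle[:1]):
            product *= graph[u][v]["sign"]
        (positive if product > 0 else negative).append(cycle)
    return positive, negative


# Hypothetical example: a toggle switch (A -| B -| A) with self-activation on A
# and a negative loop between B and C.
example_edges = [
    ("A", "B", -1), ("B", "A", -1),  # mutual inhibition: positive loop
    ("A", "A", 1),                   # self-activation: positive loop
    ("B", "C", 1), ("C", "B", -1),   # activation + inhibition: negative loop
]
pfl, nfl = count_feedback_loops(example_edges)
print(len(pfl), len(nfl))
```

The lengths of the returned cycles (`len(cycle)`) are what a length-based weighting would subsequently use.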

### Weighting of Feedback Loops

There are 2 types of weighting that we use: weighting by length, and weighting according to sign. First, each feedback loop in the network is weighted by the inverse of its length, to obtain the total number of weighted (by length) positive feedback loops (WPFL) in the network:

$$\mathrm{WPFL} = \sum_{i} \frac{1}{L_{i}}$$

where *L*_{i} is the length of the *i*-th positive feedback loop and the sum is over all positive feedback loops. The number of weighted negative feedback loops (WNFL) is calculated similarly. We then assign differential weights to feedback loops on the basis of their sign. Because negative feedback loops were generally found to be negatively correlated with our robustness metrics, they receive a negative weight, normalised to −1, and PFLs receive a positive weight (*a*). In other words, the total number of weighted feedback loops in the system (WFL) is given by

$$\mathrm{WFL} = a \cdot \mathrm{WPFL} - \mathrm{WNFL}$$

Similarly, we can weight the feedback loops just by their sign (not length) to obtain SWFL, the number of signed weighted feedback loops.

#### Fraction of positive cycles (unweighted and weighted)

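The fraction of positive cycles (FPC) in the network is simply given by

$$\mathrm{FPC} = \frac{\mathrm{PFL}}{\mathrm{PFL} + \mathrm{NFL}}$$

where PFL and NFL denote the numbers of positive and negative feedback loops in the network.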

Similar to the above, one can also weight the cycles (by length and sign) when calculating the fraction of weighted positive cycles (FWPC):

$$\mathrm{FWPC} = \frac{a \cdot \mathrm{WPFL}}{a \cdot \mathrm{WPFL} + \mathrm{WNFL}}$$

So, as the WNFLs increase, the fraction decreases, thereby maintaining negative correlation between negative feedback loops and robustness as in the previous case.
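As an illustration, the loop-based metrics of this section can be computed from the lengths of the positive and negative feedback loops alone. The sketch below is not the authors' implementation; the function name `weighted_loop_metrics` and the example lengths are hypothetical, and the FWPC form used here, *a*·WPFL / (*a*·WPFL + WNFL), is an assumption about how the sign weight *a* enters the fraction.

```python
def weighted_loop_metrics(pfl_lengths, nfl_lengths, a=1.0):
    """Loop-based metrics from lists of positive and negative loop lengths.

    a is the weight given to positive loops; negative loops have weight
    magnitude 1, as described in the text.
    """
    # Length weighting: each loop contributes the inverse of its length.
    wpfl = sum(1.0 / length for length in pfl_lengths)
    wnfl = sum(1.0 / length for length in nfl_lengths)
    n_pfl, n_nfl = len(pfl_lengths), len(nfl_lengths)

    total = n_pfl + n_nfl
    weighted_total = a * wpfl + wnfl  # assumed denominator of FWPC
    return {
        "WPFL": wpfl,
        "WNFL": wnfl,
        "WFL": a * wpfl - wnfl,      # weighted by length and sign
        "SWFL": a * n_pfl - n_nfl,   # weighted by sign only
        "FPC": n_pfl / total if total else float("nan"),
        "FWPC": a * wpfl / weighted_total if weighted_total else float("nan"),
    }


# Hypothetical network with three positive loops (lengths 1, 2, 2)
# and two negative loops (lengths 2, 4).
metrics = weighted_loop_metrics([1, 2, 2], [2, 4], a=2.0)
print(metrics)
```

Because only loop lengths and signs are needed, none of these quantities requires simulating the network dynamics.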

#### Optimal weights for a given indicator of robustness

For a given indicator of robustness (e.g., average perturbation JSD), we find the weight *a* in the above 2 metrics (number of WFLs or FWPC, the *x* variable) that maximises the correlation of the metric, computed over the perturbed networks, with the robustness indicator (the *y* variable). This is done using the curve_fit function of the scipy.optimize module (SciPy, Python 3.8).
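The idea of choosing *a* to maximise the correlation can be illustrated with a simple grid search. Note that this swaps in a different technique from the curve fitting mentioned above, purely for clarity: the function name `best_weight`, the candidate grid, the use of the absolute Pearson correlation and the toy data are all assumptions of this sketch.

```python
import numpy as np


def best_weight(wpfl, wnfl, robustness, candidates=None):
    """Grid search for the weight a that maximises |corr(a*WPFL - WNFL, y)|."""
    wpfl, wnfl, robustness = (np.asarray(v, dtype=float)
                              for v in (wpfl, wnfl, robustness))
    if candidates is None:
        candidates = np.linspace(0.1, 10.0, 100)  # hypothetical search grid

    best_a, best_r = None, -np.inf
    for a in candidates:
        x = a * wpfl - wnfl  # the WFL metric for this candidate weight
        r = abs(np.corrcoef(x, robustness)[0, 1])
        if r > best_r:  # NaN correlations (constant x) are skipped
            best_a, best_r = float(a), float(r)
    return best_a, best_r


# Toy data: a robustness indicator constructed so that a = 3 is optimal.
toy_wpfl = [1.0, 2.5, 0.5, 3.0, 1.5]
toy_wnfl = [0.5, 0.25, 1.0, 2.0, 0.0]
toy_y = [3.0 * p - n for p, n in zip(toy_wpfl, toy_wnfl)]
a_opt, r_opt = best_weight(toy_wpfl, toy_wnfl, toy_y)
print(a_opt, r_opt)
```

The same loop applies to FWPC by replacing the expression for `x`.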

## Data and Code Availability

The raw data generated for this study is available at https://bit.ly/3iHZBSd. Derived datasets and code supporting the current findings, including the codes for continuous boolean model, are available on the github page: https://github.com/csbBSSE/Robustness_project

## Author Contributions

MKJ conceived and supervised the research. AH, AM and KH performed the research and analyzed the results. All authors prepared the manuscript.

## Competing Interests

The authors declare no competing interests.

## Additional Information

Correspondence and requests for materials should be addressed to MKJ.

## Supplementary Methods

### Random Circuit Perturbation (RACIPE)

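RACIPE [27] is a tool used to simulate gene regulatory networks (GRNs) in a continuous fashion. For a given GRN, RACIPE constructs a set of ODEs representing the interactions in the network. For a node *i*, let *A*_{i} and *I*_{i} denote the sets of all activating and inhibiting input nodes to *i*, respectively. Denote the expression of node *i* by *E*_{i}. The dynamics of node *i* is given by the ODE

$$\frac{dE_{i}}{dt} = g_{i} \prod_{k \in A_{i}} \frac{H^{+}(E_{k}, T_{k,i}, n_{k,i}, \lambda_{k,i})}{\lambda_{k,i}} \prod_{k \in I_{i}} H^{+}(E_{k}, T_{k,i}, n_{k,i}, \lambda_{k,i}) - k_{i} E_{i}$$

where, for node *i*, *g*_{i} is the production rate and *k*_{i} is the degradation rate, and, for the edge from node *k* to node *i*, *T*_{k,i} is the threshold value of the concentration *E*_{k}, *n*_{k,i} is the Hill coefficient and *λ*_{k,i} is the fold change. *H*^{+} is called the shifted Hill function, given by:

$$H^{+}(E, T, n, \lambda) = \lambda + \frac{1 - \lambda}{1 + (E/T)^{n}}$$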

RACIPE simulates these ODEs by randomly sampling parameter sets from a pre-determined range of parameters uniformly. These parameter ranges were estimated from BioNumbers. For each parameter set, the ODEs are simulated for multiple initial conditions. The parameters used for each such set of simulations and the corresponding stable states obtained in that domain are the outputs obtained.

### Discretising RACIPE data and calculating stable state frequencies

Each steady state obtained from RACIPE is assigned a weight, equal to the fraction of initial conditions that converge to the steady state for the corresponding parameter set. We denote the steady state expression of node *x* by *E*_{x}. These expression values are converted to weighted z-scores by scaling them about their means:

$$z_{x} = \frac{E_{x} - \bar{E}}{\sigma}$$

where *Ē* is the weighted mean of the steady state expression of node *x*, and *σ* is the weighted standard deviation of the expression level. The weighted z-scores are then binarised by assigning a value of 1 for positive and 0 for negative weighted z-scores. Hence, each steady state is a string of 1s and 0s whose length is the number of nodes. While calculating the frequency of stable states, we weight them using the weights described earlier.
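The discretisation described above can be sketched with NumPy as follows. This is an illustrative sketch only: the function name `binarise_steady_states`, the array layout (one row per steady state, one column per node) and the example values are assumptions, and it presumes every node has non-zero weighted variance.

```python
import numpy as np


def binarise_steady_states(expression, weights):
    """Weighted z-score binarisation and weighted state frequencies.

    expression: array of shape (n_states, n_nodes) with steady-state levels.
    weights: one weight per steady state (fraction of initial conditions).
    Returns a dict mapping binary state strings to normalised frequencies.
    """
    expression = np.asarray(expression, dtype=float)
    weights = np.asarray(weights, dtype=float)

    # Weighted mean and standard deviation of each node across steady states.
    mean = np.average(expression, axis=0, weights=weights)
    variance = np.average((expression - mean) ** 2, axis=0, weights=weights)
    z = (expression - mean) / np.sqrt(variance)

    bits = (z > 0).astype(int)  # 1 for positive z-scores, 0 otherwise
    frequencies = {}
    for row, weight in zip(bits, weights):
        key = "".join(str(b) for b in row)
        frequencies[key] = frequencies.get(key, 0.0) + float(weight)

    total = sum(frequencies.values())
    return {state: freq / total for state, freq in frequencies.items()}


# Hypothetical two-node example with three steady states.
states = binarise_steady_states(
    [[5.0, 0.1], [0.2, 4.0], [4.8, 0.3]],
    weights=[0.5, 0.3, 0.2],
)
print(states)
```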

### Boolean Model

The Boolean algorithm devised by Font-Clos et al. [29] is used here. Nodes have binary values of −1 (low) or 1 (high). If *x*_{i} is the state of node *i*, the activation *A*_{i} is given by

$$A_{i} = \sum_{j \in N_{i}} J_{ji} x_{j}$$

where *N*_{i} is the set of nodes activating/inhibiting node *i*, and *J*_{ji} is the sign of the interaction. Nodes are updated asynchronously by choosing a node uniformly at random at every time step, and changing it to 1 if the node’s activation is positive, and to −1 if it is negative. If the activation is 0, the node is not updated. The final steady states are represented in a 0 (low), 1 (high) format, for ease of readability.
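A minimal sketch of this asynchronous update scheme is given below. It is not the implementation of [29] or of the project repository: the function name `boolean_steady_state`, the matrix convention `J[j][i]` (sign of the edge from node *j* to node *i*, 0 if absent), the step limit and the toggle-switch example are assumptions made for illustration.

```python
import random


def boolean_steady_state(J, initial_state, rng, max_steps=10000):
    """Asynchronous Boolean simulation on states in {-1, 1}.

    Returns the final state in 0 (low) / 1 (high) format. If no steady state
    is reached within max_steps, the current state is returned as is.
    """
    n = len(initial_state)
    state = list(initial_state)

    def activation(i):
        # Sum of signed inputs to node i from all nodes j.
        return sum(J[j][i] * state[j] for j in range(n))

    def is_steady():
        # Steady if no node would be flipped: every non-zero activation
        # already has the same sign as the node's state.
        return all(activation(i) * state[i] >= 0 for i in range(n))

    for _ in range(max_steps):
        if is_steady():
            break
        i = rng.randrange(n)  # pick one node uniformly at random
        a = activation(i)
        if a > 0:
            state[i] = 1
        elif a < 0:
            state[i] = -1
        # a == 0: the node is left unchanged

    return [1 if s == 1 else 0 for s in state]


# Toggle switch: node 0 inhibits node 1 and vice versa.
toggle = [[0, -1], [-1, 0]]
final = boolean_steady_state(toggle, [1, 1], random.Random(0))
print(final)
```

For the toggle switch, either of the two mutually exclusive states can be reached, depending on which node is updated first.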

### Fold change in plasticity

Fold change in plasticity is used to measure the degree of change in plasticity of the wild type upon a perturbation, either dynamic or structural. If the plasticity of the wild type network is *p*_{1} and the plasticity of the perturbed network is *p*_{2}, then:

Consequently, the fold change ranges from 0 to 1, and a higher value indicates that the perturbed network has plasticity closer to the wild type.

### Jensen Shannon Divergence

The difference between two phenotypic distributions can be quantified by the JSD metric. For two discrete frequency distributions *P*(*x*) and *Q*(*x*), we define:

$$\mathrm{JSD}(P \parallel Q) = \frac{1}{2} D(P \parallel M) + \frac{1}{2} D(Q \parallel M)$$

where $M = \frac{1}{2}(P + Q)$, and *D* is the Kullback-Leibler divergence, given by:

$$D(P \parallel Q) = \sum_{x} P(x) \log \frac{P(x)}{Q(x)}$$

JSD varies from 0 to 1 when base 2 is used to calculate the logarithm in *D*, with 0 indicating that the distributions are identical, and 1 indicating that the distributions have no overlap. The JSD values were calculated using the jensenshannon function in the scipy.spatial.distance module (Python 3.8).
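As a usage sketch, two phenotypic distributions stored as dictionaries of state frequencies can be compared as follows. The helper name `phenotype_jsd` and the dictionary format are assumptions of this example. One point worth noting is that `scipy.spatial.distance.jensenshannon` returns the Jensen-Shannon distance, i.e. the square root of the divergence; the sketch squares it to obtain the divergence, and whether the reported values correspond to the distance or the divergence is not stated here.

```python
from scipy.spatial.distance import jensenshannon


def phenotype_jsd(freq_p, freq_q):
    """Jensen-Shannon divergence (base 2) between two state-frequency dicts."""
    # Align both distributions on the union of observed states;
    # states missing from one distribution get frequency 0.
    states = sorted(set(freq_p) | set(freq_q))
    p = [freq_p.get(s, 0.0) for s in states]
    q = [freq_q.get(s, 0.0) for s in states]
    # jensenshannon gives the distance (square root of the divergence).
    return float(jensenshannon(p, q, base=2) ** 2)


identical = phenotype_jsd({"01": 0.5, "10": 0.5}, {"01": 0.5, "10": 0.5})
disjoint = phenotype_jsd({"01": 1.0}, {"10": 1.0})
print(identical, disjoint)
```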

### Random Network Generation

Random networks were generated uniformly across the set of all connected networks. We generated 100 random networks for each size, defined by the number of nodes in the network *n* = 4 − 10. The total number of edges in the network is decided before generation (*m*) with the constraint *m* ≥ *n* − 1, and picked uniformly from the binomial distribution:

First a random spanning tree (a graph on the *n* nodes without any cycle with *n* − 1 edges) is generated by picking edges uniformly at random, and adding them if they don’t cause a cycle, ensuring that each node is connected with another node in the network. The remaining edges (*m* −*n* + 1) are then added randomly until the desired total number of edges is reached. The random networks generated for calculating perturbation JSD have an additional condition: each node of the network must have an outgoing edge. This is to ensure that there are no output nodes that unduly change the JSD despite the core network dynamics not changing.
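The two-stage construction described above (spanning tree first, remaining edges afterwards) can be sketched as follows. This is an illustration under several assumptions that are not stated in the text: edges are directed, self-loops are allowed among the additional edges, edge signs are chosen uniformly at random, and the function name `random_connected_network` is hypothetical. The sketch does not claim to sample uniformly from all connected networks.

```python
import random


def random_connected_network(n, m, rng):
    """Random signed directed network on n nodes with m edges.

    Returns a list of (source, target, sign) tuples whose underlying
    undirected graph is connected.
    """
    if m < n - 1 or m > n * n:
        raise ValueError("need n - 1 <= m <= n * n")

    parent = list(range(n))  # union-find structure to detect cycles

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    edges = {}
    # Stage 1: random spanning tree; an edge is kept only if it joins
    # two different components (so it cannot close a cycle).
    while len(edges) < n - 1:
        u, v = rng.randrange(n), rng.randrange(n)
        root_u, root_v = find(u), find(v)
        if root_u != root_v:
            parent[root_u] = root_v
            edges[(u, v)] = rng.choice((1, -1))

    # Stage 2: add the remaining m - n + 1 edges at random.
    while len(edges) < m:
        u, v = rng.randrange(n), rng.randrange(n)
        if (u, v) not in edges:
            edges[(u, v)] = rng.choice((1, -1))

    return [(u, v, sign) for (u, v), sign in edges.items()]


net = random_connected_network(5, 8, random.Random(1))
print(net)
```

The additional condition for perturbation JSD (every node has an outgoing edge) could be enforced by regenerating networks until it holds.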

The network sizes (number of nodes) we analyze for each criterion of robustness are as follows:

- Avg. perturbation JSD: 4 − 10
- Avg. fold change in plasticity (dynamic): 4 − 8
- Avg. parameter variation JSD: 4 − 8
- Avg. fold change in plasticity (structural): 4 − 5
- JSD between RACIPE and Continuous model: 4 − 8

Analysis for larger networks could be performed for criteria that only use the Continuous model, but those that also use RACIPE faced computational limitations, especially structural robustness in plasticity, as multiple perturbed networks need to be simulated for each network topology.

### Percent error in Phenotypic distribution

In order to estimate the degree of variance in distribution across multiple simulations, the following metric was employed. For a given network *N*, suppose it has stable state frequencies *S*(*i*). The simulation is then done with 10 different runs, from which the error in *S*(*i*), *E*(*i*) is found for each state by finding the standard deviation. The percent error for that simulation is then defined to be

This is repeated in triplicate, and the average value of *P* is reported as the percent error.

### Simulation Parameters

Unless mentioned otherwise, the number of simulations used to estimate phenotypic distributions or plasticity in the Boolean/RACIPE/Continuous formalisms is chosen to be 10000 times the number of nodes in the network. Error bars are obtained by performing the simulations in triplicate. RACIPE was run with the default parameters, with only the number of parameter sets being varied across networks. The topofiles for each network analyzed, as well as the .ids files containing the order of nodes in the state, are present in the GitHub repository.

### Statistical tests and functions

All correlation analysis was done using the Spearman correlation method using the scipy.stats.pearsonr module. The corresponding statistical significance values are represented by ‘^{∗}’s, to be translated as: ^{∗} : *p <* 0.05, ^{∗∗} : *p <* 0.01, ^{∗∗∗} : *p <* 0.001. The unpaired t-test for the violin plots was performed using the scipy.stats.ttest_ind function, with significance being reported as above. Regression bands (95% CI) for correlation plots were plotted using seaborn.regplot.

## Supplementary Figures

## Acknowledgements

AH and AM are supported by the KVPY fellowship awarded by the Department of Science and Technology (DST), Government of India. KH is supported by the Prime Minister's Research Fellowship (PMRF). MKJ is supported by the Ramanujan Fellowship (SB/S2/RJN 049/2018) awarded by the Science and Engineering Research Board (SERB), DST, Government of India. Mr. Atchuta Srinivas Duddu is acknowledged for artwork **(Fig 1a,e, Fig 3a)**.