To analyze the robustness of Twitter vs. Gab to targeted attacks, that is, the structural consequences of the removal of certain users from the network according to specific criteria, we had to select a topic that would provide the criterion to identify users in terms of their participation in the online debate, of the inflammatory content of their posts, of their structural position in the online community, and so on. We chose to focus on posts referring to the current president of the United States of the America, Donald Trump, and containing the following keywords: ‘trump’, ‘potus’. The choice of Donald Trump as the reference topic is due to the high salience and popularity of this public figure in online social media debates, which are typically characterized by highly polarizing and inflammatory content, thus making of it an ideal test bed for our analysis^{38}.

For Gab we consider data from a time window spanning 3 months of 2018, gathering a total of almost 450k posts, has been made publicly available from https://pushshift.io/. As Twitter is much more populated than Gab and characterized by much higher volumes of online activity, here we collected content in a time window of two months in 2018 (overlapping with Gab’s time window), gathering a total of almost 45M posts. More detailed information and technical specifications may be found in the “Methods” section.

To reconstruct the behavioral networks from the online social platforms, we represent users as nodes and interactions between users as links of a social network. For both Twitter and Gab, we considered two different types of networks, capturing different facets of social interactions. The first one is replies: whenever a user replies to the message of another user, a link between the two respective nodes is established. The second one is mentions: whenever a user mentions another user in one message, a link between the two respective nodes is established. We consider the networks as un-directed, i.e., no information is encoded about who is the sender and who is the receiver of the messages, and unweighted, i.e., no information about the number of interactions. These approximations are appropriate as far as one is mainly interested in the potential pathways through which information could flow after applying the removal of users. So the assumption is that, although there is always a source and a receiver in the communication, when both users interact they are symmetrically aware of the interaction and of the existence of each other. If, on the contrary, one is interested in the most probable information pathways or on influence maximization problems, then the direction and weights of the links should not be avoided. For the sake of simplicity, we also consider the networks time-aggregated, i.e., the information about the timing of the interactions is not taken into account.

Replies and mentions correspond to two different aspects of social interaction^{39}. In the case of replies, users are engaged in an active conversation between them, which can also be unilateral if one user responds but the other does not in turn. In the case of mentions, one user is pointing attention of other users toward a third user, but the mentioning and mentioned users need not be engaged in a direct conversation between themselves. Replies are therefore part of a dyadic interaction as in a typical conversation (where clearly one user may establish several conversations at the same time if multiple other users reply to his/her posts), whereas mentions are typically part of a multi-lateral conversation that may intrinsically involve several users at the same time, and even be targeted to reach an indefinite number of users.

On this basis, it is possible to construct analogous networks for Twitter and Gab users replying to, or mentioning, other users in messages that contain Donald Trump related hashtags. The sizes of the networks are *N* = 7,103 and 19,719 for the Gab replies and mentions, respectively, and *N* = 1,429,509 and 3,476,066 for the Twitter replies and mentions, respectively. The purpose of our analysis is to investigate to what extent such networks are resilient to different types of attacks consisting of the removal of a number of users with specific characteristics, on the basis of the different nature and characteristics of the two online social networks in terms of moderation.

### Robustness and topological properties

We study the robustness of the reconstructed networks by means of percolation theory^{40}. The basic procedure is to delete nodes according to a given criterion and, as the process unfolds, compute several properties of the damaged system. In the context of online social networks, node deletion can be identified, as already hinted, with the banning or temporary inhibition of a specific user, see Fig. 1. A good topological proxy to assess the robustness of the network is the largest connected component (LCC), which is the largest connected sub-network remaining after user removal. When the normalized size of the LCC, *S*, is close to 0, the network is completely disintegrated in many small clusters, and therefore there is no possibility to observe propagation of information at a global scale. On the contrary, when *S* is close to 1, the removal of nodes barely affects the overall topology, and information can potentially flow between almost every pair of actors. The passage from (S ne 0) to (S = 0) is called the percolation phase transition^{41}, and the exact value of the fraction of removed nodes for which the size of the LCC becomes null is called the percolation point. The theory of percolation assumes that we deal with infinite systems, hence the percolation phase transition and the percolation point are well-defined only in this regime. In a finite system, though, the size of the largest connected component is exactly 0 only when all nodes are removed. Although the drop from (S sim {mathcal {O}}(1)) to (S simeq 0) tends to be relatively abrupt and localized in finite systems, there is no general criterion to identify the percolation point directly from the size of the LCC. An alternative and accurate method, which is the one we employ, is to identify the percolation point with the fraction of removed nodes for which the size of the second largest connected component is maximum. Roughly speaking, a network is considered robust when it can handle a large amount of node removals without being disintegrated.

The most basic procedure to study network robustness is the random removal of nodes, which is equivalent to the classic percolation process with degree heterogeneity^{42}. Random attacks are known to be a poor strategy to break a network, and especially when the degree distribution is broad^{43, 44}, which is a hallmark of social networks^{45}. This is because random node selection will pick with high probability low-degree nodes, which do not play any significant role in keeping the network connected. A more effective strategy to destroy a network consists in selecting nodes by degree, and remove those with the highest degree first. In this case, the percolation point is significantly reduced^{43, 46}. This response—weak to targeted attack but strong to random failures—has been dubbed the robust-yet-fragile effect^{47}.

As an alternative to random attacks, here we perform degree-based attacks in our reconstructed networks (Fig. 1B,C), thus targeting structurally prominent users rather than randomly picked ones. To implement such attacks, we use an adaptive scheme: after each removal, we recalculate the degrees and recursively choose the node with the highest degree to be deleted next. For all networks we find, as expected, a relatively low percolation point, i.e., low robustness, a result that agrees with the idea that actors in social networks are heterogeneously connected among each other. Another quantity that is heterogeneously distributed is the size of the connected components at the percolation point. Their sizes are shown as bubble plots in Fig. 1B,C, where the radius of each bubble is proportional to the logarithm of the number of actors in the components. The cluster size probability density function is shown in Fig. 2A. These plots indicate that at the precise moment the communication in our social networks cannot be held global anymore, it is very likely to find large clusters with sizes that are very far away from the mean cluster size. Indeed, the distribution turns out to be power-law (p(s)propto s^{-alpha }), with exponent (alpha) smaller than 3, a signature of infinite variance. This property only holds at the percolation point, whereas away from it the finiteness of the variance is recovered due to an exponential cutoff in the component size distribution.

We observe that the presence or absence of content control in the online social network does not induce a clear robustness pattern. Indeed, in the network of replies the percolation point for Gab is larger than the one for Twitter, whereas in the network of mentions the result is reversed. This leads us to conclude that content policies cannot be directly correlated to robustness assessments in online social networks. To better understand the response to degree-based attacks, we need to shed light on the topological properties of the networks. The first thing to note is that since for uncorrelated networks, i.e., those networks where the degree of an actor is independent of her neighbor’s degree, the percolation point is known to only depend on the first and second moment of the degree distribution^{48}, and the degree distribution for all networks we considered is very similar (Fig. 2B), the reason for the variability in the robustness must therefore be hidden in topological correlations. A topological correlation that is known to affect the percolation point is the mixing assortativity^{49}, known as homophily in the social sciences^{50}, which reflects the tendency of actors to interact with peers with similar characteristics. At the topological level we have the degree assortativity, which is positive when nodes are mainly connected to other nodes of similar degree, and is negative when the opposite is found. A broadly used measure to capture the level of homophily is the assortativity coefficient^{45}

$$begin{aligned} r = frac{langle q q’ rangle _l – langle q rangle _l langle q’ rangle _l}{langle q^2 rangle _l – langle q rangle ^2_l}, end{aligned}$$

(1)

where (langle cdot rangle _l) denotes average over all links, and *q* and (q’) are the excess degrees of the nodes at the end of a link, that is, the number of edges attached to a node other than the one we reached out to. *r* is nothing else than the Pearson correlation coefficient of the degrees at either end of an edge, and is normalized within the interval ([-1, 1]). The assortativity coefficient is appealing because it encodes the correlations into a single number, which is usually employed to infer the robustness of the network. The percolation point of a network with positive (negative) assortativity is higher (lower) than the percolation point of a network with (r=0), assuming the same underlying degree distribution in all cases^{51}. The larger the assortativity coefficient (in absolute value), the greater the separation with respect to the percolation point of (r = 0). In our case, we find that for the network of replies (r_{gab} = -0.16 < r_{twitter} = -0.05) and for the network of mentions (r_{gab} = -0.26 < r_{twitter} = -0.04) (Equation 1 is not a useful expression when it comes to compute the assortativity coefficient directly from degree sequences. We have used, instead, a much more convenient expression, namely, (r = frac{sum _{ij} (A_{ij} – k_i k_j/2m)k_i k_j }{sum _{ij} (k_idelta _{ij} – k_i k_j/2m)k_i k_j})). The values of the assortativity coefficient agree well with the robustness patterns in the networks of mentions, but not with those in the network of replies. Hence, the information brought by *r* is not sufficient to successfully explain the response of our system to degree-based attacks.

The implicit assumption behind the assortativity coefficient is that the degree correlation—the mean degree of the neighbors of all degree-*k* nodes (langle k_{nn}(k) rangle = sum _{k’} k’ P(k’|k)), where (P(k’|k)) is the conditional probability that following a link of a *k*-degree node we arrive at a degree-(k’) node—has a linear dependence on the degree, with slope *r*. If (langle k_{nn}(k) rangle sim r k) does not apply, the value (and even the sign) of *r* can be misleading, as well as the correct interpretation of the location of the percolation point as a function of *r*. We show in Fig. 2C,D that the linear assumption does not hold, although in all cases we observe a monotonically decreasing tendency, concluding that all networks are dis-assortative. One could argue that in the network of replies, Twitter’s (langle k_{nn}(k) rangle) decays faster than in Gab, thus ensuring a higher robustness to the latter. This very same argument, though, fails when applied to the network of mentions.

At this point we must consider higher-order topological correlations to have a satisfactory explanation of why the mention networks do not respond as expected from the assortative profiles. To this purpose, we use the *k*-shell decomposition^{52}, which conveys information about the hierarchical structure of networks. The concept of *k*-shell is intimately related to the one of *k*-core^{53, 54}, which is the sub-network that survives after removing all nodes with degree less than or equal to *k* from the original network, see Fig. 3A. The *k*-shell is the subset of nodes that belong to the *k*-core but not to the ((k+1))-core. A network representation in *k*-shells offers a qualitative way to assess the connectivity and clustering inside *k*-shells, and the degree-shell correlation, i.e., how hubs are central with respect to their *k*-coreness, see^{55} for further details. In *k*-shell decomposition one plots a series of concentric circumferences, the smaller the diameter, the larger the shell index. On each circumference one places as many markers as nodes belonging to that shell index, with the marker size proportional to the node’s degree. We can indicate the nodes that were connected in the original graph by grouping them together on the circumference, hence leading to a heterogeneous angular distribution of nodes. The most important information for our discussion is the fact that hubs, indicated by larger markers in the first rows of Fig. 3B,C, are mainly located in the largest *k*-shell for both Twitter networks, and in the replies of Gab. For Gab mentions, on the contrary, there is a larger density of hubs across *k*-shells (first row of Fig. 3B). The presence of hubs in different shells denotes a low degree-coreness correlation; put otherwise, there is an important number of highly connected nodes at the periphery of the network.

In order to quantitatively support the qualitative observation of the degree-coreness correlation, we propose here to inspect the connections among *k*-shells. Let *K* be the maximum *k*-shell index. We can compute a symmetric matrix of dimension (K times K), whose elements (*i*, *j*) are the number of links between nodes of *k*-shells indexes *i* and *j*. By plotting this matrix as a heat map different patterns can emerge, and they can be interpreted as an indicator of the tendency of nodes to communicate with their peers of same level of *k*-coreness, hence giving a global idea of how centralized is the communication in the network that cannot be captured by the assortativity or degree correlations. See Fig. 3A for further details, where we distinguish among possible behaviors. Thus, we call centralized networks those networks in which most of the nodes in low and intermediate *k*-shells are connected only to those nodes in the largest *k*-shell. Likewise, a decentralized network is characterized by nodes of all *k*-shells connected among them with a similar density of links. We plot in the last row of Fig. 3B,C such heat maps. In both Twitter networks and in Gab replies we find that almost all inter-*k*-shell interactions occur between the nodes in the largest *k*-shell and the others, i.e., they are centralized. Surprisingly, the heat map of Gab mentions is much more homogeneously populated, indicating that each *k*-shell has a significant portion of links to distribute, and that this distribution covers all the other shells with no particular preference, i.e., the network is decentralized. In light of this inter-*k*-shell connectivity, we can explain why the Twitter mention network is more robust than the Gab one. Since in the former all large-degree nodes are very interconnected among them in the largest *k*-core, deleting them leads to a slow disintegration of the network. In the latter case, the hubs are not so well compacted in the largest *k*-core, but decentralized across different levels in the topological hierarchy, therefore deleting them by degree leads to an easier network disintegration.

### Sentiment-based attacks

So far we have analyzed the robustness of the Twitter vs. Gab networks to attacks that are based on the degrees of users in the online network, that is, on targeting the most relationally prominent individuals. Attacks of this kind test the topological properties of the network, and they are known to effectively dismantle the underlying graph, but are at the same time totally blind to any users metadata. However, there can be situations, and this is especially the case in the context of social networks, in which we would like to assess the response of the system to the removal of certain type of users, e.g., fake news generators or hate spreaders. In this section we discuss the effects of carrying out attacks based on the sentiment of the users, as obtained from the body of the messages they write. We will consider only the network of replies in this case, because it has a much higher number of active users with well-defined sentiment than the network of mentions. We characterized the amount of user’s text emotional content by a text-based analysis that distinguish three different components of emotions, namely valence, arousal and dominance (see “Methods” section). We leave for the “Supplementary Material” the same analysis for other three sentiment classifiers, where we reach identical conclusions to the ones presented below.

We perform the attacks by sorting nodes with decreasing intensity of the (numerical indicators of the) different sentiments, and deleting the nodes following that order. In other words, at each round we delete those users whose posts are the most emotionally charged (i.e., potentially most inflammatory) among those still present in the network. We display the results of this process in Fig. 4A,B. The most striking result is that the percolation point is quite large, that is, one needs to remove practically all actors to break apart the main component. Moreover, the percolation curves are similar to the typical response of a scale-free network subject to random attacks, therefore indicating that either most of the extreme sentiments are located in the low-degree regime, or that, at least, there is no significant correlation between the position of a user in a network and the sentiment of the messages s/he writes.

Figure 4C sheds some light on the interplay between topology and sentiment. We can observe that, indeed, the most extreme sentiments values are located in the area of low degrees, which explains why the sentiment-based attacks do not result in a very effective dismantling of the networks. Another particularity of the sentiment-topology correlation is that the sentiment score becomes independent of the degree as *k* grows. This is somewhat surprising, as one would expect non-trivial effects of the network structure on the sentiment distribution. A plausible explanation of this result is that, as users become more active (larger degree), their sentiment values, which are computed over all their written posts, average out resulting in a neutral sentiment value.

An important consequence of this particular sentiment-topology correlation is that when it comes to protect a network against certain type of content or type of users, blind removal proves to be a very efficient strategy at a global level, in the sense that guarantees the potential exchange of information between all the remaining pair of nodes. However, if the goal is to use sentiment information to achieve an efficiently network dismantling, the implemented strategy must be combined with some topological information. One of the options is to sort from highest to lowest the users according to a sentiment, but ignoring those below a certain degree threshold (k_{thr}). That is, for example, remove users that generate hateful content as far as these users are active above a certain threshold. With this simple modification, we see a significant decrease in the percolation point, i.e., the networks become less robust. The percolation curves for different values of (k_{thr}) are shown in Fig. 5. We indeed see that passing from (k_{thr} = 0) to 1 already reduces the percolation point by about (45%) in Gab and (40%) in Twitter, while passing from (k_{thr} = 0) to 2 leads to a reduction of about (61%) in Gab and (56%) Twitter.