Study sites and animals
Acoustic and behavioural recordings were performed on 30 Southern white rhinoceroses (Ceratotherium simum simum) that were kept in seven groups at different zoological institutions: Zoo Osnabrück (N = 4), Zoo Augsburg (N = 4), Zoopark Erfurt (N = 3), Serengeti-Park Hodenhagen (N = 9), ZOOM Gelsenkirchen (N = 3), Zoo Schwerin (N = 3) and Zoo Münster (N = 4). All animals were observed in their outside enclosures with the whole group. Each group consisted of one adult bull, two to four adult cows and in two zoos there were additional juvenile and subadult animals (Zoo Münster: one calf, Serengeti-Park Hodenhagen: three calves and one subadult female; Table 1).
Video and audio recordings of the animals were taken simultaneously over an average period of 19 observation days at each zoo. All individuals were recorded using focal animal sampling56. Each focal individual was observed for ten minutes in a randomised order within groups, resulting in 20–40 min of observation time per individual distributed between 8 am and 6 pm. In total, 250 h of data were recorded and analysed: 40 h at Zoo Osnabrück (April 2014), 40 h at Zoo Augsburg (July/August 2014), 33 h at Zoopark Erfurt (April/May 2015), 37 h at Serengeti-Park Hodenhagen (April/May 2015), 30 h at ZOOM Gelsenkirchen (August/September 2015), 30 h at Zoo Schwerin (April/May 2018) and 40 h at Zoo Münster (July/August 2018).
Video recordings were made using a digital camcorder (Sony DCR-SR36E, Sony Corporation, Tokyo, Japan). Audio recordings were made using an omni-directional microphone (Sennheiser MKH 8020, Sennheiser electronic GmbH & Co. KG, Wedemark-Wennebostel, Germany; flat frequency response from 10–20,000 Hz ± 5db) that was equipped with a wind shield and a boom pole. The microphone was connected to a digital recording device (Sound Devices 702 T State Recorder, Sound Devices LLC, Reedsburg, USA; frequency response: 10–40,000 Hz; settings: 44.1 Hz sampling rate, 16 Bit, uncompressed .wav format).
Vocal and behavioural analysis
Video recordings were synchronised with respective audio recordings and analysed using the Observer XT software (version 12, Noldus Information Technology, Netherlands57). The analysis was conducted by three different observers (CL: Erfurt, Gelsenkirchen; BC: Osnabrück, Hodenhagen, Schwerin; JJ: Augsburg, Münster). The Cohen’s Kappa coefficient was determined among the observers by comparing 15 pilot observations (total of 100 min). All Ƙ values were ≥ 0.95, indicating a high interrater reliability58.
Vocalisations were detected by auditory identification and categorised according to the literature46,52,53. For each vocalisation, the respective call type and sender were noted. Vocalisations that due to ambient noises could not be reliably assigned a call type or a sender were excluded from the analysis. For a sufficient data basis, only call types that were uttered by at least 50% of adult senders were considered for further analysis. Consequently, the following four call types were included (Fig. 1): Hiss (N = 1,169), Grunt (N = 75), Pant (N = 83) and Snort (N = 1,027).
For each behaviour proximity measurements were coded, defining the distance of the focal animal to present group members. Due to practical as well as methodical reasons, adult body length (2.5–3 m46) was taken as a measuring unit. At one adult body length the distance could be reliably detected on video recordings, while allowing immediate social interactions between group members, therefore enabling the recording of behavioural context. Two distance categories were established: (1) close proximity (≤ 1 body length) and (2) distant proximity (> 1 body length). This differentiation ensured a reliable distinction that was not impaired by the distortion of the camera angle or by neighbouring individuals at great distances not being caught by the camera. For each focal animal the duration they spent in close proximity (≤ 1 body length) to each group member was noted. All group members in close proximity to the sender during vocalisations were considered as potential receivers.
For each focal animal the occurrence of affiliative, aggressive and defensive interactions (see ethogram in Table 2) and the respective interaction partners were noted. Affiliative interactions included social exploration of the interaction partner as well as socio-positive and sexual behaviour. Aggressive interactions were coded when the focal animal displaced, attacked, chased, pushed or clashed horns etc. with the interaction partner whereas defensive interactions were coded when the focal animal avoided or escaped from the interaction partner.
All statistical tests were calculated in RStudio (version 3.5.5,59). The significance level was set at p ≤ 0.05, p < 0.1 was considered a statistical trend.
Communication networks and directionality
We constructed vocal communication networks for each call type in all groups based on the dyadic call rates for each dyad in the respective group. As the number of calls emitted to a potential receiver was dependent on the time the dyad spent in close proximity (≤ 1 body length) to each other, the daily dyadic call rate was calculated by dividing the number of calls the focal animal uttered to a potential receiver by the total duration the focal animal spent in close proximity (contact time) to the potential receiver on each observation day. Based on this, the dyadic call rate was calculated as a mean of the daily dyadic call rate.
Dyadic call rates were converted into directed adjacency matrices that were used to generate communication networks for each call type in all groups. Networks were drawn in Gephi (version 9.0.2,60). In the networks each individual was represented by one node and the ties between them indicated how the nodes related to one another. Nodes were set to a random position. The thickness of the ties represented the dyadic call rates, while the colour of the ties corresponded to the colour of the sender. In a network, a node indegree was defined as the total number of ties (group members) that was directed towards an individual, while the node outdegree was the total number of ties that originated from the respective individual22,61,62. Correspondingly, weight indegree was considered the sum of all the ties’ weights (sum of dyadic call rates) that were directed towards an individual, while the weight outdegree was the sum of all the ties’ weights that originated from the respective individual61,62.
Communication networks included all animals available in the groups. However, for further analysis we excluded juvenile individuals (≤ 2 years, N = 4), as we expected sex-specific differences to start developing strongest in individuals that were weaned and not depending on the mothers anymore.
To investigate communication directionality, a Pearson correlation between the node in- and outdegrees was performed for each call type using the ‘cor.test’ function. Non-significant correlations were categorised as asymmetrical directionality and significant correlations as symmetrical directionality.
Sex-specific sender differences
To assess sex-specific differences in general vocal activity, we calculated the overall call rate for each focal animal and for each call type by dividing the number of calls by the total observation time of the respective focal animal.
To investigate sex-specific differences in the proportion of incoming and outgoing call parameters, we calculated the ratio between node in- and outdegree (ION) and the ratio between weight in- and outdegree (IOW). ION was calculated by the number of node indegrees minus the number of node outdegrees divided by the sum of node in- and outdegrees. Correspondingly, IOW was calculated in the same manner by using the weights of the ties. Index values ranged between − 1 and + 1, with 0 indicating equal distribution of incoming and outgoing parameters. Accordingly, in individuals with negative index values, the outgoing degree was higher than the incoming degree, while in individuals with positive index values, the outgoing degree was lower than the incoming degree (Supplementary 1).
Sex-specific sender differences in overall call rates as well as in ION and IOW indices were determined by calculating linear mixed effects models (LMEs, ‘nlme’ package, ‘lme’ function,) using ‘sex of the sender’ as predictor variable while controlling for ‘zoo’ as random factor. Residuals were calculated for all LMEs performed in the study (‘resid’ function) and subsequently verified for normality as well as for homogeneity of variances.
Effects of dyad type, quality of social interaction and association strength
To determine the quality of social interactions, interaction rates for affiliative, aggressive and defensive social interactions were calculated for each focal animal-interaction partner dyad in a group. Thus, daily interaction rates were calculated by dividing the number of a) affiliative, b) aggressive or c) defensive interactions of the focal animal with an interaction partner by the total observation time of the focal animal on that day. Based on this, the interaction rate was calculated as the mean of the daily interaction rates.
To assess the association strength (AS) between dyad partners as a measurement for the level of social cohesion, the daily association index was calculated for each dyad by dividing the sum of the duration animal A spent in close proximity to animal B and animal B spent in close proximity to animal A by the sum of total observation times of both animals on that day. Based on this, AS was calculated as the mean of the daily association indices. AS values ranged from 0 to 1 with 1 indicating the strongest possible association, the dyad spending 100% of their observation time in close proximity to each other.
We further investigated whether dyadic call rates can be predicted by the sex composition of the dyad (dyad type: female-female (N = 40), female-male (N = 19), male–female (N = 19)), the quality of social interactions (affiliative, aggressive, defensive) or the association strength between both dyad partners. To determine this, LMEs were calculated for each call type using the dyadic call rate as response variable, two-way interactions between dyad type and social interaction type or between dyad type and association strength as predictor variables (dyad type*affiliative interaction rate, dyad type*aggressive interaction rate, dyad type*defensive interaction rate, dyad type*association strength), while controlling for “sender” and “zoo” as random factors. The best fitting model (final model) was determined via backward stepwise elimination procedure (‘car’ package, ‘Anova’ function). If the interaction term was not significant, only the main factors remained in the final model. To investigate significant effects of the main factor dyad type, a comparison across the three dyad types was conducted (‘lsmeans’ package, ‘lsmeans’ function, ‘Tukey’ adjustment for multiple comparisons). If an interaction term turned out to be significant, a break down analysis was performed by splitting the dataset according to dyad type. In the result section, only final models were reported along with slope β values indicating the coherence between call rate and the quality of social interaction or association strength.
The article contains only observational data of zoo animals during their daily routine. No animal was taken out of its usual environment or physically manipulated by the authors. The authors received permission to record the animals’ data on the ground of the respective zoo.