We designed a novel version of a single-shot trust game, the Information Sampling Trust Game (ISTG). Participants (*n* = 37, of whom 12 men; age *M* = 22.95, *SD* = 3.71, range = 18–34) completed the ISTG in the investor role. On each trial, participants were endowed with €6, which they could either keep or invest in another player (the trustee). Not investing had no consequence and led to the next trial. Upon investing, the trustee received the endowment multiplied by 4 (€24) and would subsequently decide to either reciprocate (50–50 split) or defect (keep all €24). Crucially, before deciding whether or not to invest, participants were given the opportunity to sequentially gather information about a trustee’s previous reciprocation history by turning tiles in a 5 × 5 grid (Fig. 1). If a tile turned green, the trustee had reciprocated money to a previous investor; if it turned red, the trustee had defected against another previous investor. Deciding to either invest or not invest led to a new trial. Participants were told that there was a different trustee on each trial, that the ratio of red to green tiles could therefore vary between trials, and that the location of the tiles was not informative. Investment outcomes were not shown during the task. Unknown to participants, each trustee was computer-generated. The probability of a green tile was an independent draw from a Bernoulli distribution with parameter *r*, which was pseudo-randomly drawn from six values (0.0, 0.2, 0.4, 0.6, 0.8, and 1.0). The task consisted of 240 trials (10 per reciprocation probability in each of the four conditions), amounting to a maximum of 240 × 25 tiles = 6,000 sampling decisions.
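As an illustration, the generative process behind one trustee's grid can be sketched as follows (Python; the function name and seeding are ours, not part of the task software):

```python
import random

def simulate_trustee_grid(r, n_tiles=25, seed=None):
    """Simulate one trustee's 5 x 5 reciprocation-history grid.

    Each tile is an independent Bernoulli draw with parameter r:
    True (green) means the trustee reciprocated to a past investor,
    False (red) means the trustee defected.
    """
    rng = random.Random(seed)
    return [rng.random() < r for _ in range(n_tiles)]

# A fully trustworthy trustee (r = 1.0) yields only green tiles,
# and a fully untrustworthy one (r = 0.0) only red tiles.
assert all(simulate_trustee_grid(r=1.0))
assert not any(simulate_trustee_grid(r=0.0))
```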

The experiment consisted of four conditions that were varied within subjects: a monetary cost of sampling (either costly or cost-free), crossed with social context (overt or covert). In the overt conditions, we told the participant in the instruction phase that after the experiment, we would randomly select three trials. For the subset of these three trials on which the participant had decided to invest, we would contact the corresponding trustees and tell them how much the participant had sampled; then, the trustees would decide to either reciprocate or defect. We told the participant that their final pay-off would be the average of their earnings on the three selected trials. In reality, the trustees’ decisions were simulated using their respective reciprocation probabilities. All methods were carried out in accordance with the Declaration of Helsinki and approved by the local ethics committee (Commissie Mensgebonden Onderzoek regio Arnhem-Nijmegen 2014/288). Informed consent was collected from all participants.

### Descriptive statistics

The main focus of our study was on the computational models, which are cognitive process models that account for the data in a theoretically motivated manner. However, in this section we first use descriptive statistics to characterize the general behavioural patterns in the data before we report the computational modelling results in the sections below. To this end, we used a mixed-effects model (see supplement Table S1 for the full model) assessing the effects of monetary cost, social context, outcome uncertainty (the variance of the Bernoulli distribution, *r*(1-*r*)), and valence-dependency (*r*) on the number of samples. There was a significant interaction between monetary cost and social context (coefficients reported as mean ± SEM; β = 0.25 ± 0.02, *p* < 0.001); participants sampled less when sampling was overt and monetarily cost-free (β = 0.33 ± 0.01, *p* < 0.001) and when sampling was overt and costly (β = 0.07 ± 0.01, *p* < 0.001). As expected, people sampled more when the outcome uncertainty was larger (β = 1.26 ± 0.05, *p* < 0.001), i.e., they sampled more when the acquired information was relatively inconclusive (as is the case when *r* is closer to 0.5, Fig. 1C). Outcome uncertainty interacted with the reciprocation probability *r* (β = 0.55 ± 0.19, *p* = 0.003), suggesting that people sampled more when both outcome uncertainty and reciprocation probability were high. We further explored whether the number of samples drawn depended on valence by using Bonferroni-corrected Wilcoxon signed-rank tests between symmetric pairs of generative probabilities. This revealed that people sampled more when *r* = 0.8 than when *r* = 0.2 (median difference = 0.775, *p* < 0.001), but not for *r* = 0.4 compared with *r* = 0.6 (median difference = 0.350, *p* = 0.11) or for *r* = 0 compared with *r* = 1 (median difference = 0.40, *p* = 0.013).
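The outcome-uncertainty regressor can be made concrete with a minimal sketch: the Bernoulli variance *r*(1-*r*) peaks at *r* = 0.5 and vanishes at the extremes, matching the finding that people sampled most when information was least conclusive.

```python
import math

def outcome_uncertainty(r):
    """Variance of a Bernoulli(r) outcome: r * (1 - r)."""
    return r * (1 - r)

# Evaluate at the six generative probabilities used in the task.
probs = (0.0, 0.2, 0.4, 0.6, 0.8, 1.0)
u = [outcome_uncertainty(r) for r in probs]

assert math.isclose(u[2], u[3])      # symmetric around r = 0.5
assert u[2] > u[1] > u[0] == 0.0     # uncertainty peaks near 0.5
```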

We then used a separate logistic regression to test whether the decision to invest was predicted by *r* and the conditions. This regression returned a coefficient of β = 9.12 ± 0.18 (*p* < 0.001) for *r*, indicating that the probability of investing increased with higher *r*. This confirms that the acquired information was used in the investment decisions. Monetary cost, social context, and their interaction were not significant predictors of the decision to invest (monetary cost: β = -0.023 ± 0.095, *p* = 0.81; social context: β = 0.011 ± 0.096, *p* = 0.91; interaction between monetary cost and social context: β = 0.113 ± 0.135, *p* = 0.40).

After task completion, participants were asked to indicate whether they believed that when sampling was overt to the trustee, more sampling made reciprocation more likely, less likely, or would remain the same. Believing that overt information sampling would make reciprocation less likely was the most commonly reported answer (test for non-uniformity: χ^{2}(2, *n* = 37) = 19.28, *p* < 0.001, Fig. 1D). Some subjects who responded in the less common category (that overt sampling would increase the probability of reciprocation) in fact also sampled less in overt compared to covert conditions (Fig. S1). Self-reports alone are therefore not sufficient to study the effects of overt sampling. Overall, the sampling data and the self-reports together suggest a social sampling cost of potentially leaving a negative impression on the trustee. This is consistent with the intuition that if someone continually checks up on our reliability, it may make us less likely to behave pro-socially with that person in the future.

### Computational models

To understand the cognitive mechanisms by which people sample information, and the process by which overt sampling affects this mechanism, we fitted normative and heuristic computational models. Each model describes a different process for how sampling is affected by the social and monetary cost conditions. We describe the intuition for these models and their differences here (see Supplement for formal descriptions). The first three models are Bayesian in the sense that the agent computes a posterior belief distribution over the trustee’s probability of reciprocation (Fig. 2).

We first consider a model which, analogous to the self-reports, assumes that every sample reduces the reciprocation probability by a constant factor. We refer to this model as the *Cost of Negative Impressions* (CNI) model. The model is normative in the sense that it maximizes expected utility. For every possible state (combination of red and green tiles), the agent uses the posterior belief to calculate the expected utility of every action: sampling, investing, and not investing. We derived the expected utilities of all state-action pairs using the Bellman equations and dynamic programming^{8}. In the overt sampling conditions, the value of investing incorporates the factor ω by which the agent believes the trustee’s reciprocation probability decreases with each sample drawn (two free parameters: one for overt sampling with a monetary cost and one for overt sampling without). The model also accounts for the immediate subjective cost of sampling, *c* (two free parameters: one for the monetary cost conditions and one for the cost-free conditions, as the latter cost may still be non-zero owing to the time and effort that sampling takes). We allow for two deviations from optimality suggested by the (iterated) trust game literature: subjective prior beliefs^{9} and betrayal aversion^{10}. Adding these parameters improved the CNI model’s fit to the data (Table S2).
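The backward-induction computation can be sketched in a few lines. This is a simplified illustration, not the fitted model: it assumes a uniform Beta(1, 1) prior and the task payoffs from the text (keep €6, or invest and receive €12 if the trustee reciprocates, €0 otherwise), while the impression factor ω, risk attitude, and subjective priors are omitted; the cost value `c` is arbitrary.

```python
from functools import lru_cache

N_TILES = 25
KEEP, RECIPROCATE, DEFECT = 6.0, 12.0, 0.0

def optimal_value(c=0.05):
    """Return the Bellman value function over states (green, red)."""
    @lru_cache(maxsize=None)
    def value(green, red):
        # Posterior mean of r under a Beta(1 + green, 1 + red) belief.
        p = (1 + green) / (2 + green + red)
        v_invest = p * RECIPROCATE + (1 - p) * DEFECT
        v_keep = KEEP
        if green + red == N_TILES:        # grid exhausted: must decide
            return max(v_invest, v_keep)
        # Sampling pays cost c now and transitions to one of two states.
        v_sample = -c + p * value(green + 1, red) + (1 - p) * value(green, red + 1)
        return max(v_invest, v_keep, v_sample)
    return value

v = optimal_value()
# The option to keep EUR 6 bounds the state value from below,
# and an all-green history is worth more than an all-red one.
assert v(0, 0) >= KEEP
assert v(5, 0) > v(0, 5)
```

With memoization there are only ~350 reachable states, which is why exact dynamic programming is feasible for the analyst even if it is implausibly expensive as a cognitive process.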

We refer to the second model as the *Sample cost* model. It is highly similar to the CNI model but more parsimonious in the sense that the agent simply treats all conditions as having an immediate subjective sampling cost, *c*, that differs in weight across the four conditions. As for the CNI model, we additionally tested the improvement in model fit when free parameters for a subjective prior belief and betrayal aversion were added (Supplement). While computing the value of information for every possible state in the task is highly precise, forward-reasoning models like the CNI and Sample cost models are likely too computationally expensive to be implemented as cognitive models. The brain might instead use a “good enough”, simpler heuristic strategy^{11,12}. We therefore also examined the model fits of such computationally simpler strategies.

Our third model, the *Uncertainty model*, reflects such a simpler, heuristic strategy. Like the CNI and Sample cost models, the Uncertainty model uses a belief distribution over trustworthiness, which is updated with each sample (Fig. 2). In the Uncertainty model, the agent continues sampling until the uncertainty about trustworthiness, as measured by the standard deviation of the posterior belief distribution, drops below a level that they find tolerable. We refer to this uncertainty tolerance level as the criterion, *k*, a free parameter per condition. Note that uncertainty reduces faster when sample outcomes are consistent than when they are inconsistent (Fig. 2). We again tested the improvement in model fit when a free parameter for a subjective prior belief was added. Here, the behavioural effect of betrayal aversion can be captured by the combination of the subjective prior and the criterion parameter *k*; therefore, no additional betrayal aversion parameter was added.
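The stopping rule can be sketched directly, assuming a conjugate Beta posterior over *r* (the natural choice for Bernoulli outcomes); the prior parameters and the criterion value below are illustrative, not fitted estimates.

```python
import math

def posterior_sd(greens, reds, prior_a=1.0, prior_b=1.0):
    """Standard deviation of the Beta(prior_a + greens, prior_b + reds)
    posterior over the reciprocation probability r."""
    a, b = prior_a + greens, prior_b + reds
    var = a * b / ((a + b) ** 2 * (a + b + 1))
    return math.sqrt(var)

def keeps_sampling(greens, reds, k):
    """The agent samples while posterior uncertainty exceeds criterion k."""
    return posterior_sd(greens, reds) > k

# Consistent outcomes shrink uncertainty faster than mixed ones:
assert posterior_sd(6, 0) < posterior_sd(3, 3)
# So with criterion k = 0.15, six consistent samples suffice to stop,
# while six mixed samples do not.
assert not keeps_sampling(6, 0, k=0.15)
assert keeps_sampling(3, 3, k=0.15)
```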

Fourth, the *Threshold model* has an intuition similar to that of the standard Drift Diffusion Model^{13}. Here, we consider the hypothesis that people do not use Bayesian posterior beliefs but instead maintain criteria for when they view a trustee’s behaviour as trustworthy or untrustworthy, and sample until the evidence meets one of those criteria. This requires keeping track of the sample outcomes in favour of investing and not investing. The decision to stop sampling information is then determined by whether their difference is sufficiently large, i.e., when the difference reaches a bound *b*. We allowed the bound to vary between the four conditions. It should be noted that the Threshold model is not equivalent to the DDMs that are often used in perception studies^{13} for the following reasons: First, in perception studies, the noise is typically Gaussian internal noise, whereas in our study, it is Bernoulli noise associated with past investment outcomes. Second, in perceptual applications, the time scale is hundreds of milliseconds to seconds, whereas here, accumulation takes place over a much longer time scale (tens of seconds). Finally, accumulation of evidence in regular DDMs is passive, whereas the Threshold model describes a process in which the agent decides at every time step. Based on the DDM literature, we tested versions with asymmetric bounds and collapsing bounds (Supplement).
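The Threshold model's accumulator can be sketched as follows; this is a minimal illustration with fixed symmetric bounds, omitting the asymmetric and collapsing bound variants tested in the Supplement.

```python
def threshold_stop(outcomes, b):
    """Return the index at which sampling stops under the Threshold
    model, or None if the green-minus-red evidence difference never
    reaches the bound b.

    outcomes: iterable of booleans (True = green tile, False = red).
    """
    diff = 0
    for i, green in enumerate(outcomes):
        diff += 1 if green else -1      # tally evidence for investing
        if abs(diff) >= b:              # either bound triggers a decision
            return i
    return None

# Consistent evidence hits a bound of 3 on the third sample (index 2);
# perfectly alternating evidence never accumulates enough to stop.
assert threshold_stop([True, True, True, True], b=3) == 2
assert threshold_stop([True, False] * 12, b=3) is None
```

Note that, unlike the Uncertainty model, this rule never consults a posterior distribution: only the running evidence difference matters.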

### Computational modelling results

We compared the best-fitting version of each model. For the CNI and Sample cost models, this version included a risk attitude term and a prior belief; for the Uncertainty model, a prior belief; and for the Threshold model, collapsing bounds. The models were fitted to the data at the individual level by maximizing the log likelihood using the fmincon routine in MATLAB (MathWorks). The optimization was repeated 100 times with varying initializations to avoid local minima. Model recovery showed that the models were distinguishable (Table S3).

Overall, the Uncertainty model fitted the data best (Fig. 3; also see Supplement Table S2). The Threshold model fitted worse than all other models. Moreover, to test whether different individuals follow different models, we used Bayesian Model Selection^{14,15} (Table 1). This returned strong evidence in favour of the Uncertainty model as the most likely model in the population. It suggests that people use a heuristic rather than a normative, forward-reasoning model for their decisions to sample. In addition, the better fit of the Uncertainty model over the heuristic Threshold model suggests that people use a posterior distribution over *r*, rather than a non-Bayesian proxy, in their sampling decisions.

Next, we examined the parameter estimates of the winning Uncertainty model. These showed differences between conditions. Specifically, the criterion estimates confirmed that people were more tolerant of uncertainty when the trustee was informed of the sampling and when sampling was monetarily costly (Wilcoxon signed-rank tests, all *p* < 0.005, see Table S3). The comparison between informed and uninformed trustees reached significance when sampling was monetarily cost-free but not when it was costly. On average, people maintained prior means that were slightly negatively biased (median prior estimate = 0.475; Wilcoxon signed-rank test: Z = -2.738, *p* = 0.006). This bias allowed the model to account for the empirical finding that people sampled more when the generative *r* was larger than 0.5 than when it was smaller than 0.5.

In sum, the model comparisons demonstrated that people sample until uncertainty drops below a subjective uncertainty criterion, where the uncertainty is derived from the posterior belief distribution over reciprocation probabilities. Interestingly, the Threshold model fitted least well, even when using collapsing or asymmetric bounds. This supports our interpretation that people use a posterior distribution for their sampling decisions, because the Threshold models represent simple heuristics similar to the Uncertainty model but without a posterior belief distribution.

We further examined whether the winning Uncertainty model could also predict trust decisions (after sampling had concluded) by using its parameter estimates. We fitted the expected utilities that resulted from the Uncertainty model to the trust decisions, allowing for a bias and a decision noise temperature (Eq. 9). This showed that the Uncertainty model significantly predicted the probability of trusting (*p* < 0.001, Nagelkerke pseudo-R^{2} = 0.882). However, so did all other models (CNI model: *p* < 0.001, Nagelkerke pseudo-R^{2} = 0.643; Sample cost model: *p* < 0.001, Nagelkerke pseudo-R^{2} = 0.832; Threshold model: *p* < 0.001, Nagelkerke pseudo-R^{2} = 0.733), suggesting that all models could predict trust decisions after sampling had concluded.
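The mapping from expected utilities to the probability of trusting can be sketched as a logistic choice rule with a bias term and a noise temperature; the exact parameterisation of Eq. 9 may differ, and the names `tau` and `beta0` are illustrative.

```python
import math

def p_invest(eu_invest, eu_keep, tau=1.0, beta0=0.0):
    """Probability of investing as a logistic function of the expected
    utility difference, with bias beta0 and noise temperature tau."""
    return 1.0 / (1.0 + math.exp(-(beta0 + (eu_invest - eu_keep) / tau)))

# Investing becomes more likely as its expected utility exceeds keeping.
assert p_invest(12.0, 6.0) > 0.5 > p_invest(0.0, 6.0)
# A higher temperature pushes choices toward indifference (p -> 0.5).
assert abs(p_invest(12.0, 6.0, tau=10.0) - 0.5) < abs(p_invest(12.0, 6.0, tau=1.0) - 0.5)
```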

### Study 2

To test the robustness of our findings under variations of the distribution of the reciprocation probability, we conducted a second, independent study (*n* = 75) with biased generative distributions of *r*. The experimental procedure was identical to study 1, except that participants sampled over either positively biased (*r* = 0.2, 0.4, 0.6, 0.8, and 1.0) or negatively biased (*r* = 0.0, 0.2, 0.4, 0.6, and 0.8) generative probabilities of reciprocation. We examined the number of samples as a function of bias, monetary cost, social context, outcome uncertainty, and reciprocation probability in a mixed-effects model (see Supplement for the model specification and full results table). We replicated all effects of study 1. However, here the effect of social context was present when sampling was cost-free (β = 0.200 ± 0.016, *p* < 0.001) but not significant when it was costly (β = 0.020 ± 0.016, *p* = 0.225; Fig. 4C,D). The subjective reports in study 2 also replicated the pattern found in study 1 (Fig. 4, Table S6). For the computational models, the within-model comparisons replicated for all models except the Threshold model, for which asymmetric bounds improved the model fit (Supplement). The between-group variational Bayesian analysis showed that the winning model did not differ between the positively and negatively biased groups (probability that the two groups have the same model frequencies = 0.930). Between-model comparisons suggested that the Uncertainty model performed best (Fig. 4A,B; also see Table S7), and the uncertainty tolerance parameter estimates varied as a function of condition, further replicating the results of study 1 (Table S8).

We found the following expected frequencies of the models in the population: 0.15 for the CNI model, 0.233 for the Sample cost model, 0.533 for the Uncertainty model, and 0.08 for the Threshold model. The exceedance probabilities were 0.00 for the CNI model, 0.00 for the Sample cost model, 0.99 for the Uncertainty model, and 0.00 for the Threshold model. This again shows that the Uncertainty model is more likely than the other three models. The fitted models predicted the trust decisions after sampling had concluded: Uncertainty model (*p* < 0.001, Nagelkerke pseudo-R^{2} = 0.64); CNI model (*p* < 0.001, Nagelkerke pseudo-R^{2} = 0.40); Sample cost model (*p* < 0.001, Nagelkerke pseudo-R^{2} = 0.15); Threshold model (*p* < 0.001, Nagelkerke pseudo-R^{2} = 0.69).