(C) PLOS One

This story was originally published by PLOS One and is unaltered.

The neurocognitive role of working memory load when Pavlovian motivational control affects instrumental learning [1]

Heesun Park, Hoyoung Doh, Eunhwi Lee, Harhim Park, Woo-Young Ahn (Department of Psychology, Seoul National University, Seoul; Department of Brain and Cognitive Sciences)

Date: 2023-12

The participants (N = 56) underwent fMRI scanning while performing an instrumental learning task under a control condition and a WM load condition (Fig 1). In the control condition, they performed the orthogonalized go/no-go (GNG) task [45], a learning task that contains Pavlovian–instrumental conflicts. In the WM load condition, a 2-back task was added to the GNG task; the modified task was named the working memory go/no-go (WMGNG) task (see Materials and Methods for more detail).

(A) In both tasks, four fractal cues indicated the combination of action (go/no-go) and valence at the outcome (win/loss). (B) In each trial, a fractal cue was presented, followed by a variable delay. After the delay, a circle prompted the response, and participants had to decide whether to press a button. After an additional brief delay, the probabilistic outcome was presented, indicating monetary reward (green upward arrow on a ₩1000 bill) or monetary punishment (red downward arrow on a ₩1000 bill). A yellow horizontal bar indicated no win or loss. In the WMGNG task, the original GNG trial was followed by 2-back response and 2-back outcome phases. (C) The participants were asked to indicate whether the cue in the current trial was identical to the cue presented two trials earlier. Here, because the cue in trial 3 differed from the cue in trial 1, “DIFF” was the correct response. Similarly, because the cue in trial 4 was identical to the cue in trial 2, “SAME” was the correct response. The lines mark two cues for comparison: the purple line indicates that the cues differ, while the pink line indicates that the cues are identical.

To test the hypothesis that WM load would increase Pavlovian bias (Fig 2D), we quantified Pavlovian bias by subtracting the accuracy in Pavlovian-incongruent conditions (“no-go to win” and “go to avoid losing”) from the accuracy in Pavlovian-congruent conditions (“go to win” and “no-go to avoid losing”). No significant difference in Pavlovian bias was observed between the GNG (M = 0.32, SD = 0.47) and WMGNG (M = 0.32, SD = 0.40) tasks (paired t-test, t(48) = -0.02, p = 0.986, d = 0.00). However, participants could have been slower to learn the Pavlovian cue-outcome associations under WM load, just as instrumental learning became slower under WM load. We therefore examined whether participants expressed similar levels of Pavlovian bias in the GNG and WMGNG tasks once they had learned the cue-outcome associations in both tasks. To this end, we plotted the temporal development of Pavlovian bias across trials (S2 Fig). The peak was delayed in the WMGNG task compared to the GNG task, which suggests that acquiring Pavlovian associations could have taken more time under WM load. Nonetheless, we observed similar levels of Pavlovian bias in both tasks after the initial peak. Thus, we concluded that our data do not show noticeable changes in Pavlovian bias under WM load.
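
As a minimal sketch of the bias measure defined above, the following Python snippet computes Pavlovian bias from the four condition accuracies. The function name and dictionary keys are hypothetical. The inputs are the group-mean accuracies reported in this section; because the measure is a linear combination, these group means should reproduce the reported mean bias of 0.32 in both tasks.

```python
def pavlovian_bias(acc):
    """Congruent accuracy minus incongruent accuracy (sums, as in Fig 2D)."""
    congruent = acc["go_to_win"] + acc["nogo_to_avoid_losing"]
    incongruent = acc["nogo_to_win"] + acc["go_to_avoid_losing"]
    return congruent - incongruent


# Group-mean accuracies as reported in the text (key names are illustrative).
GNG = {
    "go_to_win": 0.92,
    "nogo_to_win": 0.69,
    "nogo_to_avoid_losing": 0.85,
    "go_to_avoid_losing": 0.76,
}
WMGNG = {
    "go_to_win": 0.82,
    "nogo_to_win": 0.57,
    "nogo_to_avoid_losing": 0.79,
    "go_to_avoid_losing": 0.72,
}

print(round(pavlovian_bias(GNG), 2))    # 0.32
print(round(pavlovian_bias(WMGNG), 2))  # 0.32
```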

Next, we tested the hypothesis that WM load would decrease learning speed (Fig 2C). While the learning curves indicated that participants learned during both tasks, the learning curve was shallower in the WMGNG task than in the GNG task (i.e., WM load reduced learning speed and overall accuracy).

We also confirmed that participants exhibited go bias and Pavlovian bias in both tasks, thus replicating the findings of earlier studies [45,52,56–62]. Two-way ANOVA on accuracy, with the factors action (go/no-go) and valence (reward/punishment) as repeated measures for both tasks, revealed a main effect of action (F(48) = 6.05, p = 0.018, η² = 0.03 in the GNG task, F(48) = 9.44, p = 0.003, η² = 0.04 in the WMGNG task) and an action by valence interaction (F(48) = 22.43, p<0.001, η² = 0.12 in the GNG task, F(48) = 30.59, p<0.001, η² = 0.10 in the WMGNG task); it showed no effect of valence (F(48) = 0.00, p = 0.99, η² = 0.00 in the GNG task, F(48) = 2.77, p = 0.103, η² = 0.01 in the WMGNG task). In both tasks (Fig 2B), participants performed better in the “go to win” and “no-go to avoid losing” conditions (i.e., Pavlovian-congruent conditions; blue columns) than in the “no-go to win” and “go to avoid losing” conditions (i.e., Pavlovian-incongruent conditions; red columns). Specifically, in the GNG task, accuracy was higher in the “go to win” (M = 0.92, SD = 0.12) than in the “no-go to win” condition (M = 0.69, SD = 0.35) (paired t-test, t(48) = 4.13, p<0.001, d = 0.59), and in the “no-go to avoid losing” (M = 0.85, SD = 0.13) than in the “go to avoid losing” condition (M = 0.76, SD = 0.18) (paired t-test, t(48) = 3.29, p = 0.002, d = 0.47). Similarly, in the WMGNG task, accuracy was higher in the “go to win” (M = 0.82, SD = 0.25) than in the “no-go to win” condition (M = 0.57, SD = 0.34) (paired t-test, t(48) = 4.82, p<0.001, d = 0.69), and in the “no-go to avoid losing” (M = 0.79, SD = 0.16) than in the “go to avoid losing” condition (M = 0.72, SD = 0.19) (paired t-test, t(48) = 2.51, p = 0.015, d = 0.36).

(A) Task accuracies (mean percentages of correct responses) in the GNG and WMGNG tasks show that participants performed better in the GNG task than in the WMGNG task. (B) Accuracy in each of the four trial types between the two tasks demonstrated that participants performed better in “go to win” and “no-go to avoid losing” trials (Pavlovian-congruent, blue) than in “no-go to win” and “go to avoid losing” trials (Pavlovian-incongruent, red). (C) The learning curve (i.e., the increase in accuracy across trials) was shallower in the WMGNG task than in the GNG task. Note that moving average smoothing was applied with filter size 5 to remove the fine variation between time steps. Lines indicate group means and ribbons indicate means ± standard errors of the means. (D) Pavlovian bias was calculated by subtracting accuracy in Pavlovian-incongruent conditions (“no-go to win” + “go to avoid losing”) from accuracy in Pavlovian-congruent conditions (“go to win” + “no-go to avoid losing”). No significant difference in Pavlovian bias was observed between the GNG and WMGNG tasks. (A)-(B), (D) Black dots indicate group means and error bars indicate means ± standard errors of the means. Gray dots indicate individual accuracies; lines connect a single participant’s performances. Asterisks indicate the results of pairwise t-tests. **** p < 0.0001, *** p < 0.001, ** p < 0.01, * p < 0.05.
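
The moving-average smoothing with filter size 5 mentioned in panel (C) can be sketched as follows. The caption does not say how the ends of the series were handled, so this version assumes that only full windows are averaged (the output is shorter than the input by four points); the function name and the toy input are hypothetical.

```python
def moving_average(values, window=5):
    """Average each run of `window` consecutive values (full windows only)."""
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


# Toy trial-by-trial accuracy series (1 = correct, 0 = incorrect).
trial_accuracy = [0, 1, 0, 1, 0, 1, 0]
print(moving_average(trial_accuracy))  # [0.4, 0.6, 0.4]
```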

Imposing extra WM load with a 2-back task led to a decrease in task accuracy. Participants performed better in the GNG task (M = 0.80, SD = 0.12) than in the WMGNG task (M = 0.72, SD = 0.16), as illustrated in Fig 2A (paired t-test, t(48) = 3.86, p<0.001, d = 0.55). Participants’ performance decreased both in the Pavlovian-congruent (“go to win” and “no-go to avoid losing”) and Pavlovian-incongruent (“no-go to win” and “go to avoid losing”) conditions (S1A Fig).
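
The effect sizes reported with the paired t-tests in this section are numerically consistent with d = |t| / sqrt(N) for N = 49 (df = 48). The text does not state how d was computed, so treating it this way is an assumption. A small sketch of that check, with a hypothetical function name:

```python
import math


def paired_effect_size(t_stat, n):
    """Effect size for a paired t-test, assuming d = |t| / sqrt(n)."""
    return abs(t_stat) / math.sqrt(n)


# t(48) = 3.86 for GNG vs WMGNG accuracy, so n = 49 participants.
print(round(paired_effect_size(3.86, 49), 2))   # 0.55
# t(48) = -0.02 for the Pavlovian bias comparison.
print(round(paired_effect_size(-0.02, 49), 2))  # 0.0
```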

Computational modeling: WM load influences learning rate and irreducible noise

We used a computational modeling approach to test the three hypotheses. For this purpose, we developed eight nested models that assumed different learning rate, Pavlovian bias, or irreducible noise parameters under WM load. These models were fitted to the data using hierarchical Bayesian analysis, then compared using the leave-one-out information criterion (LOOIC), where a lower LOOIC value indicates better out-of-sample predictive accuracy (i.e., better fit) [63]. Importantly, the use of computational modeling allowed us to test our hypothesis that WM load would increase random choices; this would not have been possible if we had performed behavioral analysis alone.

Based on earlier studies [45,64], we constructed a baseline model (model 1) that used a Rescorla-Wagner updating rule and contained learning rate (ε), Pavlovian bias, irreducible noise, go bias, and separate parameters for sensitivity to rewards and punishments (Materials and Methods). In the model, state-action values are updated with the prediction error; learning rate (ε) modulates the impact of the prediction error. Reward/punishment sensitivity (ρ) scales the effective size of outcome values. Go bias (b) and cue values weighted by Pavlovian bias (π) are added to the value of go choices. Here, as the Pavlovian bias parameter increases, the go tendency increases under the reward condition whereas the go tendency is reduced under the punishment condition; this results in an increased no-go tendency. Computed action weights are used to estimate action probabilities, and irreducible noise (ξ) determines the extent to which information about action weights is utilized to make decisions. As irreducible noise increases, action probabilities will be less reflective of action weights, indicating that action selection will become more random.
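
The verbal description of the baseline model can be sketched in code. The exact equations are given in Materials and Methods, which is not part of this excerpt, so the sketch below assumes the formulation commonly used for the orthogonalized go/no-go task: a Rescorla-Wagner update on sensitivity-scaled outcomes, a go weight that adds the go bias and the Pavlovian-weighted cue value, and a softmax whose output is mixed with a uniform choice according to the irreducible noise. Function and argument names are hypothetical.

```python
import math


def rw_update(value, outcome, epsilon, rho):
    """Rescorla-Wagner step: move `value` toward the scaled outcome.

    epsilon: learning rate; rho: reward/punishment sensitivity.
    """
    prediction_error = rho * outcome - value
    return value + epsilon * prediction_error


def p_go(q_go, q_nogo, cue_value, go_bias, pav_bias, noise):
    """Probability of a go response under the assumed model form."""
    # Go bias (b) and Pavlovian-weighted cue value (pi * V) add to the go weight.
    w_go = q_go + go_bias + pav_bias * cue_value
    w_nogo = q_nogo
    softmax_go = 1.0 / (1.0 + math.exp(-(w_go - w_nogo)))
    # Irreducible noise (xi) mixes the softmax with a 50/50 random choice.
    return softmax_go * (1.0 - noise) + noise / 2.0


# A positive cue value (reward cue) raises the go probability ...
print(p_go(0.0, 0.0, cue_value=1.0, go_bias=0.0, pav_bias=1.0, noise=0.0))
# ... and a negative cue value (punishment cue) lowers it.
print(p_go(0.0, 0.0, cue_value=-1.0, go_bias=0.0, pav_bias=1.0, noise=0.0))
```

With noise set to 1 the go probability is 0.5 regardless of the action weights, which is the sense in which higher irreducible noise makes action selection more random.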

In models 2, 3, and 4, we assumed that WM load affects only one parameter. For example, in model 2, a separate Pavlovian bias parameter (π_wm) was assumed for the WM load condition. Models 3 and 4 assumed a separate learning rate (ε_wm) and a separate irreducible noise parameter (ξ_wm), respectively, for the WM load condition. In models 5, 6, and 7, we assumed that WM load would affect two parameters: model 5 had different Pavlovian bias (π_wm) and learning rate (ε_wm); model 6 had different Pavlovian bias (π_wm) and irreducible noise (ξ_wm); and model 7 had different learning rate (ε_wm) and irreducible noise (ξ_wm). Finally, model 8 was the full model, in which all three parameters were assumed to be affected by WM load.
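
The eight models correspond to every subset of the three parameters that may differ under WM load. As an illustration only (the parameter labels and function name are hypothetical), the model space can be enumerated so that the numbering matches the description above:

```python
from itertools import combinations

# Parameters that may take a separate value in the WM load condition.
WM_PARAMS = ("pi", "epsilon", "xi")  # Pavlovian bias, learning rate, noise


def enumerate_models():
    """Map model number -> tuple of parameters given a separate WM value."""
    models = {}
    number = 1
    for size in range(len(WM_PARAMS) + 1):
        for subset in combinations(WM_PARAMS, size):
            models[number] = subset
            number += 1
    return models


for number, subset in enumerate_models().items():
    print(number, subset if subset else "baseline (no WM-specific parameter)")
```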

The full model (model 8) was the best model (Fig 3A and S2 Table). In other words, participant behavior was best explained when separate Pavlovian bias, learning rate, and irreducible noise parameters were included for the two tasks. Next, we analyzed the parameter estimates of the best-fitting model; we focused on comparing the posterior distributions of the parameters that were separately fitted in the two tasks (Fig 3B). The parameters were considered credibly different from each other if the 95% highest density intervals (HDI) of the two distributions showed no overlap [65]. Fig 3B illustrates that Pavlovian bias was not credibly different between the two tasks, consistent with our behavioral results that failed to show a change in Pavlovian bias under WM load. Conversely, the learning rate was credibly lower, while irreducible noise was credibly greater, in the WMGNG task than in the GNG task. These results support our hypotheses that WM load would reduce learning rate and that it would increase random choices. Although the best model was the full model, which assumed separate Pavlovian bias parameters in the two tasks, no credible group-level difference was observed between these parameters. This is presumably because the full model was able to capture individual variations among participants (S5 Fig), despite the lack of credible difference in the group-level estimates between the two tasks. As expected, the 95% HDIs of go bias, reward sensitivity, and punishment sensitivity did not include zero, indicating that the participants exhibited go bias and reward/punishment sensitivity (see Supporting Information for the posterior distributions of individual parameters; S4–S7 Figs).
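
The non-overlap criterion can be illustrated with a short sketch. It assumes the HDI is estimated from posterior samples as the narrowest interval containing 95% of them, which is a common approach but is not confirmed by the text; the function names and the simulated samples are hypothetical and are not the study's posteriors.

```python
import math
import random


def hdi(samples, mass=0.95):
    """Narrowest interval containing `mass` of the samples."""
    ordered = sorted(samples)
    n = len(ordered)
    k = min(n, max(1, math.ceil(mass * n)))  # samples inside the interval
    start = min(range(n - k + 1), key=lambda i: ordered[i + k - 1] - ordered[i])
    return ordered[start], ordered[start + k - 1]


def credibly_different(samples_a, samples_b, mass=0.95):
    """True when the two HDIs do not overlap."""
    lo_a, hi_a = hdi(samples_a, mass)
    lo_b, hi_b = hdi(samples_b, mass)
    return hi_a < lo_b or hi_b < lo_a


# Simulated posterior draws for illustration only.
rng = random.Random(0)
param_gng = [rng.gauss(0.30, 0.02) for _ in range(4000)]
param_wmgng = [rng.gauss(0.10, 0.02) for _ in range(4000)]
print(hdi(param_gng), hdi(param_wmgng))
print(credibly_different(param_gng, param_wmgng))
```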

Fig 3. Model comparison results and posterior distribution of the group-level parameters of the best-fitting model (N = 49). (A) Relative LOOIC difference indicates the difference in LOOIC between the best-fitting model and each of the other models. The best-fitting model was the full model, which assumed separate Pavlovian bias, learning rate, and irreducible noise in GNG and WMGNG tasks. Lower LOOIC indicates better model fit. (B) Posterior distributions of group-level parameters from the best-fitting model. Learning rate and irreducible noise estimates were credibly different in the GNG and WMGNG tasks, while Pavlovian bias estimates were not. Dots indicate medians and bars indicate 95% HDIs. Asterisks indicate that the 95% HDIs of the two parameters’ posterior distributions do not overlap (i.e., differences are credible). https://doi.org/10.1371/journal.pcbi.1011692.g003

To further compare choice randomness between the two tasks, we examined the extent to which choices were dependent on value discrepancies between the two options. We first plotted the percentage of go choices for the GNG and WMGNG tasks by varying the quantiles of differences in action weight between the “go” and “no-go” actions (W_go − W_nogo) (Fig 4A). The trial-by-trial action weights were extracted from the best-fitting model. Higher quantiles corresponded to a greater “go” action weight than “no-go” action weight. Overall, the go ratio increased from the first to the tenth quantile, indicating that the value differences between the “go” and “no-go” actions affected participants’ choices. This result further illustrates the difference between the two tasks: the increase in the go ratio was steeper in the GNG task than in the WMGNG task. In particular, the go ratio significantly differed between the two tasks for the first (t(48) = -3.59, p = 0.001, d = 0.51), second (t(48) = -3.23, p = 0.002, d = 0.46), third (t(48) = -2.55, p = 0.014, d = 0.36), eighth (t(48) = 2.95, p = 0.005, d = 0.42), and tenth (t(48) = 2.76, p = 0.008, d = 0.39) quantiles. Thus, under WM load, participants’ choices were less sensitive to large value differences between “go” and “no-go”.
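
The quantile analysis can be sketched as follows: trials are sorted by the action weight difference, split into ten equal-sized bins, and the proportion of go responses is computed per bin. Whether the binning was done within each participant or on pooled trials is not stated here, so this sketch simply bins one list of trials; the function name and the toy data are hypothetical.

```python
def go_ratio_by_quantile(weight_diffs, went, n_bins=10):
    """Proportion of go choices within each quantile bin of W_go - W_nogo.

    weight_diffs: per-trial action weight difference (W_go - W_nogo).
    went: per-trial response, 1 for go and 0 for no-go.
    """
    if len(weight_diffs) != len(went):
        raise ValueError("inputs must have the same length")
    n = len(weight_diffs)
    if n < n_bins:
        raise ValueError("need at least one trial per bin")
    # Trial indices ordered from most no-go-favoring to most go-favoring.
    order = sorted(range(n), key=lambda i: weight_diffs[i])
    ratios = []
    for q in range(n_bins):
        members = order[q * n // n_bins:(q + 1) * n // n_bins]
        ratios.append(sum(went[i] for i in members) / len(members))
    return ratios


# Toy data: perfectly value-driven choices give a step from 0 to 1.
diffs = [-1.0 + 0.1 * i for i in range(20)]
choices = [0] * 10 + [1] * 10
print(go_ratio_by_quantile(diffs, choices))
```

Replacing the weight difference with its absolute value and the go indicator with a correct-response indicator gives the same kind of curve for accuracy against choice difficulty. A flatter curve across bins corresponds to choices that depend less on the value difference.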

Fig 4. Choice randomness (N = 49). (A) Mean percentage of go choices for different quantiles of action weight differences (W_go − W_nogo) between “go” and “no-go” choices, where higher quantiles indicate higher decision values for “go” choices. Under WM load, the increase in go ratio according to quantile was less steep. (B) Mean accuracies for different quantiles of absolute value differences (|W_go − W_nogo|), where higher quantiles indicate larger value differences between two options or easier choices. Under WM load, the increase in accuracy according to quantile was less steep. (A)-(B) Dots are group means, and error bars are means ± standard errors of the means. Asterisks show the results of pairwise t-tests. **** p < 0.0001, *** p < 0.001, ** p < 0.01, * p < 0.05. https://doi.org/10.1371/journal.pcbi.1011692.g004

To compare these patterns in a different way and further explore the extent to which performance was dependent on choice difficulty, we plotted accuracies for the two tasks and for different quantiles of the absolute value differences (|W_go − W_nogo|; Fig 4B). We assumed that the choices would become easier when the absolute value difference was increased because a small value difference makes it difficult to choose between options. Overall, the accuracy increased from the first to the tenth quantile, indicating that participants performed better as the choices became easier. This result further illustrates the difference between the two tasks: the increase in accuracy was steeper in the GNG task than in the WMGNG task. Specifically, the accuracy significantly differed between the two tasks for the fifth (t(48) = 4.12, p<0.001, d = 0.59), sixth (t(48) = 2.95, p = 0.005, d = 0.42), seventh (t(48) = 2.44, p = 0.018, d = 0.35), eighth (t(48) = 3.13, p = 0.003, d = 0.45), ninth (t(48) = 2.87, p = 0.006, d = 0.41), and tenth (t(48) = 2.55, p = 0.014, d = 0.36) quantiles. Thus, participants performed worse in the WM load condition than in the control condition when choices were easier. Overall, Fig 4 demonstrates that WM load reduced the effect of the value difference on participants, indicating increased choice randomness.

Next, we examined if our model predicts the observed decrease in task performance under WM load both in the Pavlovian-congruent and Pavlovian-incongruent conditions. S1B Fig shows that our model indeed predicts the lower task performance in both types of conditions. This result is in line with our result that choice randomness increased under WM load. We can expect that an increase in randomness would result in a lower accuracy unless the accuracy was below the chance level in the first place. While the accuracy decrease could also be associated with the lower learning rate under WM load, we need to be cautious in this interpretation because lower learning rates might instead increase the accuracy by making learning more robust against noise.

---
[1] Url: https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1011692

Published and (C) by PLOS One
Content appears here under this condition or license: Creative Commons - Attribution BY 4.0.