(C) PLOS One

This story was originally published by PLOS One and is unaltered.

Attention promotes the neural encoding of prediction errors [1]

['Cooper A. Smout', 'Queensland Brain Institute', 'University Of Queensland', 'Brisbane', 'Australian Research Council Centre Of Excellence For Integrative Brain Function', 'Victoria', 'Matthew F. Tang', 'Marta I. Garrido', 'School Of Mathematics', 'Physics']

Date: 2023-06

Abstract The encoding of sensory information in the human brain is thought to be optimised by two principal processes: ‘prediction’ uses stored information to guide the interpretation of forthcoming sensory events, and ‘attention’ prioritizes these events according to their behavioural relevance. Despite the ubiquitous contributions of attention and prediction to various aspects of perception and cognition, it remains unknown how they interact to modulate information processing in the brain. A recent extension of predictive coding theory suggests that attention optimises the expected precision of predictions by modulating the synaptic gain of prediction error units. Because prediction errors code for the difference between predictions and sensory signals, this model would suggest that attention increases the selectivity for mismatch information in the neural response to a surprising stimulus. Alternative predictive coding models propose that attention increases the activity of prediction (or ‘representation’) neurons and would therefore suggest that attention and prediction synergistically modulate selectivity for ‘feature information’ in the brain. Here, we applied forward encoding models to neural activity recorded via electroencephalography (EEG) as human observers performed a simple visual task to test for the effect of attention on both mismatch and feature information in the neural response to surprising stimuli. Participants attended or ignored a periodic stream of gratings, the orientations of which could be either predictable, surprising, or unpredictable. We found that surprising stimuli evoked neural responses that were encoded according to the difference between predicted and observed stimulus features, and that attention facilitated the encoding of this type of information in the brain. 
These findings advance our understanding of how attention and prediction modulate information processing in the brain, as well as support the theory that attention optimises precision expectations during hierarchical inference by increasing the gain of prediction errors.

Author summary The human brain is theorised to operate like a sophisticated hypothesis tester, using past experience to generate a model of the external world, testing predictions of this model against incoming sensory evidence, and generating a ‘prediction error’ signal that updates the model when predictions and sensory evidence do not match. In addition to predicting the content of sensory signals, an optimal system should also predict the reliability (or ‘precision’) of those signals to minimise the influence of unreliable sensory information. It has been proposed that attention optimises this process by boosting prediction error signals, which are coded as the difference (or ‘mismatch’) between predicted and observed stimulus features. Accordingly, this theory predicts that attention should increase the selectivity for mismatch information in the neural response to surprising stimuli. We tested this hypothesis in human participants by training a decoding algorithm to identify ‘mismatch information’ in the brain, recorded by electroencephalography (EEG), following the presentation of surprising stimuli that were either attended or ignored. We found that attention did indeed increase the selectivity for mismatch information in the neural response, supporting the notion that attention and prediction are intricately related processes.

Citation: Smout CA, Tang MF, Garrido MI, Mattingley JB (2019) Attention promotes the neural encoding of prediction errors. PLoS Biol 17(2): e2006812. https://doi.org/10.1371/journal.pbio.2006812 Academic Editor: Ben Seymour, University of Cambridge, United Kingdom of Great Britain and Northern Ireland Received: May 29, 2018; Accepted: February 5, 2019; Published: February 27, 2019 Copyright: © 2019 Smout et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. Data Availability: All data and scripts used in this study are available from the Open Science Framework (https://doi.org/10.17605/OSF.IO/A3PFQ). Funding: Australian Research Council Centre of Excellence for Integrative Brain Function https://www.cibf.edu.au/ (grant number CE140100007). Received by JBM and MIG. The funder had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript. Australian Research Council Australian Laureate Fellowship http://www.arc.gov.au/australian-laureate-fellowships (grant number FL110100103). Received by JBM. The funder had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript. University of Queensland Fellowship https://research.uq.edu.au/research-support/research-management/funding-schemes/uq-internal-initiatives/uq-development-fellowships (grant number 2016000071). Received by MIG. The funder had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript. Competing interests: The authors have declared that no competing interests exist. 
Abbreviations: a.u., arbitrary units; BOLD, blood oxygen level dependent; EEG, electroencephalography; ERP, event-related potential; fMRI, functional MRI; ICA, independent components analysis; MMR, mismatch response; RSS, residual sum of squares

Introduction Perception is believed to arise from a process of active inference [1], during which the brain retrieves information from past experiences to build predictive models of likely future occurrences and compares these predictions with incoming sensory evidence [2,3]. In support of the idea that prediction increases the efficiency of neural encoding, previous studies have demonstrated that predicted visual events typically evoke smaller neural responses than surprising events (e.g., evoked activity measured in terms of changes in electrical potential or blood oxygen level dependent [BOLD] response; for a review, see [4]). Recent studies have shown that selective attention can increase [5] or reverse [6] the suppressive effect of prediction on neural activity, suggesting that attention and prediction facilitate perception [7] via synergistic modulation of bottom-up sensory signals [8–11]. It remains unclear, however, what type of information is modulated in the interaction between attention and prediction. This question is important because different predictive coding models make distinct predictions about how information is transmitted through the cortical hierarchy [3,8,12,13]. Here, we used forward encoding models to assess selectivity for two distinct types of information in the neural response to surprising stimuli—feature and mismatch information—and to test the effect of attention on these two informational codes. A prominent version of predictive coding theory claims that top-down prediction signals ‘cancel out’ bottom-up sensory signals that match the predicted content, leaving only the remaining prediction error to propagate forward and update a model of the sensory environment [2,8,9]. 
Because error propagation is thought to be associated with superficial pyramidal cells [9], and these cells are thought to be primarily responsible for generating EEG signals [14,15], this theory predicts that surprising events will increase the selectivity of EEG responses to the difference between predicted and observed stimulus features, i.e., mismatch information. Furthermore, a recent extension of this theory suggests that selective attention optimises the expected precision of predictions by modulating the synaptic gain (postsynaptic responsiveness) of prediction error units [8]—i.e., neurons coding for behaviourally relevant prediction errors should be more responsive than those coding for irrelevant prediction errors. On this account, attention should further increase selectivity for mismatch information in the neural response to surprising stimuli relative to unsurprising stimuli. Here, we call this account the ‘mismatch information model’. Alternative predictive coding models [12,13,16] propose that predictions—as opposed to prediction errors—are propagated forward through the visual hierarchy, and it is these prediction signals that are modulated by attention. For example, the model proposed by Spratling [12] simulates the common physiological finding that attention to a stimulus enhances the firing rate of neurons tuned to specific stimulus features (e.g., orientation or colour for visual neurons) and has been shown to be mathematically equivalent to the biased competition model of attention [17–20]. In line with these alternative models, we investigated a second hypothesis—here termed the ‘feature information model’—which proposes that the interaction between attention and prediction at the level of neural responses is driven by changes in feature-specific information in the brain. 
Here, we tested whether the feature information model or the mismatch information model provides a better account of the neural coding of surprising stimuli in the human brain and examined the influence of selective attention on each of these two neural codes. Participants attended to, or ignored, periodic streams of visual gratings, the orientations of which were either predictable, surprising, or unpredictable. We applied forward encoding models to whole-brain neural activity measured using EEG to quantify the neural selectivity for information related to the grating orientation and the mismatch between the predicted and observed grating orientations. We show that surprising stimuli evoke neural responses that contain information related to the difference between predicted and observed stimulus features, consistent with the mismatch information model. Crucially, we also find that attention increases the selectivity for mismatch information in the neural response to surprising stimuli, supporting the hypothesis that attention increases the gain of prediction errors [8].

Discussion Here we set out to determine what type of information is modulated in the interaction between attention and prediction [8]. To achieve this, we used forward encoding models of EEG data to quantify the selectivity for orientation and mismatch information in the neural responses to surprising and unpredictable stimuli in the well-established roving oddball paradigm [21,37]. Relative to unpredictable stimuli (controls), we found that EEG responses to surprising stimuli (deviants) were equally selective for orientation information, but more selective for information related to the difference between predicted and observed stimulus features. These results are consistent with the mismatch information model and support the idea that top-down prediction signals ‘cancel out’ matching bottom-up sensory signals and leave only the remaining prediction error to propagate forward [2,3,8,9]. Crucially, we also found that attention increased the selectivity for mismatch information in neural responses to surprising but not control stimuli. This finding demonstrates that attention boosts mismatch information evoked by surprising stimuli (putative prediction errors) and is consistent with a recent version of predictive coding theory that proposes attention optimises the expected precision of predictions by increasing the gain of prediction errors [8]. We found no difference between orientation response profiles evoked by surprising and unpredictable stimuli (a prediction of the feature information model), suggesting that the increase in EEG activity that is typically observed with surprise is not coded according to stimulus features. This finding contradicts predictive coding models in which predictions (or ‘representations’) of stimulus features are passed up the visual hierarchy [12,16,17]. 
Because feedforward connections originate primarily from superficial pyramidal cells and it is this activity that is measured with EEG [9,14,15], these models would predict that surprise changes the feature selectivity of EEG responses: a finding we do not observe here. This finding might also seem to contradict a recent study that demonstrated greater selectivity for orientation information in early visual cortex BOLD activity following presentation of a predicted grating, relative to a surprising grating [38]. Since BOLD activity indirectly measures the activity patterns of heterogeneous populations of neurons, however, this change in feature selectivity could have reflected a change in either of the two neuronal populations proposed to underlie predictive coding: predictions or prediction errors. The latter interpretation is inconsistent with the results of the present study, which suggest that prediction errors are encoded according to the mismatch between predicted and observed stimulus features, and not the features themselves. The former interpretation (i.e., that predictions are coded according to the stimulus features) fits well with a recent study that showed prediction induces feature-specific templates immediately prior to stimulus onset [31]. Thus, a parsimonious account of the literature to date suggests that predictions and prediction errors are represented in the brain via distinct neural codes: whereas predictions are represented according to stimulus features, prediction errors are represented according to the mismatch between predicted and observed stimulus features. In a recent study by our group [39], we observed a decrease in orientation selectivity in the neural response to predicted stimuli, relative to surprising stimuli, shortly after stimulus onset (79–185 ms). Here, we observed a similar (but nonsignificant) trend in the same direction (standards < deviants) at approximately the same time (94–145 ms, S2C Fig, cluster not shown).
Close inspection of the present results, however, suggests that some orientation information evoked by the previous standard was still present in the brain at the onset of the subsequent standard (indicated by the above-zero amplitude of the orientation response to standards at stimulus onset, t = 0 ms, S2C Fig), which may have obscured detection of the early effect reported in Tang and colleagues [39]. The present results revealed a late effect of prediction (standards < deviants, 324–550 ms, S2C and S2D Fig) that was not observed in our previous work [39]. Since a critical difference between the two studies was the number of times identical stimuli could be presented consecutively (no more than twice in the previous study), we speculate that the late effect observed here might reflect the minimal amount of model updating required after the presentation of a precisely predicted stimulus. We also found that attention increased the amplitude of orientation response profiles (Fig 3A and 3B), consistent with previous studies that applied forward encoding models to human functional MRI (fMRI) [34,40] and time-frequency-resolved EEG data [29]. The present study replicates and extends these studies with the application of forward encoding models to time-resolved EEG recordings (resulting in <30 ms temporal resolution after smoothing), demonstrating that attention increases feature selectivity in the human brain from approximately 200 ms after stimulus onset. Crucially, we also tested the interactive effects of attention and prediction on information processing in the brain. There was a large and significant effect of attention on mismatch response profiles in response to surprising but not unpredictable stimuli (beginning around 150 ms after stimulus onset and reaching significance from about 350 ms). 
This finding demonstrates that attention boosts mismatch prediction errors evoked by surprising stimuli and is consistent with a recent iteration of predictive coding theory according to which attention optimises the expected precision of prediction errors [8]. Previous studies have found evidence for an interaction between attention and prediction in both the auditory [5] and visual [6,41] modalities. These studies used activation-based analyses to compare differences between predicted and unpredicted stimuli at the level of overall neural activity but did not investigate what type of information is modulated in the interaction between attention and prediction. In contrast, the present study used information-based analyses [42] to identify specific patterns of neural activity that are associated with orientation-mismatch information in the brain, and showed that selectivity for this type of information (but not feature information) is increased with attention. Thus, the present study provides clear support for the hypothesis that attention boosts the gain of prediction errors [8]. It will be important for future research to investigate whether the interactive effects of attention and prediction on mismatch information are contingent on the type of attention (e.g., feature-based versus spatial attention) or prediction (e.g., rule-based versus multimodal cue-stimulus predictions; [31,43]). We found that the magnitude of mismatch response profiles correlated with the number of preceding standards (Fig 4A and 4B). Previous work in the auditory domain demonstrated that successive repetitions of the standard evoke progressively increased responses to a subsequent attended deviant [35]. Here, we find a corollary for this effect in the visual domain and demonstrate that the neural activity modulated by the number of preceding standards is likely encoded as mismatch information.
This finding is also consistent with the notion that repeating the standard allows a more precise prediction to be generated, which results in a larger prediction error to a subsequent surprising stimulus [44]. We also found that mismatch response profiles increased with the magnitude of the mismatch between predicted and observed stimulus features (Fig 4C). Previous work in the auditory domain has demonstrated a correlation between deviation magnitude and the amplitude of the neural response to deviants (i.e., the mismatch negativity) [45]. Here, we demonstrate a relationship between deviation magnitude and selectivity for mismatch information (as opposed to activation levels) in the visual domain, suggesting that the magnitude of mismatch information might be used by the brain to guide updating of the predictive model. Since the present study investigated mismatch signals with respect to a continuous and circular feature dimension (i.e., orientation), it will be important for future research to extend the current line of research to noncircular (e.g., luminance, auditory frequency) and categorical (e.g., facial emotions) feature dimensions. There was a lateral shift in the response profile of individual mismatch channels toward the orthogonal (90°) channel (Fig 4D). The extent of this effect depended on the deviation magnitude, with large deviations (±40°–80°) being predominantly stacked over the 90° channel and smaller deviations (±20°) being more closely aligned with their veridical mismatch angle (Fig 4D). We speculate that this might indicate a qualitative difference in the way that small and large prediction errors were treated by the brain in the present study. 
Small deviations may have resulted in updating and retention of the current model (via a near-veridical mismatch signal), whereas large deviations may have resulted in the wholesale rejection of the current model (via a generic mismatch signal) in favour of an alternative model that represents the deviant stimulus. In the latter case, the magnitude of the (orthogonal) mismatch channel response might represent an efficient code that the brain utilises to select from a number of likely alternative models. A number of recent studies failed to find an interaction between the effects of attention and prediction on stimulus information in the brain [31,38,46]. If predictions are encoded according to stimulus features, as we argue above, these null findings contradict the theory that attention boosts predictions [47]. In contrast, we show that prediction errors, represented according to the mismatch between predicted and observed stimulus features, are enhanced with attention. Although the present study cannot speak to the activity of single neurons, we note that the emerging picture is consistent with the notion that predictions and prediction errors are represented in distinct populations of neurons [2] that encode two distinct types of information and are differentially influenced by attention. Under this framework, feature information encoded by prediction units would be immune to attention, whereas mismatch information encoded by prediction error units would be enhanced by attention. Future research could test these hypotheses at the single-cell level, for example by using single-unit electrode recordings or 2-photon calcium imaging to assess whether different neurons within a given cortical area satisfy these constraints.

Methods Ethics statement The study was approved by The University of Queensland Human Research Ethics Committee (approval number: 2015001576) and was conducted in accordance with the Declaration of Helsinki. Participants provided informed written consent prior to commencement of the study. Participants Twenty-four healthy participants (11 female, 13 male, mean = 23.25 years, SD = 9.01 years, range: 18 to 64 years) with normal or corrected-to-normal vision were recruited via an online research participation scheme at The University of Queensland. Stimuli Stimuli were presented on a 61 cm LED monitor (Asus, VG248QE) with a 1,920 × 1,080 pixel resolution and refresh rate of 120 Hz, using the PsychToolbox presentation software [48] for Matlab (version 15b) running under Windows 7 with a NVidia Quadro K4000 graphics card. Participants were seated in a comfortable armchair in an electrically shielded laboratory, with the head supported by a chin rest at a viewing distance of 57 cm. During each block, 415 gratings with Gaussian edges (outer diameter: 11°; inner mask diameter: 0.83°; spatial frequency: 2.73 c/°; 100% contrast) were presented centrally for 100 ms with a 500 ms ISI. Grating orientations were evenly spaced between 0° (horizontal) and 160° (in 20° steps). Eighteen (18) gratings in each block (2 per orientation) were presented with a higher spatial frequency (range: 2.73–4.55 c/°, as per staircase procedure, below), with a gap of at least 1.5 seconds between any two such gratings. We used a modified de Bruijn sequence to balance the order of grating orientations across conditions, sessions, and participants. Specifically, we generated two 9-character (orientation) sequences without successive repetitions (e.g., ABCA, not ABCC)—one with a 3-character subsequence (504 characters long) and another with a 2-character subsequence (72 characters long)—and appended two copies of the former sequence to three copies of the latter sequence (1,224 characters in total). 
This master sequence was used to allocate the order of both deviants and controls in each session (using different, random start-points) and ensured that each orientation was preceded by equal numbers of all other orientations (up to 2+ preceding stimuli) so that decoding of any specific orientation could not be biased by the orientation of preceding stimuli. In roving oddball sequences, the number of Gabor repetitions (i.e., standards) was balanced across orientations within each session, such that each orientation repeated between 4 and 11 times according to the following distribution: (31, 31, 31, 23, 5, 5, 5, 5), respectively. During each block, the fixation dot (diameter: 0.3°, 100% contrast) decreased in contrast 18 times (contrast range: 53%–98% as per staircase procedure, below) for 0.5 seconds (0.25-second linear ramp on and off). Contrast decrement onsets were randomised separately for each block, with a gap of at least 1.5 seconds between any two decrement onsets. Procedure Participants attended two testing sessions of 60 minutes’ duration, approximately 1 week apart, and completed one of two tasks in each session (Fig 1, session order counterbalanced across participants). For the grating task, participants were informed that approximately 1 out of 20 of the gratings would be a target grating with a higher spatial frequency than nontargets and were asked to press a mouse button as quickly as possible when they detected a target grating; all other gratings were to be ignored. For the dot task, participants were informed that the fixation dot would occasionally decrease in contrast and were asked to press a mouse button as quickly as possible when they detected such a change. Participants initially completed three practice blocks (3.5 min per block) with auditory feedback (high or low tones) indicating missed targets and the accuracy of their responses. 
During practice blocks in the first testing session, target salience (spatial frequency or dot contrast change, depending on the task) was adjusted dynamically using a Quest staircase procedure [49] to approximate 75% target detection. During practice blocks in the second testing session, target salience was adjusted to approximate the same level of target detection observed in the first testing session. Participants were requested to minimise their number of false alarms. After the practice blocks, participants were fitted with an EEG cap (see ‘EEG data acquisition’) before completing a total of 21 test blocks (3 equiprobable, 18 roving standard, block order randomised) without auditory feedback. After each block, participants were shown the percentage of targets correctly detected, the speed of these responses, and how many nontargets were responded to (false alarms). Behavioural data analysis Participant responses were scored as hits if they occurred within 1 second of the onset of a target grating in the grating task, or within 1 second of the peak contrast decrement in the dot task. Target detection was then expressed as a percentage of the total number of targets presented in each testing session. One participant detected fewer than 50% of targets in both sessions and was removed from further analysis. Target detections and false alarms across the two sessions were compared with paired-samples t tests and Bayes factors. Bayes factors allow for quantification of evidence in favour of either the null or alternative hypothesis, with B01 > 3 indicating substantial support for the alternative hypothesis and B01 < 0.33 indicating substantial support for the null hypothesis [50]. Bayes factors were computed using the Dienes [50,51] calculator in Matlab, with uniform priors for target detection (lower bound: −25%; upper bound: 25%) and false alarms (lower bound: −50; upper bound: 50).
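As an illustration of how a Bayes factor with a uniform prior can be obtained, the following minimal Python sketch compares the likelihood of an observed mean difference under an alternative hypothesis (effect uniformly distributed between a lower and an upper bound) with its likelihood under the null (effect of zero). It assumes a normal likelihood for the observed mean; it is not the Dienes calculator itself, and the function names and the example values in the usage note are hypothetical.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def normal_cdf(x, mean, sd):
    """Cumulative normal distribution at x."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))


def bayes_factor_uniform(obs_mean, obs_se, lower, upper):
    """Bayes factor for H1 (effect uniform on [lower, upper]) over H0 (effect = 0).

    Assumption: the observed mean is normally distributed around the true
    effect with standard deviation obs_se.
    """
    # Marginal likelihood under H1: the normal likelihood averaged over the
    # uniform prior, which has a closed form in terms of the normal CDF.
    like_h1 = (normal_cdf(upper, obs_mean, obs_se)
               - normal_cdf(lower, obs_mean, obs_se)) / (upper - lower)
    # Likelihood of the data if the true effect is exactly zero.
    like_h0 = normal_pdf(obs_mean, 0.0, obs_se)
    return like_h1 / like_h0
```

For example, with the target-detection bounds given above (−25% to 25%), a hypothetical mean difference of 0% with a standard error of 2% yields a value well below 0.33, whereas a hypothetical difference of 10% with the same standard error yields a value far above 3.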
EEG data acquisition Participants were fitted with a 64-electrode Ag-AgCl EEG system (BioSemi Active Two: Amsterdam, the Netherlands). Continuous data were recorded using BioSemi ActiView software (http://www.biosemi.com) and were digitized at a sample rate of 1,024 Hz with 24-bit A/D conversion and a 0.01–208 Hz amplifier band pass. All scalp electrode offsets were adjusted to below 20 μV prior to beginning the recording. Pairs of flat Ag-AgCl electro-oculographic electrodes were placed on the outside of both eyes, and above and below the left eye, to record horizontal and vertical eye movements, respectively. EEG data preprocessing EEG recordings were processed offline using the EEGlab toolbox in Matlab [23]. Data were resampled to 256 Hz and high-pass filtered with a passband edge at 0.5 Hz (1691-point Hamming window, cut-off frequency: 0.25 Hz, −6 dB). Raw data were inspected for the presence of faulty scalp electrodes (2 electrodes, across 2 sessions), which were interpolated using the average of the neighbouring activations (neighbours defined according to the EEGlab Biosemi 64 template). Data were re-referenced to the average of all scalp electrodes, and line noise at 50 and 100 Hz was removed using the Cleanline plugin for EEGlab (https://www.nitrc.org/projects/cleanline). Continuous data were visually inspected, and periods of noise (e.g., muscle activity) were removed (1.4% of data removed in this way, across sessions). For artefact identification, the cleaned data were segmented into 500 ms epochs surrounding grating onsets (100 ms pre- and 400 ms post-stimulus). Improbable epochs were removed using a probability test (6 SD for individual electrode channels, 2 SD for all electrode channels, 6.5% of trials across sessions), and the remaining data were subjected to independent components analyses (ICAs) with a reduced rank in cases of a missing EOG electrode (2 sessions) or an interpolated scalp electrode (2 sessions).
Components representing blinks, saccades, and muscle artefacts were identified using the SASICA plugin for EEGlab [52]. For further analysis, the cleaned data (i.e., prior to the ICA analysis) were segmented into 800 ms epochs surrounding grating onsets (150 ms pre- and 650 ms post-stimulus). Independent component weights from the artefact identification process were applied to this new data set, and previously identified artefactual components were removed. Baseline activity in the 100 ms prior to each stimulus was removed from each epoch. Grating epochs were then separated into their respective attention and prediction conditions. Epochs in the grating task were labelled as ‘Attended’ and epochs in the dot task were labelled as ‘Ignored’. Epochs in the roving oddball sequence were labelled as ‘Deviants’ when they contained the first stimulus in a repeated train of gratings and ‘Standards’ when they contained a grating that had been repeated between 5 and 7 times. Epochs in the equiprobable sequence were labelled as ‘Controls’. ERP analyses Trials in each attention and prediction condition were averaged within participants to produce ERPs for each individual. The effect of attention was assessed using a two-tailed cluster-based permutation test across participant ERPs (Monte-Carlo distribution with 5,000 permutations, p cluster < 0.05; sample statistic: dependent samples t statistic, aggregated using the maximum sum of significant adjacent samples, p sample < 0.05). Because there were 3, rather than 2, levels of prediction, we tested the effect of prediction with a cluster-based permutation test that used f-statistics at the sample level and a one-sided distribution to account for the positive range of f-statistics (Monte-Carlo distribution with 5,000 permutations, p cluster < 0.05; sample statistic: dependent samples f-statistic, aggregated using the maximum sum of significant adjacent samples, p sample < 0.05). 
Simple contrasts between prediction conditions (deviants versus standards, and deviants versus controls) were tested using two-tailed cluster-based permutation tests (with the same settings as used to investigate attention). The interaction between attention and prediction was assessed by subtracting the ignored ERP from the attended ERP within each prediction condition and subjecting the resulting difference waves to a one-tailed cluster-based permutation test across participant ERPs (Monte-Carlo distribution with 5,000 permutations, p cluster < 0.05; sample statistic: dependent samples f-statistic, aggregated using the maximum sum of significant adjacent samples, p sample < 0.05). The interaction effect was followed up by comparing difference waves (attended minus ignored) between deviants and standards, and between deviants and controls (two-tailed cluster-based permutation tests, same settings as above). Forward encoding models To investigate the informational content of orientation signals, we used a forward encoding model [29,53] designed to control for noise covariance in highly correlated data [31,54] (https://github.com/Pim-Mostert/decoding-toolbox), such as EEG. We modelled an idealised basis set of the 9 orientations of interest (0°–160° in 20° steps) with nine half-wave rectified cosine functions raised to the 8th power, such that the response profile associated with any particular orientation in the 180° space could be equally expressed as a weighted sum of the nine modelled orientation channels [29]. We created a matrix of nine regressors that represented the grating orientation presented on each trial in the training set (1 = the presented orientation; 0 = otherwise) and convolved this regressor matrix with the basis set to produce a design matrix, C (9 orientation channels × n trials). 
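The basis set and design matrix described above can be sketched in Python as follows. This is a minimal illustration using NumPy, not the toolbox code: the function names are hypothetical, and the exact parameterisation of the tuning curves is an assumption (here the angular difference is doubled so that each half-wave rectified cosine is periodic over the 180° orientation space).

```python
import numpy as np

# Nine orientation channels, 0 to 160 degrees in 20-degree steps.
ORIENTATIONS = np.arange(0, 180, 20)


def basis_response(theta_deg, centres_deg=ORIENTATIONS, power=8):
    """Idealised response of each channel to orientation(s) theta_deg.

    Assumption: the angle is doubled so that tuning wraps at 180 degrees.
    Returns an array of shape (n_orientations, n_channels).
    """
    theta = np.atleast_1d(np.asarray(theta_deg, dtype=float))
    delta = np.deg2rad(2.0 * (theta[:, None] - centres_deg[None, :]))
    # Half-wave rectification (negative lobes set to zero), then the power.
    return np.maximum(np.cos(delta), 0.0) ** power


def design_matrix(trial_orientations_deg):
    """Build C (9 channels x n trials) from the orientation shown on each trial."""
    idx = (np.asarray(trial_orientations_deg) // 20).astype(int)
    n_trials = idx.size
    # One regressor per orientation: 1 = presented orientation, 0 = otherwise.
    regressors = np.zeros((ORIENTATIONS.size, n_trials))
    regressors[idx, np.arange(n_trials)] = 1.0
    # Weight the regressors by the basis set to obtain channel responses.
    basis = basis_response(ORIENTATIONS)  # (9 orientations x 9 channels)
    return basis.T @ regressors
```

Calling `design_matrix([0, 60, 160])` returns a 9 × 3 matrix in which each column peaks at the channel matching the presented orientation and falls off smoothly for neighbouring channels, wrapping around between 160° and 0°.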
The EEG data could thus be described by the linear model: $B = WC + N$, such that $B$ represents the data (64 electrodes × n trials), $W$ represents a spatial weight matrix that converts activity in channel space to activity in electrode space (64 electrodes × 9 orientation channels), and $N$ represents the residuals (i.e., noise). To train and test the forward encoding model, we used a 3-fold cross-validation procedure that was iterated 100 times to increase reliability of the results. Within each cross-validation iteration, the experimental blocks were folded into thirds: one-third of trials served as the test set, and the remaining two-thirds served as the training set, and folds were looped through until each fold had served as a test set. Across successive iterations of the cross-validation procedure, the number of trials in each condition was balanced within folds by random selection (on the first iteration) or by selecting the trials that had been utilised the least across previous folds (subsequent iterations). Prior to estimating the forward encoding model, each electrode in the training data was de-meaned across trials, and each time point was averaged across a 27.3 ms window centred on the time point of interest (corresponding to an a priori window of 30 ms, rounded down to an odd number of samples to prevent asymmetric centring). Separately for each time point and orientation channel of interest, $i$, we solved the linear equation using least squares regression: $w_i = B_{\text{train}} c_{\text{train},i}^{T} (c_{\text{train},i} c_{\text{train},i}^{T})^{-1}$, such that $w_i$ represents the spatial weights for channel $i$, $B_{\text{train}}$ represents the training data (64 electrodes × $n_{\text{train}}$ trials), and $c_{\text{train},i}$ represents the hypothetical response of channel $i$ across the training trials (1 × $n_{\text{train}}$ trials). 
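The per-channel least squares step can be sketched in a few lines of Python. Because the channel response is a single row vector, the regression reduces to a ratio of dot products. This is a minimal sketch assuming ordinary least squares with one regressor; the helper names and the toy data (2 electrodes, 4 trials) are hypothetical.

```python
def demean_rows(data):
    """Remove each electrode's mean across trials (rows = electrodes)."""
    out = []
    for row in data:
        mean = sum(row) / len(row)
        out.append([x - mean for x in row])
    return out


def channel_weights(b_train, c_train_i):
    """Least squares spatial weights for one channel: B c^T (c c^T)^-1."""
    # c c^T is a scalar because c is a single row (1 x n trials).
    denom = sum(c * c for c in c_train_i)
    # B c^T gives one value per electrode.
    return [sum(b * c for b, c in zip(row, c_train_i)) / denom
            for row in b_train]


# Toy training data: 2 electrodes x 4 trials (the study used 64 electrodes).
b_train = [[2.0, 0.0, 2.0, 0.0],
           [-1.0, 0.0, -1.0, 0.0]]
# Hypothetical response of channel i on each training trial.
c_train_i = [1.0, 0.0, 1.0, 0.0]

w_raw = channel_weights(b_train, c_train_i)
w_demeaned = channel_weights(demean_rows(b_train), c_train_i)
```

Comparing `w_raw` with `w_demeaned` shows how de-meaning each electrode across trials changes the estimated weights even when the channel regressor is left untouched.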
Following Mostert and colleagues [54], we then derived the optimal spatial filter $v_i$ to recover the activity of the $i$th orientation channel: $v_i = \frac{\Sigma_i^{-1} w_i}{w_i^{T} \Sigma_i^{-1} w_i}$, such that $\Sigma_i$ is the regularized covariance matrix for channel $i$, estimated as follows: $\Sigma_i = \frac{1}{n_{\text{train}} - 1} \varepsilon_i \varepsilon_i^{T}$, with $\varepsilon_i = B_{\text{train}} - w_i c_{\text{train},i}$, such that $n_{\text{train}}$ is the number of training trials. The covariance matrix was regularized by using the analytically determined shrinkage parameter [31]. Combining the spatial filters across each of the nine orientation channels produced a channel filter matrix $V$ (64 electrodes × 9 channels). The orientation channel responses in the test set were then estimated as: $C_{\text{test}} = V^{T} B_{\text{test}}$, such that $B_{\text{test}}$ represents the test data at the time point of interest (64 electrodes × $n_{\text{test}}$ trials), averaged over a 27.3 ms window (as per the training data). Finally, the orientation channel responses for each trial were circularly shifted to centre the presented orientation on 0°, and the zero-centred responses were averaged across trials within each condition to produce the condition-average orientation channel response (Fig 3B). To assess information related to the mismatch between predicted and observed stimulus features (Fig 3D and 3E), we computed a second forward encoding model as above, with the exception that now the regressor matrix represented the difference between the current grating orientation (deviant or control) and the previous grating orientation (standard or control, respectively). That is, a grating at 60° orientation that followed a grating at 20° orientation would be coded as 40° (current minus previous orientation). To assess the dynamic nature of mismatch response profiles (Fig 5), we trained the weight matrix, $W$, at a single time point in the training set, $B_1$ (using a 27.3 ms sliding window) and then applied the weights to every third time point in the test set, $B_2$ (using a 27.3 ms sliding window). 
This process was repeated for every third time point in the training set, resulting in a three-dimensional matrix that contained the population response profile at each cross-generalised time point (9 orientations × 66 training time points × 66 testing time points). Quantifying channel responses Previous studies have utilised a number of different methods to quantify the selectivity of neural response profiles [30,31]. Because we were interested in characterising the properties of neural response profiles, we opted to fit an exponentiated cosine function to the modelled data [33,34] using least squares regression: $y = A e^{\kappa(\cos 2(x - \mu) - 1)} + B$, such that $y$ is the predicted orientation channel activity in response to a grating with orientation $x$; $A$ is the peak response amplitude, $\kappa$ is the concentration parameter, $\mu$ is the centre of the distribution, and $B$ is the baseline offset. Fitting was performed using the nonlinear least squares method in Matlab (trust region reflective algorithm). The free parameters $A$, $\kappa$, and $B$ were constrained to the ranges (−0.5, 2), (1.5, 200), and (−1.0, 0.5), respectively, and initiated with the values 0.5, 2, and 0, respectively. The free parameter $\mu$ was constrained to be zero when quantifying mean-centred orientation or mismatch response profiles (which should be centred on zero, Figs 3, 4A and 4B). When quantifying individual (uncentred) mismatch channel response profiles (Fig 4C and 4D), the free parameter $\mu$ was allowed to vary between −90° and 90°. To reduce the likelihood of spurious (inverted) fits, the parameter search was initiated with a $\mu$ value centred on the channel with the largest response. The main effects of attention and prediction on orientation or mismatch response profiles were assessed with cluster-based permutation tests across participant parameters (amplitude, concentration). 
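For illustration, the exponentiated cosine can be written as a short Python helper, together with the residual sum of squares that a fitting routine would minimise. The sketch assumes the 180°-periodic form y = A·exp(κ(cos 2(x − μ) − 1)) + B and uses the initial values quoted above (A = 0.5, κ = 2, B = 0). It only evaluates the function and does not reproduce the Matlab trust-region-reflective fit; the function names are hypothetical.

```python
import math


def exp_cosine(x_deg, amplitude, kappa, mu_deg=0.0, baseline=0.0):
    """Exponentiated cosine over a 180-degree orientation space (assumed form)."""
    # Doubling the angle maps the 180-degree space onto one full cosine cycle.
    angle = math.radians(2.0 * (x_deg - mu_deg))
    return amplitude * math.exp(kappa * (math.cos(angle) - 1.0)) + baseline


def residual_sum_of_squares(channel_degs, responses, **params):
    """Sum of squared differences between observed and modelled responses."""
    return sum((r - exp_cosine(x, **params)) ** 2
               for x, r in zip(channel_degs, responses))


# Zero-centred channel offsets: -80 to 80 degrees in 20 degree steps.
CHANNELS = list(range(-80, 100, 20))

# A noise-free profile generated from the starting values of the fit.
profile = [exp_cosine(x, amplitude=0.5, kappa=2.0) for x in CHANNELS]
rss = residual_sum_of_squares(CHANNELS, profile, amplitude=0.5, kappa=2.0)
```

In this form the response at the centre of the distribution equals A + B, and larger κ values give a narrower (more concentrated) profile, which is why amplitude and concentration can be compared separately across conditions.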
The interaction effects (between attention and prediction) on orientation and mismatch response profiles were assessed by first subtracting the ignored response from the attended response and then subjecting the resulting difference maps to cluster-based permutation tests. In cases where two levels were compared (i.e., the main effect of attention on orientation response profiles, and all effects on mismatch response profiles), we used two-tailed cluster-based permutation tests across participant parameters (Monte-Carlo distribution with 5,000 permutations, p cluster < 0.05; sample statistic: dependent samples t statistic, aggregated using the maximum sum of significant adjacent samples, p sample < 0.05). In cases where three levels were compared (i.e., the main effect of prediction and the interaction effect on orientation response profiles), we used one-tailed cluster-based permutation tests across participant parameters (Monte-Carlo distribution with 5,000 permutations, p cluster < 0.05; sample statistic: dependent samples f-statistic, aggregated using the maximum sum of significant adjacent samples, p sample < 0.05) and followed up any significant effects by collapsing across significant timepoints and comparing individual conditions with paired-samples t tests and Bayes Factors (uniform prior, lower bound: −0.3 a.u., upper bound: 0.3 a.u.). Univariate electrode sensitivity To determine which electrodes were most informative for the forward encoding analyses, we tested the sensitivity of each electrode to both orientation and mismatch information (Fig 3C and 3F). The baseline-corrected signal at each electrode and time point in the epoch was regressed against a design matrix that consisted of the sine and cosine of the variable of interest (orientation or mismatch), and a constant regressor [30]. 
We calculated sensitivity, $S$, using the square of the sine ($\beta_{\text{SIN}}$) and cosine ($\beta_{\text{COS}}$) regression coefficients: $S = \sqrt{\beta_{\text{SIN}}^{2} + \beta_{\text{COS}}^{2}}$. $S$ was normalised against a null distribution of the values expected by chance. The null distribution was computed by shuffling the design matrix and repeating the analysis 1,000 times. The observed (unpermuted) sensitivity index was ranked within the null distribution (to produce a p-value) and z-normalised using the inverse of the cumulative Gaussian distribution (μ = 0; σ = 1). The topographies shown in Fig 3C and 3F reflect the group averaged z-scores, averaged across each time period of interest.
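The sensitivity index and its normalisation against a shuffled null distribution can be sketched as follows. Several details are assumptions of this sketch rather than statements from the text: the angle is doubled so that the 180° space spans a full cycle, S is taken as the square root of the summed squared coefficients, the design is balanced (so each coefficient is a simple projection), and the rank is offset slightly so the p-value stays strictly between 0 and 1. All function names and the simulated signal are hypothetical.

```python
import math
import random
from statistics import NormalDist


def sensitivity(signal, angles_deg):
    """S = sqrt(beta_sin^2 + beta_cos^2), assuming a balanced design.

    With every orientation presented equally often, the sine, cosine and
    constant regressors are orthogonal, so each beta is a projection.
    """
    mean = sum(signal) / len(signal)  # accounts for the constant regressor
    sin_reg = [math.sin(math.radians(2 * a)) for a in angles_deg]
    cos_reg = [math.cos(math.radians(2 * a)) for a in angles_deg]
    beta_sin = (sum((y - mean) * s for y, s in zip(signal, sin_reg))
                / sum(s * s for s in sin_reg))
    beta_cos = (sum((y - mean) * c for y, c in zip(signal, cos_reg))
                / sum(c * c for c in cos_reg))
    return math.hypot(beta_sin, beta_cos)


def z_against_null(signal, angles_deg, n_perm=1000, seed=0):
    """Rank the observed S within a shuffled null and convert to a z-score."""
    rng = random.Random(seed)
    observed = sensitivity(signal, angles_deg)
    shuffled = list(angles_deg)
    below = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break the link between signal and design
        if sensitivity(signal, shuffled) < observed:
            below += 1
    # Offset keeps p strictly inside (0, 1) so the inverse CDF is defined.
    p = (below + 0.5) / (n_perm + 1)
    return NormalDist(mu=0.0, sigma=1.0).inv_cdf(p)


# Simulated electrode: 9 orientations x 20 repeats, cosine tuning plus noise.
noise = random.Random(1)
angles = [a for a in range(0, 180, 20) for _ in range(20)]
signal = [2.0 * math.cos(math.radians(2 * a)) + noise.gauss(0.0, 1.0)
          for a in angles]
z = z_against_null(signal, angles)
```

An electrode with no orientation information would yield z-scores scattered around zero, whereas a strongly tuned electrode such as the simulated one here should rank near the top of its null distribution.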

Supporting information S1 Fig. ERPs and MMRs. (A) ERPs at selected electrodes, shown separately for each condition. Bars underneath each plot indicate time points at which there was a significant main effect of attention (solid grey bar), significant main effect of prediction (solid black bar), or a significant interaction between attention and prediction (dotted black bar) at the plotted electrode. (B) Classic MMR (deviants minus standards) and genuine MMR (deviants minus controls) at selected electrodes, plotted separately for each level of attention. Green and yellow lines denote the classic MMR and genuine MMR, respectively; solid and dashed lines denote attended and ignored stimuli, respectively. Bars underneath each plot indicate time points at which there was a significant MMR in the corresponding condition, at the plotted electrode. Attended deviants were significantly different from attended standards (39–504 ms, cluster-corrected p < 0.001) and attended controls (172–550 ms, cluster-corrected p < 0.001). Ignored deviants were significantly different from ignored standards (47–438 ms, cluster-corrected p < 0.001) and ignored controls (285–461 ms, cluster-corrected p = 0.001). (C–I) Topographies of effects collapsed across time points between 200 and 300 ms. Asterisks and dots denote electrodes with larger or smaller responses, respectively, in at least 25% of the displayed time points. (C) Main effect of attention (attended minus ignored). (D) Classic MMR (deviants minus standards). (E) Genuine MMR (deviants minus controls). (F) Classic MMR during the grating task (attended deviants minus attended standards). (G) Classic MMR during the dot task (ignored deviants minus ignored standards). (H) Genuine MMR during the grating task (attended deviants minus attended controls). (I) Genuine MMR during the dot task (ignored deviants minus ignored controls). ERP, event-related potential; MMR, mismatch response. 
https://doi.org/10.1371/journal.pbio.2006812.s001 (TIF) S2 Fig. Independent main effects of attention and prediction on orientation response profiles, showing standards, deviants, and controls. (A) Main effect of attention on orientation response profiles. The amplitude of attended gratings was larger than that of ignored gratings (219–550 ms, cluster-corrected p = 0.001). Shading denotes standard error of the mean. The black bar along the x-axis denotes significant time points. (B) Orientation response profiles, collapsed across significant time points in A. Dots show activation in each of the nine modelled orientation channels. Curved lines show the functions used to quantify the amplitude and concentration of orientation-tuned responses (fitted to grand average data for illustrative purposes). (C) Main effect of prediction on orientation response profiles (black bar along the x-axis denotes significant time points, 324–550 ms, cluster-corrected p < 0.001). The amplitude of standards was reduced relative to both deviants and controls. (D) Orientation response profiles, collapsed across significant time points in C. (E) Interaction between attention and prediction on orientation response profile amplitude. Time-courses show the effect of attention (attended minus ignored) on each stimulus type. (F) Orientation response profiles, collapsed across time points in the nonsignificant but trending cluster in E (414–481 ms, not displayed, cluster-corrected p = 0.093). https://doi.org/10.1371/journal.pbio.2006812.s002 (TIF) S3 Fig. RSS for exponentiated cosine functions fitted to generalised mismatch response profiles (Fig 5). Note the high RSS values along the x-axis beginning at 200 ms, indicating that the apparent generalisation of spatial maps trained at stimulus onset to later times in the epoch (Fig 5, red patch along the x-axis) was likely due to noise. RSS, residual sum of squares. https://doi.org/10.1371/journal.pbio.2006812.s003 (TIF)

[END]
---
[1] Url: https://journals.plos.org/plosbiology/article?id=10.1371/journal.pbio.2006812

Published and (C) by PLOS One
Content appears here under this condition or license: Creative Commons - Attribution BY 4.0.
