Since we are using random outputs, the classification ability depends only on the structure of the input. In our case, generalization is the ability to respond in the same way to multiple noisy variations of the inputs. The reason for the failure of the formula in the ultra-sparse case is that γ and σ² are actually approximations of Γ and Σ², quantities that are not directly accessible experimentally (see Materials and Methods). Each input is represented by a population of binary neurons (Fig. 2A), each of which can be inactive (−1) or active (1). The challenge is integrating sources of information in the presence of noise. In particular, γ expresses the deviation from a linear transformation. In a previous study (2005), a cell was considered to be selective to a particular stimulus if its response was at least five standard deviations above the baseline. Estimates of the sparseness of neural representations recorded in the brain vary over a wide range, depending on the method used to define the coding level, the sensory stimuli used, and the brain area considered. In the limit of very large neural systems (red lines), the classification error would keep increasing as the representations become sparser, and this would compensate for the decrease in the ability to discriminate. Equation 13 provides a recipe for estimating γ from experimental data. In particular, the performance depends on the input correlations, which in our case are the correlations between the vectors representing the input patterns of activity. It is important to note that, for simplicity, we analyzed the case in which all combinations of input patterns are considered. Our results likely have important implications for the dynamics of models that rely on RCNs to expand the dimensionality of the input, even outside the feedforward realm. In our model, f is controlled by varying the threshold for the activation of the RCNs.
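The relation between the RCN activation threshold and the coding level f can be sketched numerically. The following is a minimal illustration, not the paper's actual simulation code; the network sizes, the Gaussian random weights, and the hard threshold are our assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
N, n_rcn, p = 500, 2000, 100           # input neurons, RCNs, input patterns (illustrative)
X = rng.choice([-1, 1], size=(p, N))   # binary input patterns: inactive (-1) or active (1)
W = rng.normal(size=(N, n_rcn))        # fixed random input-to-RCN weights

def coding_level(theta):
    """Average fraction of inputs that drive an RCN above threshold theta."""
    h = X @ W / np.sqrt(N)             # normalized synaptic input, approx. N(0, 1) per unit
    return float((h > theta).mean())

f_dense = coding_level(0.0)            # threshold at zero: f close to 0.5
f_sparse = coding_level(1.5)           # raising the threshold sparsifies the code
```

Raising theta monotonically lowers f, which is the knob the text refers to when it says f is controlled by the activation threshold of the RCNs.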
In general, the answer depends on several factors, which include the statistics of the quantity that is represented, the task to be executed, and the neural readout that utilizes the representation. As stated above, it is useful to expand the pattern indices into their constituents: μ = (x, a), ν = (y, b). Σ² is the average noise of the patterns in the direction perpendicular to the decision hyperplane. These types of shifts have been studied systematically in experiments aimed at understanding the role of the dentate gyrus (DG) and CA3 in pattern separation and pattern completion (Sahay et al., 2011). The optimal coding level hardly shifts, but the relative advantage of being at the minimum increases as noise increases. Figure 4E shows the two-dimensional input distribution to all RCNs for two inputs that share the same value for one of the two information sources (i.e., half of all input neurons are the same). Indeed, it scales as m, whereas p grows like m². As illustrated in Figure 1D, the transformation of the inputs performed by the RCN layer has to decorrelate them sufficiently to ensure linear separability, while keeping the representations of different versions of the same input similar enough to allow for generalization. The crosses and circles represent the patterns to be classified, as in Figure 1D. This result is especially important in cases where N_RCN and p are unknown but fixed, for instance when a neuromodulator changes the activity level of the network. However, it indicates that our study could also be relevant in the case in which the neurons of the hidden layer are highly plastic. Although distances are distorted, their ranking is preserved (e.g., small distances map to small distances).
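The correlation structure produced by composing two segregated sources can be made concrete with a short sketch. Following the text's convention that a composite pattern μ = (x, a) combines one state per source, we concatenate the two source patterns; the sizes and random ±1 patterns are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)
m, N = 5, 500                          # states per source, neurons per source (illustrative)
A = rng.choice([-1, 1], size=(m, N))   # patterns of source 1 (e.g., context)
B = rng.choice([-1, 1], size=(m, N))   # patterns of source 2 (e.g., stimulus)

# composite pattern mu = (x, a): concatenation of one state from each source
P = np.array([np.concatenate([A[x], B[a]]) for x in range(m) for a in range(m)])

overlaps = P @ P.T / (2 * N)           # pairwise correlations between input vectors
shared = overlaps[0, 1]                # (0,0) vs (0,1): same context -> overlap near 0.5
unshared = overlaps[0, m + 1]          # (0,0) vs (1,1): nothing shared -> overlap near 0
```

Patterns that share one source overlap on half of all input neurons, which is exactly the high-correlation regime that the RCN layer must decorrelate.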
In the future, it will be interesting to analyze specific tasks and determine to what extent our simple model can explain the observed consequences of the shifts in the balance between pattern separation and pattern completion. Intelligent behavior requires integrating several sources of information in a meaningful fashion, be it context with stimulus or shape with color and size. The colored lines denote the values used in A. In a similar manner, (Δ2)² is the average squared difference between the firing rates corresponding to the cases in which both sources are different (e.g., different sensory stimuli appearing in different contexts, right cluster of arrows). Nevertheless, as an approximation we consider the minimal eigenvalue of the average matrix M̄ = 〈M〉ξ. We approximate the average test error by inserting the averages inside the nonlinearity, which is somewhat reasonable given that we are interested in errors far from saturation effects. The number of neurons representing each source, N = 500, is significantly larger than the number of states. Figure 5A shows numerical results for the fraction of errors made by a linear readout from a population of RCNs when the activity of a random 5% (light) or 17.5% (dark) of the source neurons is flipped in each presentation. The original segregated representations are not linearly separable for a classification problem analogous to the exclusive OR (opposite points should produce the same output). Inspired by the results on the effects of the dimensional expansion performed in support vector machines (Cortes and Vapnik, 1995), many researchers realized that recurrent networks with randomly connected neurons can generate surprisingly rich dynamics and perform complex tasks even when the readout is just linear (Jaeger, 2001; Maass et al., 2002; Buonomano and Maass, 2009; Sussillo and Abbott, 2009; Rigotti et al., 2010b).
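The exclusive-OR structure can be checked directly: a perceptron cannot separate the four segregated patterns, but it typically can after a random nonlinear expansion. This is a hypothetical sketch; the perceptron learning rule and the 50-unit expansion are our choices for illustration, not the paper's exact setup:

```python
import numpy as np

rng = np.random.default_rng(2)

# four segregated inputs (two binary sources) with XOR-like labels:
X = np.array([[-1., -1.], [-1., 1.], [1., -1.], [1., 1.]])
y = np.array([-1, 1, 1, -1])          # opposite corners get the same label

def perceptron(X, y, epochs=1000):
    """Return (weights, converged) for the classic perceptron rule."""
    Xb = np.hstack([X, np.ones((len(X), 1))])   # append a bias input
    w = np.zeros(Xb.shape[1])
    for _ in range(epochs):
        errors = 0
        for xi, yi in zip(Xb, y):
            if yi * (xi @ w) <= 0:
                w += yi * xi
                errors += 1
        if errors == 0:               # a full error-free pass: linearly separable
            return w, True
    return w, False

_, sep_raw = perceptron(X, y)         # XOR in the input space: never converges

# RCN-like expansion: 50 random threshold units
W = rng.normal(size=(2, 50))
H = np.where(X @ W > 0.5, 1.0, -1.0)
_, sep_rcn = perceptron(H, y)         # separable after the expansion (with high probability)
```

The expansion makes the four points affinely independent in the RCN space, so any labeling of them, including the XOR one, becomes linearly separable.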
It is analogous to our definition of pattern discrimination, suggesting that the neurons in the DG may have properties similar to those of our RCNs. Because neurons within each source encode only that specific source, the dimensionality is always smaller than p (see Materials and Methods). This is often the only viable approach when it is not known how the representations are used or read out by downstream structures (e.g., in the case of early sensory areas). There are situations, such as those studied in recurrent networks (Bourjaily and Miller, 2011), in which different forms of synaptic plasticity can lead to either beneficial or disruptive effects. The mean firing rates for each combination of input sources are calculated (bars below the rasters). The difficulty stems from the high correlations between input patterns that differ only by the state of one segregated information source (e.g., the one encoding the context). A, B, Comparison of the actual test error with the one predicted by Equation 20. The dimensionality is more formally defined as the rank of the matrix that contains all the vectors representing the p = m² different inputs (see Materials and Methods). Some of the nonlinearities must be implemented at the level of the soma and expressed by the recordable firing rates, as mixed selectivity is observed in extracellular recordings. The noise added to the patterns of activity is independent for each neuron, as in studies on generalization in attractor neural networks and pattern completion (see Discussion for more details). These are typical situations in almost every brain area, especially in those integrating inputs from multiple brain systems, such as the prefrontal cortex (Miller and Cohen, 2001). The decorrelation performed by the RCNs is peculiar in that it not only increases the dissimilarity between inputs, but also makes the neural representations linearly separable.
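The rank-based definition of dimensionality can be illustrated with a short sketch. Here the segregated inputs are built by concatenating two sources, as described in the text; the sizes and the zero-threshold dense expansion are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(3)
m, N = 5, 500                          # states per source, neurons per source (illustrative)
A = rng.choice([-1, 1], size=(m, N))
B = rng.choice([-1, 1], size=(m, N))
P = np.array([np.concatenate([A[x], B[a]]) for x in range(m) for a in range(m)])

# segregated inputs: rank is 2m - 1, far below the number of patterns p = m**2
rank_input = np.linalg.matrix_rank(P)

# dense RCN expansion: random weights followed by a threshold at zero
W = rng.normal(size=(2 * N, 200))
H = np.where(P @ W > 0.0, 1.0, -1.0)
rank_rcn = np.linalg.matrix_rank(H)    # grows toward p = m**2
```

The segregated matrix has rank 2m − 1 because every row is the sum of one source-1 vector and one source-2 vector, while the nonlinearity of the RCNs breaks these linear dependencies and raises the rank toward p.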
We show that the threshold of the RCNs, which determines their coding level (i.e., the average fraction of stimuli to which each individual neuron responds), controls a tradeoff between generalization and discrimination. First, there is accumulating experimental evidence that, for some neural systems, random connectivity is an important representational and computational substrate (Stettler and Axel, 2009). In the dense case, in which the RCNs are activated by half of all possible inputs, if the number p of input patterns is sufficiently large, every RCN adds, on average, one dimension to the representations. Discriminating RCNs were defined as those that had a different activity level in response to two patterns differing by only one subpattern. We first consider generalization. For this reason, in what follows we consider only the case in which all the combinations have to be classified. Previous work evaluated neural representations on the basis of the information they encode (Atick and Redlich, 1992; Jazayeri and Movshon, 2006). The matrices multiplied by γ1 and γ2 are both of low rank and thus do not contribute to the minimal eigenvalue. To verify this, we considered m1 = m2 = 5 and a subset of p̃ = 1, …, 25 patterns. We also trained a readout using online learning from noisy inputs; while the error decreased, the qualitative features we report did not differ. This shows that it is in fact only of rank 4. The spatial arrangement of the four points is a consequence of the correlations between all the patterns (e.g., AC has a large overlap or correlation with AD, as the first source is in the same state).
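The tradeoff can be illustrated by comparing, at two thresholds, how similar the RCN representations remain for (1) a pattern and its noisy version (generalization) and (2) two patterns sharing one source (discrimination). This is a hedged sketch; the sizes, the 5% noise level, and the two thresholds are chosen for illustration:

```python
import numpy as np

rng = np.random.default_rng(4)
N, n_rcn = 1000, 5000
W = rng.normal(size=(N, n_rcn)) / np.sqrt(N)        # fixed random weights

x = rng.choice([-1, 1], size=N)                     # reference input
x_noisy = np.where(rng.random(N) < 0.05, -x, x)     # noisy version: 5% of neurons flipped
x_related = np.concatenate(
    [x[:N // 2], rng.choice([-1, 1], size=N // 2)]) # shares one of the two sources

def rcn(v, theta):
    """Binary RCN representation for threshold theta."""
    return (v @ W > theta).astype(float)

def corr(u, v):
    return float(np.corrcoef(u, v)[0, 1])

gen_dense  = corr(rcn(x, 0.0), rcn(x_noisy, 0.0))
dis_dense  = corr(rcn(x, 0.0), rcn(x_related, 0.0))
gen_sparse = corr(rcn(x, 1.5), rcn(x_noisy, 1.5))
dis_sparse = corr(rcn(x, 1.5), rcn(x_related, 1.5))
# sparsifying the code decorrelates related inputs (better discrimination),
# while the similarity across noisy versions of the same input degrades more slowly
```

In this sketch, raising the threshold pushes the correlation between related inputs down faster than the correlation between noisy versions of the same input, which is the generalization-discrimination tradeoff the coding level biases.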
