A critical analysis of the conflict between Classical and Connectionist theories of Cognition.
Professor John Vervaeke
Date: November 26th, 1998
When a self-proclaimed radical eliminativist like Paul Churchland admits (however grudgingly) that there exist many “historical cases of successful inter-theoretic reduction” (Churchland, 1988, p. 43), certainly we of more moderate nature should remain open to the proposal, in any context. Yet to an alarming degree, the key figures of the Connectionism/Classicism debate have disregarded such considerations, with each side claiming full title to the one true answer. The Classicists, led by Fodor and Pylyshyn, make sweeping claims about how “Connectionist cognitive architectures cannot … support productive cognitive capacities” (Fodor & Pylyshyn, 1988, p. 35). Nor is the fault wholly theirs, with Connectionists like Ramsey, Stich and Garon using Connectionism as a justification for the elimination of ‘folk psychology’ (see Ramsey et al, 1990). This antagonism, which I refer to as mutual eliminativism, has created a deep trench in the surrounding fields of cognitive science, artificial intelligence and philosophy of mind, one that impairs our ability to progress in a meaningful way towards an adequate theory (and perhaps mechanisation) of thought.
The purpose of this paper is to examine the arguments made by each group, in light of a new hypothesis or perspective on the issue; specifically, with an eye to the possibility that the two theories are not mutually exclusive, but rather, complementary. This argument will be taken largely from the notion of levels of analysis that has so frequently emerged in the literature on this topic. First, however, each half of the mutual eliminativism argument must be elucidated, and refuted. It is important to understand that the goal of this treatment is not to determine the validity of Classicism, nor of Connectionism, but rather, to demonstrate that they are not mutually exclusive theories.
The Classicist argument, as previously observed, has been led for the past several years by the work of Jerry Fodor and Zenon Pylyshyn. They propose a three-part criticism of Connectionist networks, based largely on their (perceived) inability to conform with the Language of Thought Hypothesis put forward by Fodor more than a decade earlier (Fodor, 1976). The argument stems from the observation that thought, like natural language, seems to possess certain semantic properties, namely productivity and systematicity, and further that thought, at least, seems also to possess inferential coherence (Fodor and Pylyshyn, 1988, p. 33).
The first of these, productivity, describes the observation that just as English sentences, for example, can express an unbounded number of concepts (for a formal argument to this effect, see Chomsky, 1968), so too can thoughts represent an unbounded number of propositions. The argument that a Connectionist network cannot support the productivity attribute of thought is as follows:
(1) Sentences (or Thoughts) can be arbitrarily long, by being built up of recursively defined sub-units.
(2) (1) implies that there are arbitrarily many non-atomic expressions.
(3) For (2) to be valid, the mind must be composed of a symbol system, with semantic relationships between concepts, or symbols.
(4) Connectionist networks do not incorporate semantic relations between nodes representing different concepts, only causal relations. Thus they do not satisfy the criteria of (3) and therefore cannot be productive in nature. (Adapted from Fodor and Pylyshyn, 1988, pp. 33-36.)
The key to the argument’s undoing lies in the premise of (4), stated but unproven, that Connectionist networks incorporate only causal, and never semantic, relations between nodes. That premise is actually only applicable to a subset of Connectionist networks, known as localist networks (for elaboration of the localist/distributed distinction, see Ramsey, Stich and Garon, 1990, p. 510). Fodor and Pylyshyn acknowledge this, in fact, in a footnote:
To simplify the exposition, we assume a ‘localist’ approach, in which each semantically interpreted node corresponds to a single Connectionist unit; but nothing relevant to this discussion is changed if these nodes actually consist of patterns over a cluster of units. (Fodor and Pylyshyn, 1988, p. 15)
Therein lies the flaw of the argument. The argument is substantially different if we assume a truly distributed net instead of a localist one. In a localist network, where each node is labelled to correspond to a single semantic atom, it is entirely true that the only relation it can have to other nodes is a numerical (i.e. causal, non-semantic) one. In a distributed net, however, varying activation levels of the same nodes (not discrete clusters, as Fodor and Pylyshyn seem to think) can represent different semantic concepts. Furthermore, these states can be conceptually combined, subtracted from one another, or modified by a third state expressing a semantic relation between the two of them (for more on the combination of distributed states, see McClelland, Rumelhart, and Hinton, 1986, especially pp. 37-39). While their argument does indeed hold for a network where all semantic content is localised to individual nodes, it fails to generalise to Connectionist networks as a whole, and consequently fails as an argument against them.
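The localist/distributed contrast can be made concrete with a toy sketch. It is not drawn from any of the cited models: the four units, their informal "micro-feature" reading, and the activation values are all assumptions invented for illustration. It shows only that patterns over the same units can overlap, be superimposed, and be subtracted, whereas localist codes for distinct concepts share nothing.

```python
# Toy illustration only: unit count, activation values and concept names
# are invented for this sketch and are taken from no cited model.

def localist(index, size):
    """Localist code: one dedicated unit per concept (a one-hot pattern)."""
    return [1.0 if i == index else 0.0 for i in range(size)]

def overlap(u, v):
    """Dot product of two activation patterns over the same units."""
    return sum(a * b for a, b in zip(u, v))

def combine(u, v):
    """Superimpose two patterns on the same set of units."""
    return [a + b for a, b in zip(u, v)]

def subtract(u, v):
    """Unit-by-unit difference between two patterns."""
    return [a - b for a, b in zip(u, v)]

# Localist coding: 'cat' and 'dog' occupy different units and share nothing.
loc_cat, loc_dog = localist(0, 4), localist(1, 4)

# Distributed coding: each concept is a pattern over the SAME four units
# (hypothetical reading of the units: furry, barks, purrs, has_scales).
dist_cat = [1.0, 0.0, 1.0, 0.0]
dist_dog = [1.0, 1.0, 0.0, 0.0]
dist_fish = [0.0, 0.0, 0.0, 1.0]

localist_overlap = overlap(loc_cat, loc_dog)    # no shared structure
cat_dog_overlap = overlap(dist_cat, dist_dog)   # share the first unit
cat_fish_overlap = overlap(dist_cat, dist_fish)
blend = combine(dist_cat, dist_dog)             # superimposed state
difference = subtract(dist_cat, dist_dog)       # what separates cat from dog
```

Nothing here settles whether such graded relations amount to semantic relations in Fodor and Pylyshyn's sense; the sketch only shows that the relations available between distributed states are richer than a single connection weight between two labelled nodes.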
The second argument, from systematicity and compositionality, has a very similar flavour. As Fodor and Pylyshyn put it,
What we mean when we say that linguistic capacities are systematic is that the ability to produce/understand some sentences is intrinsically connected to the ability to produce/understand certain others. (Fodor and Pylyshyn, 1988, p.37)
More prosaically, this refers to the observation, reiterated by many of the Classicist faction, that the ability to understand the phrase “John loves Mary” intrinsically implies the ability to understand the phrase “Mary loves John.” What examinations of this type demonstrate, argue Fodor and Pylyshyn, is that thought is composed of semantically discrete concepts and relations between concepts. Further, having a representation of a given semantic relation R implies that it can be used not only in the form aRb, but also in bRa, or cRd, as long as a, b, c, and d are all well-represented concepts in the mind. The contention as to Connectionism is that, since relations between nodes are represented only through a numerical connection strength between two atomic concepts, it is possible to create a Connectionist network that could understand aRb without the ability to understand bRa. This, they argue, contradicts what we know of how thought works; consequently, Connectionism cannot explain cognition (at least, in this respect).
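The systematicity property itself can be stated as a small sketch. The names `thinkable`, `is_systematic` and `punctate` are hypothetical, chosen only for this illustration, and the code is not a model of any cited system. It shows that a system which builds propositions from separately represented concepts and relations gets bRa for free once it has aRb, while a system that stores whole propositions one at a time need not.

```python
from itertools import permutations

def thinkable(concepts, relations):
    """Every proposition (a, R, b) that can be built from the given parts."""
    return {(a, r, b) for r in relations for a, b in permutations(concepts, 2)}

def is_systematic(propositions):
    """True if having aRb always comes with having bRa."""
    return all((b, r, a) in propositions for (a, r, b) in propositions)

# A compositional system: propositions are assembled from constituents.
classical = thinkable({"John", "Mary"}, {"loves"})

# A hypothetical system that stores each proposition as an unanalysed whole.
punctate = {("John", "loves", "Mary")}

classical_ok = is_systematic(classical)
punctate_ok = is_systematic(punctate)
```

Here `classical_ok` is true and `punctate_ok` is false. The sketch describes the property Fodor and Pylyshyn demand; it takes no side on which architectures can supply it.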
The defence, however dull, is a simple one. Fodor and Pylyshyn make the same localist assumption here that they made in trying to argue from productivity. Their argument that such a contradictory Connectionist net could be created is based on their assumption that semantic relations can only be expressed through a weighted connection between two localised nodes of semantic content. As has already been demonstrated however, even the most infallible argument from that premise is a failure as an attack on Connectionism as a whole, since it fails to scale up to all Connectionist networks.
The final argument is perhaps the simplest to disprove, since there is now empirical data to contradict it. Fodor and Pylyshyn’s third claim is that human thought possesses inferential coherence and neural networks do not. In a way, this is a sort of subset of the systematicity argument, which says that our inferential mechanisms (as a sort of relation of the kind described in the previous paragraphs) must remain consistent. To use Fodor and Pylyshyn’s example, our ability to infer P from P&Q&R should necessarily imply our ability to infer P&Q, or Q, or R from the same premises; that is, our inferential mechanisms should be consistent, or coherent (Fodor and Pylyshyn, 1988, pp. 46-48). The argument is that while you could create a neural net that possessed inferential coherence, you could equally well create one that lacked it.
Unfortunately, their argument on this point is doubly crippled. First, from a theoretical standpoint, their argument is once again subject to the assumption that individual inferential units (i.e. P, P&Q, P&Q&R) are embodied in individual nodes. It should be noted now, after repeatedly seeing this pattern, that Fodor and Pylyshyn have indeed presented a very strong case against localist Connectionism, but again, their argument here fails to scale up. The second difficulty comes from the network described in Ramsey, Stich and Garon, which, when given certain physical information about cats and dogs and fish, correctly inferred that cats have legs, but not scales, without having been exposed to that statement ahead of time (Ramsey, Stich and Garon, 1990, p. 516). In other words, that network possesses inferential coherence.
In short, the Classical position presents some persuasive descriptions of mental processes, ones which I personally believe to be valid. What the Classicists fail to do, however, is successfully refute the Connectionist hypothesis. They have provided a strong argument for the case that Classical theory is a necessary component of an eventual theory of cognition; what they have not demonstrated is that it is sufficient, on its own, for such a theory.
The mutual eliminativism debate is actually somewhat slanted. While there are some Connectionists who see no place for Classical theory, a large body of Connectionist literature is somewhat more defensive. Rather than outwardly attacking the Classical view, its authors are preoccupied with the (often monumental) task of defending their views from the onslaughts brought on by philosophers like Fodor. Still, any movement has its fanatics, and in considering this half of the issue, I will look principally at the work of Ramsey et al, who propose the conditional that if Connectionism turns out to be valid, then we ought to take an eliminativist stance towards ‘folk’ theories of human psychology and thought. The treatment will be considerably shorter than that of Fodor and Pylyshyn’s work, partly because Ramsey et al’s claim is weaker, and partly because my reply is simpler.
What Ramsey et al argue, essentially, is that if they can demonstrate that Connectionist nets operate without employing Classical concepts about discrete representation and propositional attitudes, and if it ends up being the case that Connectionist theory is correct, then we can do away with the Classical notions of thought, insofar as they not only contribute nothing, but are actually wrong. As they themselves comment:
[M]erely showing that a theory in which a class of entities plays a role is inferior to a successor theory plainly is not sufficient to show that the entities do not exist. Often a more appropriate conclusion is that the rejected theory was wrong, perhaps seriously wrong, about some of the properties of the entities in its domain, or about the laws governing those entities… (Ramsey, Stich, and Garon, 1990, p. 501).
The argument begins by outlining what they feel is an adequate description of folk psychological theory. Specifically, they identify three properties of propositional (i.e. mental) states that they feel are essential to folk psychology, namely, that they are functionally discrete, semantically interpreted states with causal relations to other propositional states (Ramsey, Stich, and Garon, 1990, p. 504). Functional discreteness describes the relatively comfortable hypothesis that we can lose or forget one propositional attitude (e.g. “My keys are in the kitchen”) without disturbing the rest of our attitudes; they are separate from one another. Semantic interpretability is exactly what it sounds like: the statement that thoughts possess meaning, that the propositional attitudes refer to actual concepts or objects. Causal relation describes the fact that our beliefs and desires can interact with, create, or alter other beliefs and desires. This is apparent to anyone who, upon observing (i.e. forming the attitude) that “the cat has walked out of the room,” suddenly finds their belief that “the cat is in the room” significantly altered.
The argument for the elimination of Classical architecture comes from their claim that they can design a neural network that performs the cognitive tasks associated with humans (in a drastically restricted problem space) while being incompatible with the three tenets of folk psychology previously identified. The crux of this argument actually comes down to the second property, semantic interpretability. Because their network is designed as a distributed, rather than localist, net, the claim is that no semantic interpretation of it is “comfortable” (Ramsey, Stich, and Garon, 1990, p. 508).
Their subsequent demonstrations are everything they claim to be: the network does perform learning and reasoning tasks, and, if you accept their premise, it does so without appeal to semantic interpretability. So where, then, lies the counter-argument?
It may already have become apparent that the counter to this proposal is very similar, almost identical, in fact, to Dennett’s instrumentalism (Dennett, 1987). That is, the Classical theory of cognition still works perfectly well and there is consequently no reason for rejecting it. Referring back to Ramsey et al’s comment, a theory cannot be rejected only because the successor is more accurate; it must be materially, demonstrably, wrong. While Ramsey et al contend that they can perform the same cognitive tasks without a semantically interpretable system, this demonstration does not adequately contest either the predictive or the explanatory power of the Classical theory.
What these Connectionist arguments from example demonstrate is that Connectionism is, prima facie, a viable mechanism for cognition. There is not much argument, even from Classicists like Fodor, that neural nets of the kind described can and do perform many cognitive tasks. While there is argument as to whether they can be adequately scaled up to handle human-level cognition, the fact is that Connectionists have proposed a feasible mechanism for cognition. What they have yet to demonstrate is that they have a monopoly on cognitive processes.
Having examined the failings of both philosophical extremes, it seems only natural to attempt some resolution through combination. It is important to reiterate that what has been refuted thus far is not the validity of either school of thought, but the validity of the argument that, in each case, the other school must necessarily be wrong. Given this, what we would ideally like to describe is a metatheory of cognition that had, as different components of its explanatory power, the separate, but compatible notions of Classical and Connectionist cognition.
To develop this theory, I will make use of a highly over-used and often misconstrued concept known as levels of analysis. To avoid ambiguity and misinterpretation, let me say that I am using this term precisely as it was laid out by Marr, his interpretation being both unambiguous and best suited to our purposes here (Marr, 1982). Marr defines three levels of analysis at which a machine can be understood (particularly a machine carrying out information processing tasks):
1. Computational Theory: What is the goal of the computation, why is it appropriate, and what is the logic of the strategy by which it can be carried out?
2. Representation and Algorithm: How can this computational theory be implemented? In particular, what is the representation for the input and output, and what is the algorithm for the transformation?
3. Hardware Implementation: How can the representation and algorithm be realised physically?
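To see how the first two levels can come apart, consider a toy sketch. The choice of sorting as the task is my own assumption (it is not Marr's example), and the function names are hypothetical. A single computational-level specification, `meets_spec`, is satisfied by two different algorithmic-level procedures; the hardware level is simply whatever machine happens to run them and is not modelled here.

```python
from collections import Counter

def meets_spec(inp, out):
    """Computational level: WHAT is computed.
    The output must be an ordered rearrangement of the input."""
    same_items = Counter(inp) == Counter(out)
    ordered = all(x <= y for x, y in zip(out, out[1:]))
    return same_items and ordered

def insertion_sort(items):
    """Algorithmic level, option 1: grow a sorted list one item at a time."""
    result = []
    for item in items:
        i = len(result)
        while i > 0 and result[i - 1] > item:
            i -= 1
        result.insert(i, item)
    return result

def merge_sort(items):
    """Algorithmic level, option 2: split, sort the halves, merge them."""
    if len(items) <= 1:
        return list(items)
    mid = len(items) // 2
    left, right = merge_sort(items[:mid]), merge_sort(items[mid:])
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    return merged + left[i:] + right[j:]

data = [3, 1, 2, 1]
by_insertion = insertion_sort(data)
by_merge = merge_sort(data)
```

Both procedures satisfy `meets_spec`, yet neither is "the" correct description of the computation to the exclusion of the other, nor of the specification itself. A description at one level does not compete with a description at another.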
What we must do now, to develop the metatheory, is outline a system under which both theories of cognition can co-exist.
Let us take Classical theory first, as its position is most obvious. No Classical theorist will dispute, I think, the statement that the mechanisms proposed for Classical models are not at an implementation level; there is no hypothesis as to how such things as semantic relations or thought-sentences are represented in the human brain, and while higher-order computer languages allow for such representations, few theorists would claim that our mechanisms of thought work on the same principles as LISP or PROLOG. Nor, though it may be a slightly less obvious transition, does Classical theory fit at the algorithmic level. While it’s true that Classical theory deals quite explicitly with representation, it does not consider implementation of its theories in a mechanical context; it cannot do so, in fact, until it can mechanise meaning. No, the true place for a Classical theory is at the Computational Theory level. What Classical arguments describe are principles of cognition: productivity, systematicity, inference. They posit answers to the questions of “why is the logic of Classical strategy justified?” and “what is the appropriate viewpoint on the structure of thought?” Classical theory provides a fairly high-level description of cognition, with little consideration for the more technical or implementation aspects.
Connectionism, on the other hand, provides a wonderful counterpart to Classicism, in that the mechanisms it investigates, and the theories it supports, are ones based on the specifics of implementation. Whereas Classical theorists have a large bulk of literature and debate, Connectionists have actual, working models of certain basic cognitive tasks. The question as to which level, 2 or 3, Connectionism falls under has a great deal to do with the type of Connectionism being discussed. Smolensky’s PTC nets are obviously at an algorithmic level, as he takes pains to distance himself from actual, neural-level representation (Smolensky, 1988, p. 8). Others, like Ramsey et al, examine the issue from a more physical, implementation level, but in both cases, the results are compatible with higher-level, Classicist descriptions. In fact, Fodor and Pylyshyn don’t deny this; they support the possibility of Connectionism as an implementation of Classical ideas (Fodor and Pylyshyn, 1988, pp. 64-66).
Just as physicists don’t describe a planet’s orbital velocity in terms of quantum-level fluctuations, and just as medical doctors can effectively treat a bullet wound without appealing to cell theory, description of cognition can happen independently, and correctly, on multiple levels. There does not have to be an absolute truth of cognition. As this paper has shown, attempts to attack the validity of one level of analysis from the standpoint of the other are futile: they do not demonstrate that one theory is right and the other wrong, they only serve to demonstrate that the two operate at different conceptual levels. In fact, it is arguable, as it has been argued here, that both theories are saying the same thing, merely in different language.
Chomsky, N. Language and Mind. New York: Harcourt, Brace and World, 1968.
Churchland, Paul M. Matter and Consciousness, Revised Edition. Cambridge, MA: A Bradford Book, The MIT Press, 1988.
Dennett, Daniel C. ‘True Believers’, in The Intentional Stance. Cambridge, MA: A Bradford Book, The MIT Press, 1987. 13-35.
Fodor, J. The Language of Thought. Sussex: Harvester Press, 1976.
Fodor, J. and Z. W. Pylyshyn: 1988, ‘Connectionism and Cognitive Architecture: A Critical Analysis’, Cognition 28, 3-71.
Marr, D. Vision. San Francisco: W. H. Freeman, 1982.
McClelland, J. L., D. E. Rumelhart and G. E. Hinton: 1986, ‘The Appeal of Parallel Distributed Processing’, in Rumelhart and McClelland (1986a), pp. 3-44.
Ramsey, W., S. Stich, and J. Garon: 1990, ‘Connectionism, Eliminativism, and the Future of Folk Psychology’, in J. Tomberlin (ed.), Philosophical Perspectives, Vol. 4, Ridgeview, Atascadero, California, pp. 499-533.
Rumelhart, D. E., and J. L. McClelland eds. Parallel Distributed Processing, 2 vols., Cambridge, MA: MIT Press, 1986a.
Smolensky, P.: 1988, ‘On the Proper Treatment of Connectionism’, Behavioral and Brain Sciences 11, 1-74.
 Actually, this is an exaggeration. There are some modern theorists, mostly Connectionist, who support a compatibility hypothesis; see, for example, Rumelhart et al, 1986.
 In Fodor and Pylyshyn, 1988, the argument is actually presented in four parts: productivity, systematicity, compositionality and inferential coherence. However, Fodor and Pylyshyn themselves acknowledge that compositionality is really subsumed by systematicity, making the argument (effectively) three-part.
 As Fodor, 1976, points out, this is necessary, since the expression of each sentence must correlate with the speaker having a corresponding thought.
 Including, of course, such ‘folk’ notions as Fodor’s Language of Thought.
 Being conditional, not absolute.
 The arguments offered in support of this thesis are, in fact must be, relatively obvious, since the theory they are presenting for analysis is a “common sense” theory of psychology. Moreover, they are outside the scope of this paper.
 Though, of course, this is not to say that we can only form propositional attitudes about ‘real’ or ‘true’ concepts and relations. Elves riding unicorns are perfectly permissible.
 This is a dubious claim. While they consider several objections, they are unduly dismissive, talking about how it seems “highly improbable” that a discrete semantic representation could be found. A full treatment of their objections, however, is beyond the scope of this paper.
 And thereby solve the problem of mechanical reasoning, not an easy task.