ACQUISITION OF LANGUAGE.

The March 14, 1997 issue of Science has a series of articles on recent advances in psychology, including an article on the acquisition of language by children, and the comprehension and production of language by adults and children. The article is by Alan Prince and Paul Smolensky and is entitled “Optimality: From Neural Networks to Universal Grammar”.

Neural networks are computer designs which can learn. Between the input units and the output units are placed intermediate units with variable states. The computer network is first “trained” with some problems whose answers are known, thus producing an initial adjustment of the variable units. This serves as an approximation when the unknown problems are later introduced. The network is given feedback as to the correctness of its output, enabling it to make further adjustments, until it can produce mainly correct answers.
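The training loop just described can be sketched in a few lines of Python. This is an illustration only, not taken from the article, and it makes simplifying assumptions: a single trainable unit stands in for the whole network (there is no intermediate layer), the task is the logical AND of two inputs, and the names `train`, `predict` and `AND_EXAMPLES` are invented for the example.

```python
# Minimal sketch (assumption: one trainable unit, no intermediate layer).
# The unit is shown problems with known answers and nudges its weights
# whenever the feedback says its output was wrong.

def predict(weights, bias, x1, x2):
    """Return 1 if the weighted input exceeds the threshold, else 0."""
    return 1 if weights[0] * x1 + weights[1] * x2 + bias > 0 else 0

def train(examples, epochs=20, rate=1):
    """Adjust weights by error feedback over repeated passes."""
    weights = [0, 0]
    bias = 0
    for _ in range(epochs):
        for (x1, x2), target in examples:
            error = target - predict(weights, bias, x1, x2)  # feedback
            weights[0] += rate * error * x1
            weights[1] += rate * error * x2
            bias += rate * error
    return weights, bias

# Training problems whose answers are known: logical AND.
AND_EXAMPLES = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

weights, bias = train(AND_EXAMPLES)
results = [predict(weights, bias, x1, x2) for (x1, x2), _ in AND_EXAMPLES]
print(results)
```

After a handful of passes the unit stops making errors on these four problems, which is the "mainly correct answers" stage of the description above. Networks with intermediate units follow the same feedback idea but need a more elaborate adjustment rule.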

Neural networks were deliberately designed to simulate what were thought to be the cognitive mechanisms in the human brain. The usual Von Neumann serial computer design was thought not to resemble brain mechanisms closely, since brain mechanisms are “massively parallel” rather than serial. It is interesting that Prince and Smolensky now try to apply neural network mechanisms to help us understand what happens in the brain when we learn, comprehend, and produce language.

A basic model in linguistic theory has been that of Noam Chomsky, who postulated a “universal grammar” (principles governing word order and the like) valid for all the thousands of human languages. This universal grammar must therefore be innate in humans, he thought, i.e. genetically based. This has been questioned by subsequent writers. Most concede that there is a genetic component to the acquisition of human language, but perhaps it is not exactly the component postulated by Chomsky.

The basic idea of Prince and Smolensky is that children approach their knowledge of how to use language incrementally, just as neural computer networks do, learning by trial and error; the errors are pointed out by adults, and correct usage is reinforced by adult approval. This is called “optimization” in neural networks; it is a gradual approach, through successive approximations, to the best compromise between various contradictions. These contradictions arise when certain rules clash with one another, and some “meta-rules” are then needed to resolve the conflicts. This optimization procedure then results in the “universal grammar”. The rules are the same in all languages, but the meta-rules may differ from one language to another.

The meta-rules (my term, not used in the article) take the form of “strict dominance” (the authors’ term). This means the following: when two principles (or rules) clash, one of them becomes dominant. The meta-rule of strict dominance then prescribes that no amount of success of the weaker rule wins over the failure of the dominant rule.
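One way to see what strict dominance amounts to is to contrast it with ordinary weighing-up. The Python sketch below is an illustration only, not taken from the article; the candidate labels, rule names, violation counts and numerical weights are all invented. Under strict dominance the rules are compared in rank order, and a weaker rule is consulted only to break a tie on the dominant one. Under a weighted sum, enough violations of the weaker rule can outweigh one violation of the dominant rule.

```python
# Illustrative sketch: strict dominance versus a weighted sum.
# All names and numbers are hypothetical.

def strict_dominance_winner(candidates, ranking):
    """Compare violation counts rule by rule, dominant rule first."""
    def profile(name):
        return tuple(candidates[name].get(rule, 0) for rule in ranking)
    # Python compares tuples position by position, left to right.
    return min(candidates, key=profile)

def weighted_sum_winner(candidates, weights):
    """Add up weighted violations; the lowest total cost wins."""
    def cost(name):
        return sum(weights[rule] * n for rule, n in candidates[name].items())
    return min(candidates, key=cost)

# Candidate "x" fails the dominant rule once; candidate "y" satisfies it
# but fails the weaker rule five times.
CANDIDATES = {
    "x": {"dominant": 1, "weaker": 0},
    "y": {"dominant": 0, "weaker": 5},
}

strict = strict_dominance_winner(CANDIDATES, ["dominant", "weaker"])
weighted = weighted_sum_winner(CANDIDATES, {"dominant": 3, "weaker": 1})
print(strict, weighted)
```

Here strict dominance selects "y" however many times it fails the weaker rule, whereas the weighted sum (with these particular weights) selects "x".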

The usual clash occurs between “structure” (certain constraints of word-sequences in a well-formed sentence, e.g. subject-predicate-object) and “faithfulness” (favouring exact replication of input even at the cost of structural complexity). Different languages differ about which principle should be dominant, structure or faithfulness. (Concrete examples are needed here.)
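As a toy illustration of how the ranking decides the outcome, consider the following Python sketch. It is not taken from the article, and it uses sound structure rather than word order because that makes the smallest example. The assumptions are: a hypothetical input word "pat", a hypothetical structure rule that penalizes a word-final consonant, and a faithfulness rule that penalizes deleting a sound from the input. The names `evaluate`, `NoFinalConsonant` and `Faithfulness` are invented for the example.

```python
# Toy illustration (all forms and rule names are hypothetical).
# Two candidate pronunciations of the input "pat":
#   "pat" keeps every sound but ends in a consonant;
#   "pa"  drops the final consonant, so it is not faithful to the input.
CANDIDATES = {
    "pat": {"NoFinalConsonant": 1, "Faithfulness": 0},
    "pa":  {"NoFinalConsonant": 0, "Faithfulness": 1},
}

def evaluate(candidates, ranking):
    """Return the candidate that does best on the highest-ranked rule,
    using lower-ranked rules only to break ties."""
    return min(candidates,
               key=lambda form: [candidates[form][rule] for rule in ranking])

# "Language A" ranks faithfulness over structure.
language_a = evaluate(CANDIDATES, ["Faithfulness", "NoFinalConsonant"])

# "Language B" ranks structure over faithfulness.
language_b = evaluate(CANDIDATES, ["NoFinalConsonant", "Faithfulness"])

print(language_a, language_b)
```

The two rules are identical in both hypothetical languages; only their order differs, and that alone changes which form counts as well-formed.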

The best balance between the demands of various constraints is attained in neural networks by adjusting the “weights” in the intermediate (variable) layer of units to a so-called “harmony pattern”, which in language formation is equivalent to a “well-formed sentence” in a particular language.

Optimality theory is a higher-level theory, like thermodynamics. Neural computation is like the underlying lower-level theory of statistical mechanics. The lower-level theory provides the mechanism which makes the higher-level theory valid.

Comprehension (interpretation) of language differs from language production. The former is much easier, and provides the initial “training” from which the child (like the computer network) derives the original “weights” in the intermediate layers. A child learns to speak by being spoken to, a fact that has long been known. For a time, children comprehend much more language than they can produce. The necessary feedback on language production then comes from operant conditioning: the approval or disapproval of adult listeners.

So what is innate and what is learned in language acquisition? Possibly the child’s brain contains some structure analogous to a neural (massively parallel) computer. This certainly is not too far-fetched, since neural computers were deliberately designed to simulate the brain. But the training and the final harmonization and optimization of language production are learned from interaction with adults.

In early child language production, “structure” constraints tend to predominate over “faithfulness” constraints: e.g. words are pronounced differently, but rules of sentence structure are observed. This is what had impressed Chomsky so much. But “improving” grammar means only re-ranking the constraints.
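The idea that improvement is "only re-ranking" can be sketched as follows. This Python fragment is an illustration only, not taken from the article. It is a much-simplified version of the constraint-demotion idea found in the Optimality Theory learning literature, and the forms, rule names and the function `demote` are hypothetical. When the adult form differs from the child's own output, every rule that penalizes the adult form more than the child's form is moved down, to just below the highest-ranked rule that favours the adult form.

```python
# Simplified sketch of learning by re-ranking (hypothetical names and data).

def demote(ranking, adult, child):
    """Return a new ranking in which rules that penalize the adult form
    more than the child's form sit below the top rule favouring the adult."""
    favours_adult = [r for r in ranking if child.get(r, 0) > adult.get(r, 0)]
    if not favours_adult:
        return list(ranking)  # nothing to learn from this pair
    pivot = favours_adult[0]
    above_pivot = ranking[:ranking.index(pivot)]
    to_demote = [r for r in above_pivot if adult.get(r, 0) > child.get(r, 0)]
    kept = [r for r in ranking if r not in to_demote]
    cut = kept.index(pivot) + 1
    return kept[:cut] + to_demote + kept[cut:]

# The child's early ranking puts structure first and yields "pa";
# adults say "pat", which ends in a consonant but is faithful.
child_ranking = ["NoFinalConsonant", "Faithfulness"]
adult_form = {"NoFinalConsonant": 1, "Faithfulness": 0}   # "pat"
child_form = {"NoFinalConsonant": 0, "Faithfulness": 1}   # "pa"

new_ranking = demote(child_ranking, adult_form, child_form)
print(new_ranking)
```

No rule is added or removed in this step; the set of rules stays fixed and only their order changes.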

More advanced properties of linguistic structures, such as the nesting of syntactic phrases (subordinate clauses etc.), are learned much later, again by successive approximations and adult corrections. Syntax learning is later supplemented (and sometimes first introduced) in formal school classes.

I would like to add that, to me, grammatical-syntactic structures are structures of logic comparable in their beauty to structures in mathematics. It offends my sense of harmony to read a sentence without a verb, or a complex sentence ending with a subordinate clause without closure of the main clause. It has the feel of a melody without the closure of a final chord, or a mathematical proof without the final “ergo” statement. Omitting the teaching of grammar in primary schools is a profound mistake.

Hanna Newcombe
