Research Project Name

A Cognitively Plausible Model of Child Language Learning: Categories, Agreement, and Morphology

Objective

The objective of this research project is to create a computational model of early child language that obeys cognitive constraints on first language learning. The input to the model is drawn from Roger Brown's corpus of recorded child-mother interactions. The learning achieved by the model is compared with the learning achieved by the children as documented in the Brown Corpus.

Results

CAM (Categories, Agreement, and Morphology) is a computational model of several important aspects of language acquisition. CAM is based on the Semantic Bootstrapping Hypothesis of Steven Pinker. CAM respects widely accepted psychological constraints such as no negative evidence and no memory of previous inputs. CAM learns in a largely bottom-up manner, learning categories (in part) first, then context-free grammar rules based on these categories, and finally agreement rules on top of the context-free grammar rules.
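As a rough illustration of this bottom-up staging (a sketch only, not CAM's actual implementation; all names below are hypothetical), a single input pass might update a grammar in three layers:

    from dataclasses import dataclass, field

    # Semantic seeds for bootstrapping: meanings of these kinds suggest categories.
    SEMANTIC_SEEDS = {"object": "N", "action": "V", "attribute": "A"}

    @dataclass
    class Grammar:
        categories: dict = field(default_factory=dict)  # word -> syntactic category
        cfg_rules: set = field(default_factory=set)     # context-free rules over categories
        agreement: set = field(default_factory=set)     # agreement constraints on the rules

    def process_utterance(grammar, words, semantic_roles):
        """Update the grammar from one (utterance, meaning) pair; no other
        memory of previous inputs is kept."""
        # Stage 1: semantic bootstrapping assigns categories to new words.
        for word, role in zip(words, semantic_roles):
            if word not in grammar.categories and role in SEMANTIC_SEEDS:
                grammar.categories[word] = SEMANTIC_SEEDS[role]

        # Stage 2: once every word is categorized, record a (flat) CFG rule.
        if all(w in grammar.categories for w in words):
            grammar.cfg_rules.add(("S", tuple(grammar.categories[w] for w in words)))

        # Stage 3: agreement learning would run here, on top of the CFG rules
        # (see the negative-evidence sketch below).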

CAM reproduces the partial ordering of acquisition that Brown observed in children for the progressive, the plural, the third-person regular, and the auxiliary verbs.
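Such an ordering can be stated and checked as a set of before/after pairs; the pairs below are placeholders for illustration only, and Brown (1973) should be consulted for the actual relations:

    # Illustrative partial order: each pair (a, b) means a is acquired before b.
    # These specific pairs are placeholders, not Brown's reported data.
    PARTIAL_ORDER = {
        ("progressive", "third_person_regular"),
        ("plural", "third_person_regular"),
        ("third_person_regular", "auxiliary"),
    }

    def respects_partial_order(acquisition_sequence, pairs=PARTIAL_ORDER):
        """True if every (earlier, later) pair appears in that relative order."""
        rank = {m: i for i, m in enumerate(acquisition_sequence)}
        return all(rank[a] < rank[b] for a, b in pairs if a in rank and b in rank)

    print(respects_partial_order(
        ["progressive", "plural", "third_person_regular", "auxiliary"]))  # True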

CAM solves the negative evidence problem for agreement rule learning: though it receives no negative evidence in the input, it nonetheless supplies both positive and (internally generated) negative examples to its built-in Boolean learning algorithm, which constructs the agreement rules.
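One way to picture the internal generation of negatives (a sketch under the assumption of a small, explicit feature space; not a description of CAM's actual mechanism) is to treat attested feature combinations as positives and the unattested combinations over the same features as negatives for the Boolean learner:

    from itertools import product

    FEATURE_VALUES = ("sg", "pl")   # illustrative number feature for subject and verb

    def generate_examples(observed_combinations):
        """Observed (subject, verb) feature pairs are positives; the remaining
        combinations over the same feature space are internally generated negatives."""
        positives = set(observed_combinations)
        negatives = set(product(FEATURE_VALUES, repeat=2)) - positives
        return positives, negatives

    def learn_agreement_rule(positives, negatives):
        """Toy Boolean learner: accept the hypothesis 'features must match'
        only if it separates the positives from the negatives."""
        match = lambda pair: pair[0] == pair[1]
        if all(match(p) for p in positives) and not any(match(n) for n in negatives):
            return "subject_number == verb_number"
        return None

    pos, neg = generate_examples({("sg", "sg"), ("pl", "pl")})
    print(learn_agreement_rule(pos, neg))   # subject_number == verb_number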

CAM learns parts of both English and Cheyenne, a morphologically rich American Indian language. Procedures for syntactic category inference are described, along with an approach to integrating semantic bootstrapping and syntax-driven syntactic category inference into a single system.
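A minimal sketch of how the two inference routes could be combined (hypothetical names and rule formats, not CAM's): semantics categorizes words whose meanings are known, and existing CFG rules categorize a remaining unknown word by the slot it fills:

    def infer_categories(words, semantic_roles, lexicon, cfg_rules, semantic_seeds):
        """Assign categories to the words of one utterance, updating `lexicon`."""
        # Semantic bootstrapping: meaning-derived categories for new words.
        for word, role in zip(words, semantic_roles):
            if word not in lexicon and role in semantic_seeds:
                lexicon[word] = semantic_seeds[role]

        # Syntax-driven inference: if exactly one word is still uncategorized and
        # some rule's right-hand side fits the rest, read its category off the rule.
        unknown = [w for w in words if w not in lexicon]
        if len(unknown) == 1:
            gap = words.index(unknown[0])
            for _lhs, rhs in cfg_rules:
                if len(rhs) == len(words) and all(
                    i == gap or lexicon.get(w) == rhs[i] for i, w in enumerate(words)
                ):
                    lexicon[unknown[0]] = rhs[gap]
                    break

        return {w: lexicon[w] for w in words if w in lexicon}

    lex = {"doggie": "N", "runs": "V"}
    print(infer_categories(["the", "doggie", "runs"], [None, "object", "action"],
                           lex, {("S", ("D", "N", "V"))}, {"object": "N", "action": "V"}))
    # {'the': 'D', 'doggie': 'N', 'runs': 'V'}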

CAM shows how the form of X-bar Theory is influenced by acquisition, parsability, and syntactic category inference. Finally, the full range of grammars learnable by CAM is characterized in precise mathematical detail. Three results are shown: (1) correctness, CAM's ability to identify a target grammar from inputs based on that grammar; (2) order invariance, the same grammar being learned regardless of the order of the inputs; and (3) robustness, the ability to correctly learn a target language from a vastly more complex input language.
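Although these results are established formally, order invariance in particular can also be checked empirically by re-running the learner on shuffled presentations of the same input; a minimal sketch, with learn_grammar standing in for CAM's (here hypothetical) learning procedure:

    import random

    def order_invariant(inputs, learn_grammar, trials=5, seed=0):
        """Return True if every shuffled presentation of `inputs` yields the
        same final grammar as the original presentation order."""
        rng = random.Random(seed)
        reference = learn_grammar(list(inputs))
        for _ in range(trials):
            shuffled = list(inputs)
            rng.shuffle(shuffled)
            if learn_grammar(shuffled) != reference:
                return False
        return True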

A major open question in cognitive science is the extent to which language learning is the result of hard-wired syntactic rules, meta-rules, or parameters. The results of this work provide support for the Semantic Bootstrapping Hypothesis (Pinker 1984), which holds that (at least some of) the acquisition of syntax (and, in this case, morphology) is semantics-driven.

Publications