Past research on grammar induction has reported promising results in predicting parts of speech from n-grams using a fixed vocabulary and a fixed context. In this study, we investigated grammar induction while varying both vocabulary size and context size. Results indicated that, as context increased for a fixed vocabulary, overall accuracy initially increased and then leveled off. Importantly, this increase in accuracy did not occur at the same rate across all syntactic categories. We also address the dynamic relation between context and vocabulary in unsupervised grammar induction, and we formulate a model of the relationship between vocabulary and context. Our results are consistent with what the child language acquisition literature calls the word spurt phenomenon.
|Title of host publication||Proceedings of the Twenty-Seventh International Florida Artificial Intelligence Research Society Conference|
|Editors||William Eberle, Chutima Boonthum-Denecke|
|Number of pages||5|
|Publication status||Published - 3 May 2014|
|Event||Twenty-Seventh International Florida Artificial Intelligence Research Society Conference - Florida, United States (Duration: 3 May 2014 → 3 May 2014)|
Datla, V. V., Louwerse, M. M., & Lin, K-I. (2014). Part of Speech Induction from Distributional Features: Balancing Vocabulary and Context. In W. Eberle & C. Boonthum-Denecke (Eds.), Proceedings of the Twenty-Seventh International Florida Artificial Intelligence Research Society Conference (pp. 28-32). AAAI Press.