Abstract
Word identification from continuous input is typically viewed as a segmentation task. Experiments with human adults suggest that familiarity with syntactic structures in their native language also influences word identification in artificial languages; however, the relation between syntactic processing and word identification remains unclear. This work takes one step forward by exploring a radically different approach to word identification, in which segmentation of a continuous input is viewed as a process isomorphic to unsupervised constituency parsing. Besides formalizing the approach, this study reports simulations of human experiments with DIORA (Drozdov et al., 2020), a neural unsupervised constituency parser. Results show that this model can reproduce human behavior in word identification experiments, suggesting that this is a viable approach to study word identification and its relation to syntactic processing.
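One simple way to picture the correspondence between segmentation and constituency parsing is to treat each proposed word as a constituent span over the syllable stream. The sketch below is illustrative only: it is not the paper's formalization and does not use DIORA. The function names, the half-open span convention, and the made-up syllable stream are all assumptions introduced here.

```python
from typing import List, Tuple

# Assumption: a constituent is a half-open span [start, end) over syllable positions.
Span = Tuple[int, int]


def segmentation_to_spans(words: List[List[str]]) -> List[Span]:
    """Map a segmentation (a list of words, each a list of syllables)
    to the constituent spans of a flat, one-level bracketing."""
    spans: List[Span] = []
    start = 0
    for word in words:
        end = start + len(word)
        spans.append((start, end))
        start = end
    return spans


def spans_to_segmentation(syllables: List[str], spans: List[Span]) -> List[List[str]]:
    """Recover the segmentation by reading the spans off the syllable stream."""
    return [syllables[start:end] for start, end in sorted(spans)]


# Hypothetical six-syllable stream and one candidate segmentation of it.
syllables = ["tu", "pi", "ro", "go", "la", "bu"]
words = [["tu", "pi", "ro"], ["go", "la", "bu"]]

spans = segmentation_to_spans(words)
recovered = spans_to_segmentation(syllables, spans)
```

Under these assumptions, the segmentation and the set of spans carry the same information, so converting one to the other and back returns the original. A constituency parser would typically induce a deeper tree than this flat bracketing, with word-level spans as some of its nodes.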
Original language | English |
---|---|
Title of host publication | Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) |
Place of Publication | Dublin, Ireland |
Pages | 4103–4112 |
Volume | 1 |
Publication status | Published - May 2022 |
Keywords
- Word Identification
- Segmentation Task
- Artificial Languages