Chinese lexical database (CLD): A large-scale lexical database for simplified Mandarin Chinese

Ching Chu Sun*, Peter Hendrix, Jianqiang Ma, Rolf Harald Baayen

*Corresponding author for this work

Research output: Contribution to journal, Article, Scientific, peer-review

57 Citations (Scopus)


We present the Chinese Lexical Database (CLD): a large-scale lexical database for simplified Chinese. The CLD provides a wealth of lexical information for 3913 one-character words, 34,233 two-character words, 7143 three-character words, and 3355 four-character words, and is publicly available. For each of the 48,644 words in the CLD, we provide a wide range of categorical predictors, as well as an extensive set of frequency measures, complexity measures, neighborhood density measures, orthography-phonology consistency measures, and information-theoretic measures. We evaluate the explanatory power of the lexical variables in the CLD in the context of experimental data through analyses of lexical decision latencies for one-character, two-character, three-character, and four-character words, as well as word naming latencies for one-character and two-character words. The results of these analyses are discussed.

Original language: English
Pages (from-to): 2606-2629
Number of pages: 24
Journal: Behavior Research Methods
Issue number: 6
Publication status: Published - 1 Dec 2018
Externally published: Yes


  • Chinese lexical database
  • CLD
  • Lexical database
  • Mandarin Chinese
  • Simplified Chinese

