A Turkish database for psycholinguistic studies based on frequency, age of acquisition, and imageability

Acar, Elif Ahsen
Zeyrek Bozşahin, Deniz
Kurfalı, Murathan
Bozşahin, Hüseyin Cem
This study primarily aims to build a Turkish psycholinguistic database covering three variables: word frequency, age of acquisition (AoA), and imageability. AoA and imageability information is limited to nouns. We used a corpus-based approach to obtain AoA information. We built two corpora: a child literature corpus (CLC) of 535 books written for children aged 3-12, and a corpus of transcribed children’s speech (CSC) from children aged 1;4-4;8. Word frequencies in the CLC and the CSC were positively correlated, suggesting that the CLC can be used to extract AoA information. We assumed that words frequent in the CLC would correspond to early acquired words, whereas words frequent in a corpus of adult language would correspond to late acquired words. To validate the AoA results from the corpus-based approach, we administered an AoA rating questionnaire to adults. Imageability values were collected through a separate questionnaire, also administered to adults. We conclude that AoA information can be deduced for high-frequency words with the corpus-based approach. The results for low-frequency words were inconclusive, which we attribute to the fact that corpus-based AoA information is affected by the strong negative correlation between corpus frequency and rated AoA.
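The comparison of word frequencies across two corpora described in the abstract can be illustrated with a minimal sketch. The abstract does not say which correlation statistic was used or how the corpora were preprocessed, so everything here is an assumption: the sketch uses a Pearson correlation over log relative frequencies of the shared vocabulary, and the token lists and function names are invented for illustration and are not data or code from the study.

```python
import math
from collections import Counter


def relative_frequencies(tokens):
    """Map each word to its relative frequency (count / corpus size)."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {word: n / total for word, n in counts.items()}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def frequency_correlation(tokens_a, tokens_b):
    """Correlate log relative frequencies over words found in both corpora."""
    freq_a = relative_frequencies(tokens_a)
    freq_b = relative_frequencies(tokens_b)
    shared = sorted(set(freq_a) & set(freq_b))
    log_a = [math.log(freq_a[w]) for w in shared]
    log_b = [math.log(freq_b[w]) for w in shared]
    return shared, pearson(log_a, log_b)


# Hypothetical toy token lists standing in for the CLC and the CSC.
clc_tokens = "anne top anne kedi top anne su kedi anne".split()
csc_tokens = "anne anne top su anne kedi anne top".split()

shared, r = frequency_correlation(clc_tokens, csc_tokens)
print(shared, round(r, 3))
```

A positive value of `r` on real corpora would correspond to the kind of agreement between the CLC and the CSC that the abstract reports. A rank-based statistic such as Spearman's would be an equally plausible choice for frequency data.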
Citation Formats
E. A. Acar, D. Zeyrek Bozşahin, M. Kurfalı, and H. C. Bozşahin, “A Turkish database for psycholinguistic studies based on frequency, age of acquisition, and imageability,” Portorož, Slovenia, 2016, p. 3600, Accessed: 00, 2021. [Online]. Available: http://lrec2016.lrec-conf.org/en/.