Complexity of individual Chinese characters through time

The y-axis of each plot shows perimetric complexity. Our dataset includes an average of around 50 variants of each character for each script, but only the variant with the median complexity is plotted here. Oracle, Bronze and Seal images are drawn from hanziyuan.net, Traditional images are drawn from the Traditional Chinese Handwriting Dataset (Chen, 2020), and Simplified images are drawn from the CASIA offline Chinese handwriting database (Liu et al., 2011).
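As a rough illustration of the measure on the y-axis, the sketch below computes perimetric complexity for a binary image and picks the median-complexity variant from a list of images. It assumes one common definition (squared perimeter divided by ink area) and a simple perimeter estimate that counts pixel edges between ink and background; the exact estimator and any normalisation used in the paper may differ. The function names perimetric_complexity and median_variant are hypothetical, and for an even number of variants the lower of the two middle values is taken.

```python
import numpy as np


def perimetric_complexity(image):
    """Squared perimeter divided by ink area for a binary image.

    Assumption: nonzero pixels are ink, and the perimeter is estimated by
    counting unit pixel edges that separate ink from background.
    """
    ink = np.asarray(image, dtype=bool)
    area = int(ink.sum())
    if area == 0:
        raise ValueError("image contains no ink")
    # Pad with background so ink touching the border still has an outer edge.
    padded = np.pad(ink, 1, constant_values=False)
    vertical_edges = np.sum(padded[1:, :] != padded[:-1, :])
    horizontal_edges = np.sum(padded[:, 1:] != padded[:, :-1])
    perimeter = int(vertical_edges + horizontal_edges)
    return perimeter ** 2 / area


def median_variant(images):
    """Return (index, complexity) of the median-complexity image.

    For an even number of images the lower of the two middle values is used,
    so the result is always one of the supplied variants.
    """
    scores = [perimetric_complexity(img) for img in images]
    order = np.argsort(scores, kind="stable")
    mid = int(order[(len(order) - 1) // 2])
    return mid, scores[mid]


if __name__ == "__main__":
    # Toy shapes standing in for character images (not real data).
    square = np.ones((2, 2), dtype=int)
    bar = np.ones((1, 4), dtype=int)
    plus = np.array([[0, 1, 0], [1, 1, 1], [0, 1, 0]])
    print(median_variant([square, bar, plus]))
```

Under this definition, shapes with long outlines relative to their ink area (thin strokes, many separate components) score higher than compact blobs.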

English glosses are drawn from Google Translate and are unreliable in some cases, especially for characters that normally appear as part of multi-character words.

The radio buttons on the left allow you to select from different sets of Chinese characters. Pictographic characters have a form that depicts their meaning. Pictologic characters are often created by adding a stroke to a pictographic character to capture a more specific or more abstract concept: for example, 刃 (knife edge) is created by adding a stroke to the pictographic character 刀 (knife). Pictosynthetic characters combine multiple pictographic characters into a new character. Pictophonetic characters combine a semantic part and a phonetic part. All character classifications are taken from the Chinese Lexical Database (Sun, C. C., Hendrix, P., Ma, J. Q. & Baayen, R. H., 2018).

The app on this page includes all 232 pictographic and all 48 pictologic characters in the Chinese Lexical Database (CLD), plus the 1333 most frequent of the remaining CLD characters. If you would like to see results for a character in the CLD that's not included in the app, please contact Charles Kemp.
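The selection rule described above can be sketched as follows. This is only an illustration: the record layout (the keys "char", "type" and "frequency") and the function name select_characters are hypothetical and do not reflect the actual CLD file format, and the frequencies in the usage example are made-up toy values.

```python
def select_characters(records, n_frequent=1333):
    """Keep every pictographic and pictologic character, plus the
    n_frequent most frequent characters among all the others.

    Assumption: each record is a dict with hypothetical keys
    "char", "type" and "frequency".
    """
    always_kept = {"pictographic", "pictologic"}
    kept = [r for r in records if r["type"] in always_kept]
    rest = [r for r in records if r["type"] not in always_kept]
    # Highest frequency first; sorted() leaves the input list untouched.
    rest_by_frequency = sorted(rest, key=lambda r: r["frequency"], reverse=True)
    return kept + rest_by_frequency[:n_frequent]


if __name__ == "__main__":
    # Toy records with placeholder names and invented frequencies.
    toy = [
        {"char": "刀", "type": "pictographic", "frequency": 5},
        {"char": "刃", "type": "pictologic", "frequency": 1},
        {"char": "char_a", "type": "pictophonetic", "frequency": 90},
        {"char": "char_b", "type": "pictosynthetic", "frequency": 40},
        {"char": "char_c", "type": "pictophonetic", "frequency": 70},
    ]
    print([r["char"] for r in select_characters(toy, n_frequent=2)])
```

Note that the pictographic and pictologic characters are kept regardless of frequency, so a rare character such as the toy 刃 entry survives while a more frequent character of another type can be dropped.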

For more details about our analyses, please read the full paper and supplementary material here.