Keyword Explorer

This app allows you to compare frequencies of corresponding words across languages. The tabs at the top left allow you to toggle between WorldLex data and association data from the Small World of Words. The radio buttons at the left allow you to focus on the revealing and unrevealing keywords used to evaluate the model, or to explore results for all meaning classes in the data set.

Typing in the "Choose a meaning:" box lets you quickly find a particular meaning. For the WorldLex data, some interesting cases to explore include "cabbage", "noodle", "pancake", "sausage", "heart", "mind" and "soul". For the association data, you might try "bike", "teacher", "love", "hate" and "freedom".

For each meaning plotted, the numbers along the vertical axis show the rank of each (language, meaning) pair. Pairs ranked in the top 3% are shown using red bars; otherwise, blue bars denote usage data and green bars denote association data. The association data are normalized so that the sizes of the corpora for all languages are identical, which means that the ranks are predictable from the relative frequencies plotted (e.g. the tallest bar will have the lowest-numbered rank). The WorldLex corpora are not normalized in this way, which means that the ranks do not always match the ordering of the relative frequencies (e.g. for "soul", Russian has rank 2 according to the Bayesian model but rank 4 based on the raw relative frequencies).
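
As a rough illustration of what "rank based on the raw relative frequencies" means, the sketch below ranks languages for a single meaning by count divided by corpus size. It does not reproduce the Bayesian model used in the app. The function name, the language labels and all numbers are made up for illustration, and ranking within one meaning across languages is an assumption.

```python
def rank_by_relative_frequency(counts, corpus_sizes):
    """Return {language: rank}; rank 1 is the highest relative frequency.

    Hypothetical helper: this is only the raw relative-frequency ranking,
    not the Bayesian ranking shown in the app.
    """
    # Relative frequency = occurrences of the word / size of that language's corpus.
    rel_freq = {lang: counts[lang] / corpus_sizes[lang] for lang in counts}
    # Sort languages from highest to lowest relative frequency.
    ordered = sorted(rel_freq, key=rel_freq.get, reverse=True)
    return {lang: position for position, lang in enumerate(ordered, start=1)}


# Made-up counts for one meaning in three hypothetical languages.
counts = {"lang_a": 120, "lang_b": 300, "lang_c": 90}
# Made-up corpus sizes; they differ, as in the unnormalized usage corpora.
corpus_sizes = {"lang_a": 1_000_000, "lang_b": 5_000_000, "lang_c": 500_000}

ranks = rank_by_relative_frequency(counts, corpus_sizes)
print(ranks)
```

In this toy example the language with the largest raw count (lang_b) ends up ranked last once corpus size is taken into account, which is why bar heights show relative rather than absolute frequencies.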

All mappings between words across languages were generated by the automated pipeline described in our paper, and all omissions, errors and inconsistencies can be attributed to this pipeline and the resources (e.g. the bilingual dictionaries) that it uses.