Do multi-sense word embeddings learn more senses?
Note that the English and the Hungarian variants of this page are not a strictly parallel corpus.
At last year's ACL, we (Borbély et al. 2016) proposed a method for measuring the sense resolution of multi-sense embeddings (MSEs) based on linear translation (Mikolov et al. 2013) from the MSE to a plain embedding. This talk develops that method further.
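Linear translation between embedding spaces amounts to fitting a matrix by least squares on a seed dictionary of translation pairs, then mapping new vectors through it. A minimal sketch with toy random data (the dimensions, the synthetic vectors, and the `translate` helper are illustrative assumptions, not the talk's actual setup):

```python
# Sketch of linear translation between embedding spaces (Mikolov et al. 2013).
# All data here is synthetic; the real method uses trained word embeddings
# and a bilingual seed dictionary.
import numpy as np

rng = np.random.default_rng(0)

d_src, d_tgt, n_pairs = 4, 3, 50
X = rng.normal(size=(n_pairs, d_src))      # source-side vectors of the seed pairs
W_true = rng.normal(size=(d_src, d_tgt))   # hidden "true" mapping for this toy demo
Z = X @ W_true                             # target-side vectors of the same pairs

# Learn W minimizing ||X W - Z||^2 (ordinary least squares).
W, *_ = np.linalg.lstsq(X, Z, rcond=None)

def translate(v, W, target_matrix):
    """Map a source vector and return the index of the nearest target
    vector by cosine similarity."""
    t = v @ W
    sims = target_matrix @ t / (
        np.linalg.norm(target_matrix, axis=1) * np.linalg.norm(t)
    )
    return int(np.argmax(sims))
```

In the multi-sense setting, each sense vector of a source word is mapped through `W` separately, and each mapped vector is judged by its nearest neighbors on the target side.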
The experiments reported in this talk use two measures. Meta-parameters (e.g. the choice of target embedding) were chosen based on the metric we call lax: a word counts as a success if at least one of its sense vectors has a good translation. This measure does not penalize different senses that share the same translation.
To measure whether the different sense vectors really correspond to different senses, we use a slightly stricter measure, disamb: the sets of good (gold) translations of the different sense vectors should differ. The proportion of such source word forms is computed among the words predicted to be ambiguous.
| | lax | disamb |
| --- | --- | --- |
| AdaGram | 73.3% | 18.53% |
| mutli “sense vectors” | 71.0% | 19.46% |
| mutli “context vectors” | 69.9% | 20.76% |
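The two measures can be sketched as follows. This is a hedged toy version, assuming that for each ambiguous source word we already know, per sense vector, which gold translations its translated vector hits (the nearest-neighbor retrieval step from the previous section is elided); the dictionary structure and the German example words are illustrative assumptions:

```python
# Toy sketch of the lax and disamb measures. Input per word:
# {sense_id: set of gold translations hit by that sense's translated vector}.

def lax(word_senses):
    # Success if at least one sense vector has a good translation.
    return any(hits for hits in word_senses.values())

def disamb(word_senses):
    # Success if the sets of good translations of different senses differ.
    hit_sets = [frozenset(h) for h in word_senses.values() if h]
    return len(set(hit_sets)) > 1

# Illustrative words predicted ambiguous: "bank" translates differently per
# sense, while both senses of "river" hit the same gold translation.
bank = {"bank_1": {"Bank"}, "bank_2": {"Ufer"}}
river = {"river_1": {"Fluss"}, "river_2": {"Fluss"}}

ambiguous_words = [bank, river]
lax_score = sum(lax(w) for w in ambiguous_words) / len(ambiguous_words)
disamb_score = sum(disamb(w) for w in ambiguous_words) / len(ambiguous_words)
```

Here `lax_score` is 1.0 (both words have at least one good translation) while `disamb_score` is 0.5 (only "bank" has distinguishable senses), mirroring how lax can stay high while disamb stays low.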
We found a trade-off between the two measures. One interpretation: the more specific a sense vector is, the easier it is to translate; but if the vectors are too specific, the translations of different senses may coincide.
Future work: analyzing the number of word senses plotted against word frequency, and modeling word ambiguity as a Dirichlet process.
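The Dirichlet-process view predicts a particular sense-count-vs-frequency curve: under its Chinese restaurant process representation with concentration `alpha`, the expected number of distinct senses of a word grows roughly like `alpha * log(n)` in the word's token frequency `n`. A purely illustrative simulation (the parameter values and the simulation itself are assumptions, not results from the talk):

```python
# Toy Chinese restaurant process: senses are "tables"; each new token of a
# word opens a new sense with probability alpha / (i + alpha) or joins an
# existing sense proportionally to how many tokens it already has.
import random

def crp_num_tables(n_tokens, alpha, rng):
    counts = []  # occupancy of each sense/table
    for i in range(n_tokens):
        if rng.random() < alpha / (i + alpha):
            counts.append(1)          # open a new sense
        else:
            r = rng.random() * i      # join an existing sense, size-biased
            acc = 0
            for t, c in enumerate(counts):
                acc += c
                if r < acc:
                    counts[t] += 1
                    break
    return len(counts)

rng = random.Random(0)
alpha = 1.0
# Average sense counts at two frequencies; expect roughly log(100) vs log(10000).
small = sum(crp_num_tables(100, alpha, rng) for _ in range(200)) / 200
large = sum(crp_num_tables(10000, alpha, rng) for _ in range(200)) / 200
```

With `alpha = 1.0`, `small` comes out near `log(100) ≈ 4.6` and `large` near `log(10000) ≈ 9.2`, i.e. sense counts grow slowly (logarithmically) with frequency, which is the kind of curve the planned frequency analysis could be checked against.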