Trying to Disentangle Epistemic from Aleatoric Uncertainty in LLMs
Mathias Vigouroux
SZTAKI
Having the ability to explain its own uncertainty is a key, yet still missing, capability of LLMs. One quite straightforward idea is to look at the entropy of the next-token prediction distribution: a flat distribution, spreading probability almost uniformly over the vocabulary, suggests the model is unsure, while a peaked one may signal a confident prediction. However, this measure can fail to disentangle two distinct forms of uncertainty: the epistemic and the aleatoric. Epistemic uncertainty arises from a lack of knowledge and can be reduced with more information, while aleatoric uncertainty stems from inherent randomness and cannot be reduced. For example, a model may put equal mass on "heads" and "tails" after "The coin landed on" either because the outcome is genuinely random (aleatoric) or because it has learned nothing about this context (epistemic). The aim of the presentation is to survey the existing techniques for uncertainty quantification in LLMs and to put forward a hypothesis on how epistemic and aleatoric uncertainty could be disentangled.
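To make the entropy idea and the decomposition concrete, here is a minimal sketch, assuming the Hugging Face transformers API and GPT-2 as a small stand-in model; the MC-dropout "ensemble" is only one illustrative way to obtain posterior samples (deep ensembles or weight perturbations would serve the same purpose), not a technique the presentation commits to.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Any causal LM works; GPT-2 is used here purely as a small stand-in.
model_name = "gpt2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

prompt = "The capital of France is"
inputs = tokenizer(prompt, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits[0, -1]   # logits over the vocabulary

def shannon_entropy(p):
    # H(p) = -sum_v p(v) log p(v), in nats
    return -(p * torch.log(p.clamp_min(1e-12))).sum(dim=-1)

probs = torch.softmax(logits, dim=-1)
print(f"next-token predictive entropy: {shannon_entropy(probs).item():.3f} nats")

# One (approximate) way to split this into epistemic and aleatoric parts is to
# sample several "models" from a posterior and use the identity
#   H[E_s p_s]  =  I(y; theta)  +  E_s H[p_s]
#   (total)        (epistemic)     (aleatoric)
# Here MC dropout provides the samples: GPT-2 keeps its dropout layers active
# in train mode, so each forward pass is a stochastic draw.
model.train()
S = 20
with torch.no_grad():
    samples = torch.stack([
        torch.softmax(model(**inputs).logits[0, -1], dim=-1) for _ in range(S)
    ])                                        # shape: (S, vocab_size)

total = shannon_entropy(samples.mean(dim=0))  # entropy of the mean prediction
aleatoric = shannon_entropy(samples).mean()   # mean entropy of each prediction
epistemic = total - aleatoric                 # mutual information I(y; theta)
print(f"total={total.item():.3f}  aleatoric={aleatoric.item():.3f}  "
      f"epistemic={epistemic.item():.3f}")
```

The key point of the identity is that the predictive entropy of the averaged distribution splits into the mutual information between prediction and model (epistemic) plus the expected entropy of the individual predictions (aleatoric); a single deterministic forward pass only exposes their sum, which is exactly the entanglement the presentation addresses.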