The distribution of languages in Llama 2's corpus, restricted to languages found in more than 0.005% of the documents.