Human Evaluation: GPT Win Rates (%) Based on Item Scores Per Language Pair

Figure 6 in the paper shows the human-evaluation win rates (%) of text-davinci-003 for each language pair.

Chart: Slator. Source: Microsoft.