x axis: the big model's cross-entropy (CE) loss on texts generated by the Base and Stateful models. y axis: ratio of uncompressed to compressed size (LZMA algorithm). Each point on the graph was obtained by varying the softmax temperature of the generative models.
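As a rough illustration of the two quantities behind this plot, the sketch below computes an LZMA compression ratio for a piece of text and applies a temperature to a softmax over logits. The function names `compression_ratio` and `softmax_with_temperature` are hypothetical, and the default `lzma.compress` settings are an assumption; the original experiment may have used different settings and a different sampling pipeline.

```python
import lzma
import math


def compression_ratio(text: str) -> float:
    """Uncompressed size divided by LZMA-compressed size (assumed default settings)."""
    raw = text.encode("utf-8")
    return len(raw) / len(lzma.compress(raw))


def softmax_with_temperature(logits, temperature):
    """Softmax over logits divided by a temperature (hypothetical helper).

    Temperatures below 1 sharpen the distribution; temperatures above 1 flatten it.
    """
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    # Highly repetitive text compresses well, so its ratio is large.
    print(compression_ratio("the cat sat on the mat. " * 200))
    # The same logits under a low and a high temperature.
    print(softmax_with_temperature([2.0, 1.0, 0.1], 0.5))
    print(softmax_with_temperature([2.0, 1.0, 0.1], 2.0))
```

Sweeping the temperature and recording one (CE loss, compression ratio) pair per setting would trace out a curve of the kind shown, with repetitive generations giving a high ratio.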