The estimated costs of training a model once

In practice, models are usually trained many times during research and development.

Note: Because of a lack of power draw data on GPT-2's training hardware, the researchers weren’t able to calculate its carbon footprint.
Table: MIT Technology Review Source: Strubell et al.