The estimated costs of training a model once
In practice, models are usually trained many times during research and development.
Note: Because of a lack of power draw data on GPT-2's training hardware, the researchers weren’t able to calculate its carbon footprint.
MIT Technology Review
Strubell et al.