The Models You Can Run on a 12 GB GPU

For a batch size of 1 and a sequence length of 1024 tokens, the VRAM needed for inference is dominated by two terms: the model weights and the KV cache, plus a small overhead for activations and framework buffers.
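A rough sketch of that estimate is below. The helper names and the Llama-2-7B-like configuration (32 layers, hidden size 4096) are illustrative assumptions, not a specific library's API; the KV cache stores one key vector and one value vector of hidden size per token per layer.

```python
def kv_cache_bytes(n_layers, hidden_size, seq_len, batch_size=1, bytes_per_value=2):
    # K and V each hold hidden_size values per token per layer (hence the factor 2)
    return 2 * n_layers * hidden_size * seq_len * batch_size * bytes_per_value

def total_vram_bytes(n_params, bytes_per_weight, n_layers, hidden_size,
                     seq_len, batch_size=1):
    # Weights plus KV cache; ignores the smaller activation/buffer overhead
    weights = n_params * bytes_per_weight
    kv = kv_cache_bytes(n_layers, hidden_size, seq_len, batch_size)
    return weights + kv

GIB = 1024 ** 3
# Assumed 7B-parameter config: 32 layers, hidden size 4096, seq len 1024, batch 1
fp16_gib = total_vram_bytes(7e9, 2.0, 32, 4096, 1024) / GIB   # ~13.5 GiB
int4_gib = total_vram_bytes(7e9, 0.5, 32, 4096, 1024) / GIB   # ~3.8 GiB
```

Under these assumptions, a 7B model in fp16 already exceeds 12 GB on weights alone, while the same model quantized to 4 bits fits with room to spare for longer contexts.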