This is great, and I feel weirdly relieved (even though I don't actually gain anything from it).
That, on the other hand, makes me feel sad and almost depressed every time:
> It uses approximately 16 TPUv3s (which is 128 TPUv3 cores or roughly equivalent to ~100-200 GPUs) run over a few weeks, a relatively modest amount of compute in the context of most large state-of-the-art models used in machine learning today