
TensorFlow datasets are pretty great once you get them rolling. They do a great job of scaling out worker threads for the various parts of the featurization process, keeping an arbitrarily big cache of batches ready to go in RAM, etc.

https://www.tensorflow.org/api_docs/python/tf/data/Dataset
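For anyone who hasn't used them, a minimal sketch of that kind of pipeline (the file glob, parse function, and image size here are all made-up placeholders): the map fans featurization out across threads, and the prefetch keeps a buffer of batches ready for the accelerator.

    import tensorflow as tf

    def parse_example(path):
        # Featurization done entirely with native TF ops.
        image = tf.io.decode_jpeg(tf.io.read_file(path), channels=3)
        return tf.image.resize(image, [224, 224])

    dataset = (
        tf.data.Dataset.list_files("data/*.jpg")                  # hypothetical glob
        .map(parse_example, num_parallel_calls=tf.data.AUTOTUNE)  # scale out workers
        .batch(32)
        .prefetch(tf.data.AUTOTUNE)                               # keep batches ready in RAM
    )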




Aye - the above job was done with TF datasets. They're limited in that the Python GIL forces you into multiprocessing, multiprocessing involves serialization in Python, and serialization means dealing with Python's rather extreme object overhead (24 bytes for an int!).

Which all means there's a bunch of CPU-bound work sitting between your job and the GPU/CUDA kernels. How fast your app can chew through it influences overall GPU utilization.
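To make the object-overhead point concrete, a quick illustration on 64-bit CPython:

    import sys

    print(sys.getsizeof(0))   # 24 bytes: the int object header alone
    print(sys.getsizeof(1))   # 28 bytes: header plus one digit of payload

    # A million ints is ~36 MB of Python objects, vs 8 MB of raw int64s,
    # and all of it has to be pickled to cross a multiprocessing boundary.
    xs = list(range(1_000_000))
    print(sys.getsizeof(xs) + sum(sys.getsizeof(x) for x in xs))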


While I appreciate that Python is easy and flexible to write, it bugs me immensely every time I run into situations like this.

We go to all this effort to build and write stuff in a fancy optimising framework, only for the whole process to be bottlenecked by some silly performance limitation in Python.


It's usually possible to sidestep the Python limitations with a bit of elbow grease. The usual performance killer is tf.py_function, which does indeed have to respect the GIL. If you can work out a nice way to handle your data without it, the pipeline can stick to C++ in the backend and avoid the GIL entirely. (So: data in a format that TF has a parser for, and transformations written with TF ops where you can. A sketch of the contrast is below.)
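For example, the same parse expressed both ways (a hypothetical CSV of "label,value" lines - the file name and schema are made up):

    import tensorflow as tf

    lines = tf.data.TextLineDataset("data.csv")

    # Slow path: a Python callable, which must grab the GIL per element.
    def parse_py(line):
        label, value = line.numpy().decode().split(",")
        return int(label), float(value)

    slow = lines.map(lambda l: tf.py_function(parse_py, [l], [tf.int32, tf.float32]))

    # Fast path: the same parse in native TF ops, which stays in the
    # C++ backend and parallelizes cleanly.
    fast = lines.map(
        lambda l: tf.io.decode_csv(l, record_defaults=[0, 0.0]),
        num_parallel_calls=tf.data.AUTOTUNE,
    )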



