It really depends on what you're doing. If your network is deep and does a lot of matrix multiplication, GPUs definitely speed things up. Libraries like cuDNN also ship optimized convolution ops, which make convolutions a lot faster.
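To give a rough feel for why depth matters here, this is a back-of-the-envelope sketch that counts the multiply-add work in a stack of dense layers. The function name `dense_forward_flops`, the layer sizes, and the batch size are all made up for illustration, and it only counts arithmetic; it says nothing about the actual speedup you'd measure on any given GPU.

```python
def dense_forward_flops(layer_sizes, batch_size):
    """Approximate FLOPs for one forward pass through fully connected layers.

    layer_sizes is a hypothetical list like [inputs, hidden..., outputs].
    Biases and activation functions are ignored; only the matmuls are counted.
    """
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        # (batch_size x n_in) @ (n_in x n_out): one multiply and one add per term
        total += 2 * batch_size * n_in * n_out
    return total


# Illustrative sizes only (assumptions, not taken from any real model).
shallow = dense_forward_flops([784, 10], batch_size=64)
deep = dense_forward_flops([784, 512, 512, 512, 10], batch_size=64)

print(f"shallow: {shallow:,} FLOPs per forward pass")
print(f"deep:    {deep:,} FLOPs per forward pass")
```

The deeper stack does roughly a hundred times more matmul work per batch, and that is the kind of work a GPU parallelizes well. For a tiny network, the cost of copying data to and from the device can easily eat the gain.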
In my experience (not TF-related; I mainly work on my own library now: https://github.com/chewxy/gorgonia), deep networks do train faster on a GPU, even with the cgo call penalty. I've never dabbled much in CNNs (convolutions tend to do my head in), so I can't say much about those.