This reminds me of a similar problem observed in a distributed file system. Eventually it was tracked down to a single bad network card. Why do you think it was a kernel issue? Hardware has "bugs" too.
I don't know if it was a kernel issue, but I was able to reproduce the problem in another DC by setting somaxconn artificially low (to 2, IIRC) to force Linux into SYN cookie mode.
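For anyone wanting to check the same knob on their own box, here is a rough Python sketch. It is not the exact procedure used for the reproduction; the helper names `read_somaxconn` and `effective_backlog` are made up for illustration. It reads the current somaxconn from `/proc` (Linux only) and shows how the kernel caps whatever backlog an application passes to `listen()`.

```python
from pathlib import Path

# Linux-only sysctl file; on other systems the read simply returns None.
SOMAXCONN_PATH = Path("/proc/sys/net/core/somaxconn")


def read_somaxconn(path=SOMAXCONN_PATH):
    """Return the current somaxconn value, or None if it cannot be read."""
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return None


def effective_backlog(requested, somaxconn):
    """Linux silently caps the backlog passed to listen() at somaxconn."""
    return min(requested, somaxconn)


if __name__ == "__main__":
    current = read_somaxconn()
    print("somaxconn:", current)
    if current is not None:
        # A server asking for listen(128) gets at most `current` slots.
        print("listen(128) behaves like listen(%d)" % effective_backlog(128, current))
```

With somaxconn at 2, `effective_backlog(128, 2)` is 2, so even a server that asks for a generous backlog ends up with a tiny one. Changing the value itself needs root, e.g. `sysctl -w net.core.somaxconn=2`, and should only be tried on a test machine.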
Given my knowledge of TCP and SYN cookies, I could not come up with another theory for why individual TCP connections would stay stuck with a tiny window, since the whole idea behind congestion control is to increase the window once the congestion is gone. That's why I think it might be a kernel issue. Given the complexity of TCP, I'd say there is an at least equally good chance that I just don't know enough ;)
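To make the "window should grow back" expectation concrete, here is a toy additive-increase/multiplicative-decrease loop in Python. It is only a sketch of the textbook idea: the function name `aimd`, the starting window of 10 segments, and the loss pattern are all assumptions, and it ignores slow start, timeouts, and everything else a real stack does.

```python
def aimd(events, cwnd=10.0, min_cwnd=1.0):
    """Toy AIMD: halve the window on loss, add one segment per clean round trip."""
    history = []
    for lost in events:
        if lost:
            cwnd = max(min_cwnd, cwnd / 2)  # multiplicative decrease
        else:
            cwnd += 1                       # additive increase
        history.append(cwnd)
    return history


# Three lossy round trips followed by five clean ones.
trace = aimd([True, True, True] + [False] * 5)
print(trace)  # [5.0, 2.5, 1.25, 2.25, 3.25, 4.25, 5.25, 6.25]
```

In this model the window starts climbing again as soon as the losses stop, which is why a connection that stays pinned at a tiny window looks suspicious.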