I don't disagree that the tools are rough, but to the parent commenter's point, would perfect tools and languages actually solve the underlying problem?
As much as I love FPGAs, GPUs really ate their lunch in the acceleration sphere (trying to leverage the FPGA's parallelism to overcome a >20x clock speed disadvantage is REALLY hard, especially if power is a concern) and so it seems the only niche left for them is circuit emulation. Of course, circuit emulation is a sizable market (low volume designs which don't make sense as ASICs, verification, research, etc.) and so it's not exactly a death sentence.
The FPGA market has been growing despite GPGPU taking off. And the clock speed gap is closer to 4-5x, not 20x. Even with that gap and the lower area efficiency of FPGAs, price- and power-competitive FPGA accelerator cards have been released over the last 5 years. Sure, you're not going to get an A100's throughput, but you can get deterministic latency below 5us for something the A100 would take a minimum of 50us to process. GPGPU isn't ideal for its current use cases either, so FPGA-based designs have a lot of room to grow into better, application-specific accelerators.