Exactly, and this is pretty much the answer to the OP's question about why software engineers haven't embraced FPGAs.
Answer: there are other solutions that are simpler and much closer to their existing knowledge base. You want to build a soda machine? I can implement that logic in an 8-pin, fifty-cent Atmel tinyAVR device that can be programmed in C or C++.
Skills of software engineers are probably better applied to building better FPGA toolsets.
You can probably also implement it in a PLC, which costs more than fifty cents but already includes the relays that you need to drive the solenoids that drop the soda cans and the darlingtons you need to drive the relays. Overall it might be cheaper and easier to program. The US$100 Phoenix 2701043 is the low end here: http://www.digikey.com/product-detail/en/2701043/277-2648-ND...
You might reasonably ask why there aren't 8-pin fifty-cent FPGAs. I don't really know, but some PLDs (not PLCs) come pretty close: http://www.digikey.com/product-detail/en/ATF16V8B-15JU/ATF16... was what I came up with on a Digi-Key product index browse: the Atmel ATF16V8B-15JU-ND, 84 cents in volume, a 20-pin Flash PLD with 8 macrocells, each with a flip-flop, 8 I/O pins, 10 input pins, and 10ns pin-to-pin latency. Eight flip-flops give you at most 2^8 = 256 states, and you can probably do a soda machine with 256 states. But an ATtiny has at least 1024 bits of RAM, which is a lot more than 8 bits of state, and can do much more complex logic; it's just hundreds of times slower than the PLD.
A little further upmarket, there are CPLDs like the 24-pin ATF750LVC-15SU-ND, which goes for US$3.72 in volume, with 20 flip-flops (bits of memory), 10ns pin-to-pin delay, EEPROM, 171 product terms feeding into 20 sum terms, and 12 input pins and 10 I/O pins.
For a while there was a Spartan FPGA that went for US$5, but perhaps due to a lack of articles like Yossi's here, it didn't do well and seems to be out of production.
Basically I think that when we're talking about state machines that are constrained to operate at mechanical speeds, it makes more sense to use the microcontroller approach. When your response times are measured in microseconds or milliseconds instead of nanoseconds, and you really only need one addition or multiplication per clock cycle instead of 8 or 100, you might as well do them one at a time and spend the extra real estate on more programmability rather than more computational power.
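To make the microcontroller approach concrete, here's a rough sketch in plain C of the kind of soda-machine state machine I have in mind. Everything specific in it is made up for illustration: the price, the coin denominations, the credit cap, and the names `soda_t`, `event_t`, `output_t` and `soda_step` are all hypothetical, and there's no real pin or register access. The whole state is one byte of credit, so it would also fit in the 256 states of the small PLD, just with a lot more head-scratching.

```c
#include <stdint.h>

/* Hypothetical price and credit cap in cents; not from any real machine. */
enum { PRICE_CENTS = 50, MAX_CREDIT_CENTS = 250 };

/* Hypothetical input events, e.g. decoded from coin sensors and buttons. */
typedef enum { EV_COIN_5, EV_COIN_10, EV_COIN_25, EV_SELECT, EV_REFUND } event_t;

typedef struct {
    uint8_t credit;   /* cents inserted so far; this is the entire state */
} soda_t;

typedef struct {
    uint8_t vend;     /* 1 = pulse the can-drop solenoid once */
    uint8_t refund;   /* cents to hand back to the customer */
} output_t;

/* Advance the machine by one event and report what the hardware should do. */
output_t soda_step(soda_t *m, event_t ev)
{
    output_t out = {0, 0};
    uint8_t coin = 0;

    switch (ev) {
    case EV_COIN_5:  coin = 5;  break;
    case EV_COIN_10: coin = 10; break;
    case EV_COIN_25: coin = 25; break;
    case EV_SELECT:
        if (m->credit >= PRICE_CENTS) {
            out.vend = 1;
            out.refund = (uint8_t)(m->credit - PRICE_CENTS);  /* change */
            m->credit = 0;
        }
        break;
    case EV_REFUND:
        out.refund = m->credit;
        m->credit = 0;
        break;
    }

    if (coin != 0) {
        if (m->credit + coin > MAX_CREDIT_CENTS)
            out.refund = coin;                  /* over the cap: reject coin */
        else
            m->credit = (uint8_t)(m->credit + coin);
    }
    return out;
}
```

On a real part you'd call something like `soda_step` from a polling loop or a pin-change interrupt and turn `vend` and `refund` into output pulses. At mechanical speeds a loop that runs every few milliseconds is plenty.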
On the other hand, there are lots of computational tasks where it really would be nice to be able to do more ops per cycle.