Yes, I am aware of those. However, the Kintex PCIe interface is a bit of a pain as it has a TLP straddling mode that can't be disabled, so it will be some time before it's supported as it will require some significant reworking in the PCIe interface modules. I am planning on supporting straddling eventually as this will improve PCIe link utilization on the UltraScale and UltraScale+ parts. If someone wants to donate a board, I can look into supporting it.
Straddling is an artifact of very wide interfaces. On the UltraScale+ parts, the PCIe Gen 3 x16 interface comes out as a 512 bit wide interface. Every cycle of the 250 MHz PCIe user clock transfers 64 bytes of data. The issue has to do with how packets are moved over this type of interface. If your packets are all a multiple of 64 bytes, no problem, you get 100% throughput. However, if your packets are NOT a multiple of 64 bytes in length, you have a problem. What byte lane do packets start and end in? The simplest implementation is to always start packets in byte lane 0. The interface logic for this is the simplest - the packets always start in the same place, so the fields always end up in the same place. However, if your packet is 65 bytes long, the utilization is horrible - it doesn't fit in one cycle, so you have to add an extra cycle for every packet, and bus utilization falls to about 50% as you have 63 empty byte lanes after every packet.
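To put numbers on that, here is a rough sketch of the arithmetic for the "always start in byte lane 0" case. The function name and the default of 64 bytes per cycle are just illustrative assumptions matching the 512 bit example above, not anything taken from the actual interface modules.

```python
BUS_BYTES = 64  # 512 bit interface: 64 byte lanes per clock cycle


def utilization_lane0(packet_bytes, bus_bytes=BUS_BYTES):
    """Bus utilization when every packet must start in byte lane 0.

    Hypothetical helper for illustration only.
    """
    # Each packet occupies a whole number of cycles (ceiling division).
    cycles = -(-packet_bytes // bus_bytes)
    return packet_bytes / (cycles * bus_bytes)


if __name__ == "__main__":
    for size in (64, 65, 128, 272):
        print(f"{size:4d} bytes: {utilization_lane0(size):.1%}")
```

A 64 byte packet fills its cycle completely, while a 65 byte packet needs two cycles and uses only 65 of the 128 available byte lanes.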
Straddling is an attempt to mitigate this issue. Instead of only starting packets in lane 0, the interface is adjusted to support starting packets in several places. Say, byte lanes 0 and 32. Or 0, 16, 32, and 48. Now, when you have a packet end in byte lane 0, you can start the next packet in the same clock cycle, but in byte lane 16 or 32. This increases the interface utilization. The trade-off is now the logic has to deal with parts of two packets in the same clock cycle, and it has to deal with multiple possible packet offsets.
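The effect of allowing more start positions can be sketched with a small idealized model: packets are sent back to back, and each one begins at the next permitted start lane after the previous packet ends. The function name and parameters are hypothetical, and the model ignores real-world limits such as a cap on how many packets may start in a single cycle, so treat it as an illustration of the idea rather than a description of any particular PCIe core.

```python
def cycles_for_stream(packet_bytes, count, start_align, bus_bytes=64):
    """Cycles needed to send `count` equal-size packets back to back.

    start_align is the spacing of permitted start lanes in bytes:
    64 means lane 0 only (no straddling), 32 means lanes 0 and 32,
    16 means lanes 0, 16, 32 and 48. Illustrative model only.
    """
    pos = 0  # running byte position on the bus
    for _ in range(count):
        # Round up to the next permitted start lane.
        pos = -(-pos // start_align) * start_align
        pos += packet_bytes
    # Total cycles is the byte position rounded up to whole bus words.
    return -(-pos // bus_bytes)


if __name__ == "__main__":
    size, count = 65, 1000
    for align in (64, 32, 16):
        cycles = cycles_for_stream(size, count, align)
        util = size * count / (cycles * 64)
        print(f"start every {align:2d} bytes: {cycles} cycles, {util:.1%}")
```

For 1000 packets of 65 bytes, this model gives 2000 cycles with lane 0 starts only, 1500 cycles with starts at lanes 0 and 32, and 1250 cycles with starts every 16 bytes, so utilization goes from roughly 51% to 68% to 81%.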
The specific annoyance with PCIe packets is that the max payload size is usually 256 bytes, but every packet has a 12 or 16 byte TLP header attached, so a full-size packet comes out at 268 or 272 bytes, which is not a multiple of 64. That really screws things up when combined with the small max payload size.
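As a back-of-the-envelope check of how much the header costs, the sketch below estimates what fraction of bus bandwidth carries payload for full-size TLPs sent back to back. It assumes the header travels in the same data stream as the payload and that each TLP starts at the next permitted start lane; the function name and the start spacing parameter are hypothetical.

```python
def payload_efficiency(payload_bytes, header_bytes, start_align):
    """Fraction of bus bytes carrying payload, for back-to-back TLPs.

    start_align is the spacing of permitted start lanes in bytes
    (64 = always lane 0 on a 64 byte bus). Illustrative model only.
    """
    tlp_bytes = header_bytes + payload_bytes
    # Bus bytes consumed per TLP in steady state: the TLP length
    # rounded up to the next permitted start position.
    slot_bytes = -(-tlp_bytes // start_align) * start_align
    return payload_bytes / slot_bytes


if __name__ == "__main__":
    for header in (12, 16):
        for align in (64, 32, 16):
            eff = payload_efficiency(256, header, align)
            print(f"header {header}, start every {align:2d}: {eff:.1%}")
```

With a 256 byte payload and lane 0 starts only, both header sizes need five 64 byte cycles per TLP, so only 256 of 320 bytes (80%) are payload. Under this model, a 16 byte header with starts every 16 bytes packs the 272 byte TLP with no padding, which is the best case.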