In the CV department, I recently ordered a cheap board that combines an FPGA, an ARM Cortex-M3, 64 Mbit of SRAM, and 32 Mbit of flash, with camera input and HDMI output. It's like a budget Zynq for CV.
There's absolutely no reason for ROMs to waste the scarce resources of a hybrid FPGA. Micro SD cards (called TF cards in China) and eMMC are the usual solutions.
https://wiki.sipeed.com/hardware/en/tang/Tang-Nano-4K/Nano-4...
https://www.aliexpress.us/item/3256806880637138.html