The delay is just 10 ms, and this is bootloader code that runs only once at boot. For an MCU bootloader, a less error-prone approach with a smaller code size is preferable.
I can agree that a justification for using nop loops along the lines of "The core starts executing code at a point when, judging by real-world testing, the hardware peripherals are not yet guaranteed to be in a stable state. Thorough testing suggests that a delay of 8 ms plus a safety margin mitigates issues related to hardware readiness. A 2500-cycle nop loop guarantees the required 10 ms delay at the fastest clock speed and a 50 ms delay at the slowest core speed without causing observable issues. 50 ms is hereby deemed acceptable." would be acceptable in most commercial design documents.
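To make that kind of reasoning checkable, here is a rough sketch in C of the arithmetic behind it: pick the loop count from the fastest clock, then see how long the same count takes at the slowest clock. The clock frequencies, the cycles-per-iteration figure, and the function names are all my assumptions for illustration, not values from any particular MCU. The real cycle count has to come from the disassembly of the compiled loop, and the inline `nop` uses GCC/Clang syntax.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values for illustration only; take real ones from the datasheet
 * and from the disassembly of the compiled loop. */
#define F_CPU_MAX_HZ    40000000u  /* fastest core clock (assumption) */
#define F_CPU_MIN_HZ     8000000u  /* slowest core clock (assumption) */
#define CYCLES_PER_ITER        4u  /* nop plus loop overhead (assumption) */

static_assert(F_CPU_MIN_HZ <= F_CPU_MAX_HZ, "clock range is inverted");

/* Iterations needed for a delay of at least `ms` at clock `hz` (rounds up). */
static uint32_t nop_iterations_for(uint32_t hz, uint32_t ms,
                                   uint32_t cycles_per_iter)
{
    uint64_t cycles = ((uint64_t)hz * ms + 999u) / 1000u;
    return (uint32_t)((cycles + cycles_per_iter - 1u) / cycles_per_iter);
}

/* Whole milliseconds (rounded down) that `iters` iterations take at `hz`. */
static uint32_t delay_ms_at(uint32_t hz, uint32_t iters,
                            uint32_t cycles_per_iter)
{
    uint64_t cycles = (uint64_t)iters * cycles_per_iter;
    return (uint32_t)(cycles * 1000u / hz);
}

/* The delay itself: the volatile counter keeps the compiler from
 * optimizing the loop away. */
static void startup_delay(uint32_t iters)
{
    for (volatile uint32_t i = 0; i < iters; ++i) {
        __asm__ volatile ("nop");
    }
}
```

With these assumed numbers, `nop_iterations_for(F_CPU_MAX_HZ, 10u, CYCLES_PER_ITER)` gives the loop count for 10 ms at the fastest clock, and `delay_ms_at(F_CPU_MIN_HZ, ...)` on that count gives the worst-case delay to write into the design document. Sizing the loop for the fastest clock is what makes the minimum delay a guarantee; the slowest clock then only determines how long the wait can get.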