I see how it can sound appealing to a bureaucrat, but as a programmer, debugging the concurrent evaluation of thousands of "natural" language IF...THEN... rules until I find the questionable one where a threshold was defined too low or too high sounds like a nightmare.
I imagine they would log all the inputs, as well as the branches taken, so they can later replay everything in a debugger. That would make the process much simpler.
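To make that concrete, here is a rough sketch of what "log the inputs and branches, then replay" could look like. Everything in it is made up for illustration: the `Rule` shape, the `missile_warning` field, the `deploy_flares` action and the thresholds are assumptions, not anything from a real system.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Rule:
    """Hypothetical rule: IF inputs[field] > threshold THEN action."""
    name: str
    field: str
    threshold: float
    action: str


def evaluate(rules, inputs, frame_index=0, trace=None):
    """Return the actions of every rule that fires; optionally record each branch."""
    actions = []
    for rule in rules:
        fired = inputs[rule.field] > rule.threshold
        if trace is not None:
            trace.append((frame_index, rule.name, fired))
        if fired:
            actions.append(rule.action)
    return actions


def replay(rules, frames):
    """Re-run recorded input frames against a (possibly modified) rule set."""
    return [evaluate(rules, frame, i) for i, frame in enumerate(frames)]


def first_divergence(rules_a, rules_b, frames):
    """Index of the first frame where the two rule sets decide differently, or None."""
    for i, (a, b) in enumerate(zip(replay(rules_a, frames), replay(rules_b, frames))):
        if a != b:
            return i
    return None


# Recorded inputs (the "black box"), plus the branch trace from the original run.
frames = [{"missile_warning": 0.5}, {"missile_warning": 0.7}, {"missile_warning": 0.9}]
original = [Rule("flares", "missile_warning", 0.8, "deploy_flares")]
trace = []
for i, frame in enumerate(frames):
    evaluate(original, frame, i, trace)

# Suspect the threshold was set too high: lower it and replay the same inputs.
modified = [replace(original[0], threshold=0.6)]
print(first_divergence(original, modified, frames))
```

Because the inputs are recorded, the replay is deterministic, so comparing the original and modified rule sets frame by frame points at the exact moment a changed threshold would have altered the decision. With thousands of concurrently evaluated rules the hard part is still deciding which rule to suspect, but the trace at least narrows down which branches were taken.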
Being able to take the black-box recordings of a combat drone that got shot down, reconstruct the scenario, and permute the rules until you get a win seems like... well, a big win.
Air combat is also one of those areas that notionally has narrowly computable victory parameters: given hardware of capabilities X, there is a model, unknown to us, that should generally predict the outcome.