A point about gradient-free methods such as simulated annealing and genetic algorithms: the transition (sometimes called "neighbor") function is the most important part by far. The key insight is, in a way, the most obvious one: if your task is to search a problem space efficiently for an optimal solution, it pays to know exactly how to move from where you are toward where you want to be in that space. To that point, the structure of the transitions between successive state samples should be tailored to your specific problem and your encoding of the domain, or the search won't be useful in any reasonable amount of time.
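To make this concrete, here is a minimal simulated-annealing sketch for a TSP-style tour, where the problem-specific transition is a 2-opt segment reversal rather than a naive random shuffle. The instance, function names, and schedule parameters are all illustrative assumptions, not a definitive implementation:

```python
import math
import random

def tour_length(tour, pts):
    # Total cycle length of a tour over 2-D points.
    return sum(math.dist(pts[tour[i]], pts[tour[(i + 1) % len(tour)]])
               for i in range(len(tour)))

def two_opt_neighbor(tour, rng):
    # Problem-specific transition: reverse one segment of the tour (2-opt).
    # Unlike swapping two random cities, this changes only two edges of the
    # cycle, so candidate moves stay "close" in cost to the current state.
    i, j = sorted(rng.sample(range(len(tour)), 2))
    return tour[:i] + tour[i:j + 1][::-1] + tour[j + 1:]

def anneal(pts, neighbor, steps=20000, t0=1.0, cooling=0.9995, seed=0):
    # Plain simulated annealing with a geometric cooling schedule
    # (all parameter values here are arbitrary choices for illustration).
    rng = random.Random(seed)
    tour = list(range(len(pts)))
    rng.shuffle(tour)
    cur = tour_length(tour, pts)
    best, best_tour = cur, tour[:]
    t = t0
    for _ in range(steps):
        cand = neighbor(tour, rng)
        d = tour_length(cand, pts) - cur
        # Accept improving moves always, worsening moves with
        # probability exp(-d / t) (the Metropolis criterion).
        if d < 0 or rng.random() < math.exp(-d / t):
            tour, cur = cand, cur + d
            if cur < best:
                best, best_tour = cur, tour[:]
        t *= cooling
    return best_tour, best
```

Swapping `two_opt_neighbor` for a transition that ignores the tour structure (say, replacing the tour with a fresh random permutation) leaves the acceptance rule untouched but typically cripples convergence, which is the point: the neighbor function is where the domain knowledge lives.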
> the transition (sometimes called "neighbor") function is the most important part by far.
And, indeed, in the 0-1 integer linear programming with Lagrangian relaxation I used, there is nothing differentiable, so it should count as "gradient-free". And the linear programming part and the Lagrangian part do "move" from where you are closer to "where you want to be".
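As a sketch of that kind of "move": below is a toy 0-1 knapsack treated by Lagrangian relaxation, with the capacity constraint dualized and the multiplier updated by a subgradient step. The instance data, step sizes, and function names are invented for illustration; weak duality just guarantees each `bound` is an upper bound on the true optimum:

```python
# Hypothetical toy instance: 0-1 knapsack with values v, weights w, capacity W.
v = [10, 7, 25, 24, 13, 20]
w = [2, 1, 6, 5, 4, 3]
W = 10

def solve_relaxed(lam):
    # With the capacity constraint moved into the objective with
    # multiplier lam >= 0, the relaxation separates per item:
    # take item i iff its adjusted profit v[i] - lam*w[i] is positive.
    x = [1 if v[i] - lam * w[i] > 0 else 0 for i in range(len(v))]
    bound = sum((v[i] - lam * w[i]) * x[i] for i in range(len(v))) + lam * W
    return x, bound

def subgradient(iters=100, lam=0.0, step=1.0):
    # Nothing differentiable anywhere: the multiplier moves along a
    # subgradient of the dual function (the constraint violation).
    best_bound = float("inf")
    for k in range(iters):
        x, bound = solve_relaxed(lam)
        best_bound = min(best_bound, bound)
        g = sum(w[i] * x[i] for i in range(len(v))) - W  # violation of capacity
        lam = max(0.0, lam + (step / (k + 1)) * g)       # move toward feasibility
    return best_bound, lam
```

Each iteration "moves" in exactly the sense above: if the relaxed solution overuses capacity, `lam` rises and discourages heavy items; if it underuses capacity, `lam` falls.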
The thing is, the bag of tricks and techniques that work here is large. So, right, one should use knowledge of the real problem to pick which tricks to use.