I'm going to quibble with your last point. A decision tree absolutely can classify images. Depending on exactly how you specify the problem, it can do so exactly as well as a neural network; it would just be a lot less space efficient.

No Free Lunch isn't about choosing the right biases; it says that no single model can solve every problem. In fact, any solver will have accuracy equivalent to chance when averaged across all problems. Here's an example:

There are four possible one-bit-to-one-bit functions: I and ~ (the identity and inverse functions), as well as T and F (the constant True and False functions). If you create a truth table for all four of those functions:

    func  input | output
    I     0     | 0
    I     1     | 1
    ~     0     | 1
    ~     1     | 0
    T     0     | 1
    T     1     | 1
    F     0     | 0
    F     1     | 0
You'll notice that the output column contains an equal number of one bits and zero bits. This is true for binary functions of any input size, and in general, if you extend "an equal number of one bits and zero bits" to "an equal number of each possible output class", it is true for functions of any output size.
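
To make that counting argument concrete, here's a minimal sketch in plain Python (the always-guess-zero predictor is just an arbitrary stand-in; any fixed predictor gives the same result). It enumerates every function from n input bits to one output bit and averages a predictor's accuracy across all of them:

    from itertools import product

    n = 2  # number of input bits; any small n shows the same thing
    inputs = list(product([0, 1], repeat=n))

    # A function is fully described by its output column: one bit per
    # input, so there are 2**(2**n) functions in total.
    all_functions = list(product([0, 1], repeat=len(inputs)))

    def accuracy(predict, outputs):
        hits = sum(predict(x) == y for x, y in zip(inputs, outputs))
        return hits / len(inputs)

    always_zero = lambda x: 0
    average = sum(accuracy(always_zero, f)
                  for f in all_functions) / len(all_functions)
    print(average)  # 0.5 -- chance, no matter which predictor you pick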

That's what the No Free Lunch theorem states: across all possible functions in your space, you can do no better than chance, because for any function F that you approximate exactly, I can define a function !F (or, in a non-binary space, a family of functions) on which you are exactly 0% accurate. This is true no matter the algorithm or assumptions you choose.
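
A quick sketch of that complement construction, reusing the setup from the snippet above (the choice of F here is arbitrary; any function in the space works):

    F = all_functions[5]              # any function, as its output column
    not_F = tuple(1 - y for y in F)   # the adversarial complement !F

    perfect = dict(zip(inputs, F))    # a predictor that matches F exactly
    predict = lambda x: perfect[x]

    print(accuracy(predict, F))       # 1.0 -- exact on F...
    print(accuracy(predict, not_F))   # 0.0 -- ...and exactly wrong on !F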

Similarly, I can define a classification algorithm that makes no implicit assumptions about what I am classifying, other than that it can be classified, yet still achieves 100% accuracy on any specific problem I apply it to:

For every possible input in your function space, get the ground-truth label from some external function, throw them all into a mapping (a very, very large mapping), and then just look up the input in the mapping when requested.

For image classification, this means a mapping of all possible NxN images to a boolean: "Contains dog". You might even, as an optimization, make this more space efficient by picking certain pixels that are more impactful and looking at them first, so that you don't have to store every image. That sounds an awful lot like a decision tree ;)
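
Here's a minimal sketch of that lookup-table "classifier", assuming a tiny binary image space (N=2, images flattened to tuples of N*N bits) and a hypothetical contains_dog oracle standing in for the external ground-truth function; both names are made up for illustration:

    from itertools import product

    N = 2  # image side length; the table has 2**(N*N) entries, so it explodes fast

    def contains_dog(image):
        # Hypothetical oracle supplying ground-truth labels; any rule works.
        return sum(image) % 2 == 1

    # Precompute the label for every possible image. This is 100% accurate
    # by construction and assumes nothing about what an image is.
    table = {image: contains_dog(image)
             for image in product([0, 1], repeat=N * N)}

    def classify(image):
        return table[image]  # pure lookup, no learned structure at all

    print(classify((0, 1, 0, 0)))  # True, straight from the table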

Tl;dr: 'No Free Lunch' has nothing to do with the efficiency of the encoding; it's about the impossibility of a single perfect agent that can solve every problem.



