1) Indeed, GLNs don't learn features... but I would claim they do learn some not...

fxtentacle · on June 16, 2020

2) The problem that I would expect with a hybrid method is that conv features are usually trained with dropout so that they become redundant, which means they should be highly correlated with each other and, thus, have a high cosine similarity.

3) I agree that my argument is scientifically unfair. I was trying to argue from the perspective of a prospective user. My customers tend to have a budget limit on how much their classifier is allowed to cost. Training from scratch would be too expensive. But a chopped ResNet with some conv layers will work OK and be cheap enough.

So for me, as the user, the ecosystem around your architecture and the availability of pretrained models might make the critical difference in whether I'll use it or not.
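To make the cosine similarity concern in point 2 concrete, here is a minimal sketch of how one might measure it. The function names and the toy activation vectors are made up for illustration and do not come from any real network; in practice you would feed in per-channel activations from your own conv layers.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen here for all-zero vectors
    return dot / (norm_u * norm_v)

def mean_pairwise_similarity(features):
    """Average cosine similarity over all distinct pairs of feature vectors."""
    if len(features) < 2:
        raise ValueError("need at least two feature vectors")
    sims = []
    for i in range(len(features)):
        for j in range(i + 1, len(features)):
            sims.append(cosine_similarity(features[i], features[j]))
    return sum(sims) / len(sims)

# Hypothetical activations: each inner list is one feature over three inputs.
redundant = [[1.0, 2.0, 3.0], [1.1, 2.1, 2.9], [0.9, 1.9, 3.2]]
independent = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

print("redundant:", mean_pairwise_similarity(redundant))
print("independent:", mean_pairwise_similarity(independent))
```

If the mean comes out close to 1 for real conv features, that would support the worry that they are too correlated to be useful inputs for a hybrid method.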

BrokrnAlgorithm · on June 15, 2020

Good point as well. Sometimes it's not about dimensionality reduction but more about persistent representation, and having this geared towards highly non-stationary environments is a nice thing to have.