Mullvad does have the happenstantial advantage that its userbase is likely nowhere near as diverse as Google's, so it naturally follows that the queries themselves are not as diverse. While Google fields requests across the full diversity of the globe, Mullvad's userbase likely skews toward middle-to-high-income Westerners with a STEM background searching in English. The queries these users make probably come from a much narrower corpus of topics; I wonder what percentage of them revolve around privacy, Linux, software, typical hacker hobbies like woodworking, et cetera. This isn't to say that these are the only types of queries being made, but if you were to group Mullvad users into equivalently broad advertising cohorts, you'd probably end up with far fewer cohorts than you would for Google's users.
More homogeneous interests result in more similar queries, which would increase the proportion of cache hits. Whether this is enough to help make the strategy viable is another matter, but I do think it's worth noting.
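To put a rough shape on that intuition, here is a toy simulation. Everything in it is my own assumption rather than anything Mullvad has described: query popularity follows a Zipf distribution, the cache is a plain LRU, and the corpus sizes, request count, and cache size are made-up numbers. The helper names `zipf_weights` and `simulate_hit_rate` are hypothetical. It only illustrates the direction of the effect, not its real-world size.

```python
import random
from collections import OrderedDict
from itertools import accumulate


def zipf_weights(n_distinct, exponent=1.0):
    """Unnormalized Zipf weights: the query at popularity rank k gets weight 1/k^exponent."""
    return [1.0 / (rank ** exponent) for rank in range(1, n_distinct + 1)]


def simulate_hit_rate(n_distinct, n_requests, cache_size, seed=0):
    """Replay Zipf-distributed requests through an LRU cache and return the hit rate."""
    rng = random.Random(seed)
    cum_weights = list(accumulate(zipf_weights(n_distinct)))
    # Each request is the id of one of the n_distinct possible queries.
    requests = rng.choices(range(n_distinct), cum_weights=cum_weights, k=n_requests)

    cache = OrderedDict()  # query ids, ordered from least to most recently used
    hits = 0
    for query in requests:
        if query in cache:
            hits += 1
            cache.move_to_end(query)  # mark as most recently used
        else:
            cache[query] = True
            if len(cache) > cache_size:
                cache.popitem(last=False)  # evict the least recently used entry
    return hits / n_requests


# Assumed numbers: same traffic volume and cache size, different corpus breadth.
narrow = simulate_hit_rate(n_distinct=5_000, n_requests=50_000, cache_size=1_000)
broad = simulate_hit_rate(n_distinct=500_000, n_requests=50_000, cache_size=1_000)
print(f"narrow corpus hit rate: {narrow:.2%}")
print(f"broad corpus hit rate:  {broad:.2%}")
```

With the same cache and the same number of requests, the narrower corpus should come out with the higher hit rate. How large the gap is in practice depends on the real query distribution, which nobody outside Mullvad can see.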
I also wonder about the complexity of the queries themselves. The more technical users would probably use more complex combinations of operators, but they're also more likely to search by keyword rather than natural language.
But people who actively use VPNs are not necessarily those whose search history follows a short-tail distribution. Mullvad gets a good chunk of its revenue from Firefox and other white labels too.