Interesting, but wrong use of conditional probabilities. All OP is saying is that the frequency of people from school x following topic y is p.
The dataset is just not the right one to say that P(x|y) = p, because of, say, all the people who follow food in NYC and go to NYU which were not taken into account here.
The dataset is just not the right one to say that P(x|y) = p, because of, say, all the people who follow food in NYC and go to NYU which were not taken into account here.